Uncertainty Theory Third Edition
Baoding Liu Uncertainty Theory Laboratory Department of Mathematical Sciences Tsinghua University Beijing 100084, China
[email protected] http://orsc.edu.cn/liu
3rd Edition © 2008 by UTLAB
Japanese Translation Version © 2008 by WAP
2nd Edition © 2007 by Springer-Verlag Berlin
1st Edition © 2004 by Springer-Verlag Berlin
Reference to this book should be made as follows: Liu B, Uncertainty Theory, 3rd ed., http://orsc.edu.cn/liu/ut.pdf
Contents

Preface  ix

1 Probability Theory  1
   1.1 Probability Space  1
   1.2 Random Variables  4
   1.3 Probability Distribution  7
   1.4 Independence  11
   1.5 Identical Distribution  13
   1.6 Expected Value  14
   1.7 Variance  23
   1.8 Moments  25
   1.9 Critical Values  26
   1.10 Entropy  28
   1.11 Distance  32
   1.12 Inequalities  32
   1.13 Convergence Concepts  35
   1.14 Conditional Probability  40
   1.15 Stochastic Process  43
   1.16 Stochastic Calculus  47
   1.17 Stochastic Differential Equation  50

2 Credibility Theory  53
   2.1 Credibility Space  53
   2.2 Fuzzy Variables  63
   2.3 Membership Function  65
   2.4 Credibility Distribution  69
   2.5 Independence  74
   2.6 Identical Distribution  79
   2.7 Expected Value  80
   2.8 Variance  92
   2.9 Moments  95
   2.10 Critical Values  96
   2.11 Entropy  100
   2.12 Distance  106
   2.13 Inequalities  107
   2.14 Convergence Concepts  110
   2.15 Conditional Credibility  114
   2.16 Fuzzy Process  119
   2.17 Fuzzy Calculus  124
   2.18 Fuzzy Differential Equation  128

3 Chance Theory  129
   3.1 Chance Space  129
   3.2 Hybrid Variables  137
   3.3 Chance Distribution  143
   3.4 Expected Value  146
   3.5 Variance  149
   3.6 Moments  151
   3.7 Independence  151
   3.8 Identical Distribution  153
   3.9 Critical Values  153
   3.10 Entropy  156
   3.11 Distance  157
   3.12 Inequalities  157
   3.13 Convergence Concepts  160
   3.14 Conditional Chance  164
   3.15 Hybrid Process  169
   3.16 Hybrid Calculus  171
   3.17 Hybrid Differential Equation  174

4 Uncertainty Theory  177
   4.1 Uncertainty Space  177
   4.2 Uncertain Variables  181
   4.3 Identification Function  184
   4.4 Uncertainty Distribution  185
   4.5 Expected Value  187
   4.6 Variance  190
   4.7 Moments  191
   4.8 Independence  193
   4.9 Identical Distribution  193
   4.10 Critical Values  194
   4.11 Entropy  195
   4.12 Distance  196
   4.13 Inequalities  197
   4.14 Convergence Concepts  200
   4.15 Conditional Uncertainty  201
   4.16 Uncertain Process  205
   4.17 Uncertain Calculus  209
   4.18 Uncertain Differential Equation  211

A Measurable Sets  213
B Classical Measures  216
C Measurable Functions  218
D Lebesgue Integral  221
E Euler-Lagrange Equation  224
F Maximum Uncertainty Principle  225
G Uncertainty Relations  226

Bibliography  229
List of Frequently Used Symbols  244
Index  245
Preface

There are various types of uncertainty in the real world. Randomness is a basic type of objective uncertainty, and probability theory is a branch of mathematics for studying the behavior of random phenomena. The study of probability theory was started by Pascal and Fermat (1654), and an axiomatic foundation of probability theory was given by Kolmogoroff (1933) in his Foundations of Probability Theory. Probability theory has been widely applied in science and engineering. Chapter 1 will present probability theory.

Fuzziness is a basic type of subjective uncertainty initiated by Zadeh (1965). Credibility theory is a branch of mathematics for studying the behavior of fuzzy phenomena. The study of credibility theory was started by Liu and Liu (2002), and an axiomatic foundation of credibility theory was given by Liu (2004) in his Uncertainty Theory. Chapter 2 will introduce credibility theory.

Sometimes fuzziness and randomness appear simultaneously in a system. A hybrid variable was proposed by Liu (2006) as a tool to describe quantities with both fuzziness and randomness. Both fuzzy random variables and random fuzzy variables are instances of hybrid variables. In addition, Li and Liu (2007) introduced the concept of chance measure for hybrid events. After that, chance theory developed steadily. Essentially, chance theory is a hybrid of probability theory and credibility theory. Chapter 3 will offer chance theory.

In order to deal with general uncertainty, Liu (2007) founded an uncertainty theory in his Uncertainty Theory and made it a branch of mathematics based on the normality, monotonicity, self-duality, and countable subadditivity axioms. Probability theory, credibility theory, and chance theory are three special cases of uncertainty theory. Chapter 4 is devoted to uncertainty theory.

For this new edition the entire text has been totally rewritten. More importantly, uncertain processes and uncertain calculus, as well as uncertain differential equations, have been added. The book is suitable for mathematicians, researchers, engineers, designers, and students in the fields of mathematics, information science, operations research, industrial engineering, computer science, artificial intelligence, and management science. Readers will learn the axiomatic approach of uncertainty theory, and find this work a stimulating and useful reference.
Baoding Liu
Tsinghua University
http://orsc.edu.cn/liu
March 5, 2008
Chapter 1

Probability Theory

Probability measure is essentially a set function (i.e., a function whose argument is a set) satisfying normality, nonnegativity and countable additivity axioms. Probability theory is a branch of mathematics for studying the behavior of random phenomena. The emphasis in this chapter is mainly on probability space, random variable, probability distribution, independence, identical distribution, expected value, variance, moments, critical values, entropy, distance, convergence almost surely, convergence in probability, convergence in mean, convergence in distribution, conditional probability, stochastic process, renewal process, Brownian motion, stochastic calculus, and stochastic differential equation. The main results in this chapter are well-known. For this reason the credit references are not provided.
1.1 Probability Space
Let Ω be a nonempty set, and A a σ-algebra over Ω. If Ω is countable, usually A is the power set of Ω. If Ω is uncountable, for example Ω = [0, 1], usually A is the Borel algebra of Ω. Each element in A is called an event. In order to present an axiomatic definition of probability, it is necessary to assign to each event A a number Pr{A} which indicates the probability that A will occur. In order to ensure that the number Pr{A} has certain mathematical properties which we intuitively expect a probability to have, the following three axioms must be satisfied: Axiom 1. (Normality) Pr{Ω} = 1. Axiom 2. (Nonnegativity) Pr{A} ≥ 0 for any event A. Axiom 3. (Countable Additivity) For every countable sequence of mutually
disjoint events {A_i}, we have

Pr{ ⋃_{i=1}^∞ A_i } = ∑_{i=1}^∞ Pr{A_i}.   (1.1)

Definition 1.1 The set function Pr is called a probability measure if it satisfies the normality, nonnegativity, and countable additivity axioms.

Example 1.1: Let Ω = {ω1, ω2, · · · }, and let A be the power set of Ω. Assume that p1, p2, · · · are nonnegative numbers such that p1 + p2 + · · · = 1. Define a set function on A as

Pr{A} = ∑_{ωi ∈ A} p_i,  A ∈ A.   (1.2)
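As a concrete sketch of Example 1.1 (with hypothetical weights p1 = 1/2, p2 = 3/10, p3 = 1/5 on a three-point Ω), the following Python snippet builds the set function Pr{A} = ∑_{ωi∈A} p_i and checks normality, nonnegativity, finite additivity, and monotonicity exactly:

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical weights for a three-point sample space; exact arithmetic
# via Fraction keeps the axiom checks free of rounding error.
p = {"w1": F(1, 2), "w2": F(3, 10), "w3": F(1, 5)}
omega = set(p)

def pr(event):
    """Pr{A} = sum of p_i over the points w_i in the event A."""
    return sum(p[w] for w in event)

assert pr(omega) == 1                       # normality
assert all(pr({w}) >= 0 for w in omega)     # nonnegativity
# additivity on disjoint events (finite case of the countable additivity axiom)
assert pr({"w1"} | {"w2", "w3"}) == pr({"w1"}) + pr({"w2", "w3"})
# monotonicity: Pr{A} <= Pr{Omega} for every event A
for r in range(len(omega) + 1):
    for ev in combinations(sorted(omega), r):
        assert pr(set(ev)) <= pr(omega)
```

This is an illustration only; the axioms themselves concern arbitrary σ-algebras, not just finite power sets.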
Then Pr is a probability measure.

Example 1.2: Let Ω = [0, 1] and let A be the Borel algebra over Ω. If Pr is the Lebesgue measure, then Pr is a probability measure.

Theorem 1.1 Let Ω be a nonempty set, A a σ-algebra over Ω, and Pr a probability measure. Then we have
(a) Pr{∅} = 0;
(b) Pr is self-dual, i.e., Pr{A} + Pr{Aᶜ} = 1 for any A ∈ A;
(c) Pr is increasing, i.e., Pr{A} ≤ Pr{B} whenever A ⊂ B.

Proof: (a) Since ∅ and Ω are disjoint events and ∅ ∪ Ω = Ω, we have Pr{∅} + Pr{Ω} = Pr{Ω}, which makes Pr{∅} = 0.
(b) Since A and Aᶜ are disjoint events and A ∪ Aᶜ = Ω, we have Pr{A} + Pr{Aᶜ} = Pr{Ω} = 1.
(c) Since A ⊂ B, we have B = A ∪ (B ∩ Aᶜ), where A and B ∩ Aᶜ are disjoint events. Therefore Pr{B} = Pr{A} + Pr{B ∩ Aᶜ} ≥ Pr{A}.

Probability Continuity Theorem

Theorem 1.2 (Probability Continuity Theorem) Let Ω be a nonempty set, A a σ-algebra over Ω, and Pr a probability measure. If A1, A2, · · · ∈ A and lim_{i→∞} A_i exists, then

lim_{i→∞} Pr{A_i} = Pr{ lim_{i→∞} A_i }.   (1.3)

Proof: Step 1: Suppose {A_i} is an increasing sequence. Write A_i → A and A_0 = ∅. Then {A_i\A_{i−1}} is a sequence of disjoint events and

⋃_{i=1}^∞ (A_i\A_{i−1}) = A,   ⋃_{i=1}^k (A_i\A_{i−1}) = A_k

for k = 1, 2, · · · Thus we have

Pr{A} = Pr{ ⋃_{i=1}^∞ (A_i\A_{i−1}) } = ∑_{i=1}^∞ Pr{A_i\A_{i−1}}
      = lim_{k→∞} ∑_{i=1}^k Pr{A_i\A_{i−1}} = lim_{k→∞} Pr{ ⋃_{i=1}^k (A_i\A_{i−1}) }
      = lim_{k→∞} Pr{A_k}.

Step 2: If {A_i} is a decreasing sequence, then the sequence {A_1\A_i} is clearly increasing. It follows that

Pr{A_1} − Pr{A} = Pr{ lim_{i→∞} (A_1\A_i) } = lim_{i→∞} Pr{A_1\A_i} = Pr{A_1} − lim_{i→∞} Pr{A_i}

which implies that Pr{A_i} → Pr{A}.

Step 3: If {A_i} is a sequence of events such that A_i → A, then for each k, we have

⋂_{i=k}^∞ A_i ⊂ A_k ⊂ ⋃_{i=k}^∞ A_i.

Since Pr is increasing, we have

Pr{ ⋂_{i=k}^∞ A_i } ≤ Pr{A_k} ≤ Pr{ ⋃_{i=k}^∞ A_i }.

Note that ⋂_{i=k}^∞ A_i ↑ A and ⋃_{i=k}^∞ A_i ↓ A. It follows from Steps 1 and 2 that Pr{A_i} → Pr{A}.

Probability Space

Definition 1.2 Let Ω be a nonempty set, A a σ-algebra over Ω, and Pr a probability measure. Then the triplet (Ω, A, Pr) is called a probability space.

Example 1.3: Let Ω = {ω1, ω2, · · · }, A the power set of Ω, and Pr a probability measure defined by (1.2). Then (Ω, A, Pr) is a probability space.

Example 1.4: Let Ω = [0, 1], A the Borel algebra over Ω, and Pr the Lebesgue measure. Then ([0, 1], A, Pr) is a probability space, sometimes called the Lebesgue unit interval. For many purposes it is sufficient to use it as the basic probability space.
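The continuity theorem can be illustrated numerically. The sketch below (an illustration only, not part of the text) takes the increasing events A_i = [0, 1 − 1/i] on the Lebesgue unit interval, whose limit event is A = [0, 1) with Pr{A} = 1:

```python
# A numeric sketch of Theorem 1.2 on the Lebesgue unit interval:
# for the increasing events A_i = [0, 1 - 1/i], Pr{A_i} = 1 - 1/i
# converges to Pr{lim A_i} = Pr{[0, 1)} = 1.
def pr_interval(a, b):
    """Lebesgue measure (length) of [a, b] intersected with [0, 1]."""
    return max(0.0, min(b, 1.0) - max(a, 0.0))

probs = [pr_interval(0.0, 1.0 - 1.0 / i) for i in range(1, 10001)]
assert all(x <= y for x, y in zip(probs, probs[1:]))  # Pr{A_i} is increasing
assert abs(probs[-1] - 1.0) < 1e-3                    # Pr{A_i} -> Pr{A} = 1
```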
Product Probability Space

Let (Ωi, Ai, Pri), i = 1, 2, · · · , n be probability spaces, and Ω = Ω1 × Ω2 × · · · × Ωn, A = A1 × A2 × · · · × An. Note that the probability measures Pri, i = 1, 2, · · · , n are finite. It follows from the product measure theorem that there is a unique measure Pr on A such that

Pr{A1 × A2 × · · · × An} = Pr1{A1} × Pr2{A2} × · · · × Prn{An}

for any Ai ∈ Ai, i = 1, 2, · · · , n. This conclusion is called the product probability theorem. The measure Pr is also a probability measure since

Pr{Ω} = Pr1{Ω1} × Pr2{Ω2} × · · · × Prn{Ωn} = 1.

Such a probability measure is called the product probability measure, denoted by Pr = Pr1 × Pr2 × · · · × Prn.

Definition 1.3 Let (Ωi, Ai, Pri), i = 1, 2, · · · , n be probability spaces, and Ω = Ω1 × Ω2 × · · · × Ωn, A = A1 × A2 × · · · × An, Pr = Pr1 × Pr2 × · · · × Prn. Then the triplet (Ω, A, Pr) is called the product probability space.

Infinite Product Probability Space

Let (Ωi, Ai, Pri), i = 1, 2, · · · be an arbitrary sequence of probability spaces, and

Ω = Ω1 × Ω2 × · · · ,  A = A1 × A2 × · · ·   (1.4)

It follows from the infinite product measure theorem that there is a unique probability measure Pr on A such that

Pr{A1 × · · · × An × Ωn+1 × Ωn+2 × · · · } = Pr1{A1} × · · · × Prn{An}

for any measurable rectangle A1 × · · · × An × Ωn+1 × Ωn+2 × · · · and all n = 1, 2, · · · The probability measure Pr is called the infinite product of Pri, i = 1, 2, · · · and is denoted by

Pr = Pr1 × Pr2 × · · ·   (1.5)

Definition 1.4 Let (Ωi, Ai, Pri), i = 1, 2, · · · be probability spaces, and Ω = Ω1 × Ω2 × · · · , A = A1 × A2 × · · · , Pr = Pr1 × Pr2 × · · · Then the triplet (Ω, A, Pr) is called the infinite product probability space.
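The product probability theorem can be checked directly on finite spaces. The following sketch uses hypothetical two-point spaces with exact rational weights:

```python
from fractions import Fraction as F
from itertools import product

# A sketch of the product probability theorem for two finite spaces
# (hypothetical weights): Pr{A1 x A2} = Pr1{A1} * Pr2{A2}.
pr1 = {"H": F(1, 3), "T": F(2, 3)}        # (Omega1, Pr1)
pr2 = {"a": F(1, 4), "b": F(3, 4)}        # (Omega2, Pr2)

# product probability measure on Omega1 x Omega2
pr = {(w1, w2): pr1[w1] * pr2[w2] for w1, w2 in product(pr1, pr2)}
assert sum(pr.values()) == 1               # Pr{Omega} = 1

A1, A2 = {"H"}, {"a", "b"}                 # a measurable rectangle A1 x A2
lhs = sum(pr[(w1, w2)] for w1 in A1 for w2 in A2)
rhs = sum(pr1[w] for w in A1) * sum(pr2[w] for w in A2)
assert lhs == rhs                          # both equal 1/3 here
```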
1.2 Random Variables

Definition 1.5 A random variable is a measurable function from a probability space (Ω, A, Pr) to the set of real numbers, i.e., for any Borel set B of real numbers, the set

{ξ ∈ B} = {ω ∈ Ω | ξ(ω) ∈ B}   (1.6)

is an event.
Example 1.5: Take (Ω, A, Pr) to be {ω1, ω2} with Pr{ω1} = Pr{ω2} = 0.5. Then the function

ξ(ω) = 0 if ω = ω1;  1 if ω = ω2

is a random variable.

Example 1.6: Take (Ω, A, Pr) to be the interval [0, 1] with the Borel algebra and Lebesgue measure. We define ξ as the identity function from Ω to [0, 1]. Since ξ is a measurable function, it is a random variable.

Example 1.7: A deterministic number c may be regarded as a special random variable. In fact, it is the constant function ξ(ω) ≡ c on the probability space (Ω, A, Pr).

Definition 1.6 A random variable ξ is said to be
(a) nonnegative if Pr{ξ < 0} = 0;
(b) positive if Pr{ξ ≤ 0} = 0;
(c) continuous if Pr{ξ = x} = 0 for each x ∈ ℜ;
(d) simple if there exists a finite sequence {x1, x2, · · · , xm} such that

Pr{ξ ≠ x1, ξ ≠ x2, · · · , ξ ≠ xm} = 0;   (1.7)

(e) discrete if there exists a countable sequence {x1, x2, · · · } such that

Pr{ξ ≠ x1, ξ ≠ x2, · · · } = 0.   (1.8)
Definition 1.7 Let ξ1 and ξ2 be random variables defined on the probability space (Ω, A, Pr). We say ξ1 = ξ2 if ξ1(ω) = ξ2(ω) for almost all ω ∈ Ω.

Random Vector

Definition 1.8 An n-dimensional random vector is a measurable function from a probability space (Ω, A, Pr) to the set of n-dimensional real vectors, i.e., for any Borel set B of ℜⁿ, the set

{ξ ∈ B} = {ω ∈ Ω | ξ(ω) ∈ B}   (1.9)

is an event.

Theorem 1.3 The vector (ξ1, ξ2, · · · , ξn) is a random vector if and only if ξ1, ξ2, · · · , ξn are random variables.

Proof: Write ξ = (ξ1, ξ2, · · · , ξn). Suppose that ξ is a random vector. For any Borel set B of ℜ, the set B × ℜⁿ⁻¹ is a Borel set of ℜⁿ, and

{ξ1 ∈ B} = {ξ1 ∈ B, ξ2 ∈ ℜ, · · · , ξn ∈ ℜ} = {ξ ∈ B × ℜⁿ⁻¹}

is an event, which implies that ξ1 is a random variable. A similar process may prove that ξ2, ξ3, · · · , ξn are random variables.

Conversely, suppose that all ξ1, ξ2, · · · , ξn are random variables on the probability space (Ω, A, Pr). We define

B = { B ⊂ ℜⁿ | {ξ ∈ B} is an event }.

The vector ξ is proved to be a random vector if we can show that B contains all Borel sets of ℜⁿ. First, the class B contains all open intervals of ℜⁿ because

{ξ ∈ ∏_{i=1}^n (a_i, b_i)} = ⋂_{i=1}^n {ξi ∈ (a_i, b_i)}

is an event. Next, the class B is a σ-algebra of ℜⁿ because (i) ℜⁿ ∈ B since {ξ ∈ ℜⁿ} = Ω; (ii) if B ∈ B, then {ξ ∈ B} is an event, and {ξ ∈ Bᶜ} = {ξ ∈ B}ᶜ is an event, which implies that Bᶜ ∈ B; (iii) if Bi ∈ B for i = 1, 2, · · · , then {ξ ∈ Bi} are events and {ξ ∈ ⋃_i Bi} = ⋃_i {ξ ∈ Bi} is an event, which implies that ⋃_i Bi ∈ B. Since the smallest σ-algebra containing all open intervals of ℜⁿ is just the Borel algebra of ℜⁿ, the class B contains all Borel sets of ℜⁿ. The theorem is proved.

Definition 1.9 Let ξ1, ξ2, · · · , ξn be random variables on the probability space (Ω, A, Pr), and f : ℜⁿ → ℜ a measurable function. Then ξ = f(ξ1, ξ2, · · · , ξn) is defined by

ξ(ω) = f(ξ1(ω), ξ2(ω), · · · , ξn(ω)),  ∀ω ∈ Ω.   (1.10)

Example 1.8: Let ξ1 and ξ2 be random variables on the probability space (Ω, A, Pr). Then their sum is

(ξ1 + ξ2)(ω) = ξ1(ω) + ξ2(ω),  ∀ω ∈ Ω

and their product is

(ξ1 × ξ2)(ω) = ξ1(ω) × ξ2(ω),  ∀ω ∈ Ω.

The reader may wonder whether ξ = f(ξ1, ξ2, · · · , ξn) defined by (1.10) is a random variable. The following theorem answers this question.

Theorem 1.4 Let ξ be an n-dimensional random vector, and f : ℜⁿ → ℜ a measurable function. Then f(ξ) is a random variable.
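Example 1.8 amounts to the following pointwise computation (a sketch on a hypothetical two-point sample space, with ξ2 chosen arbitrarily for illustration):

```python
# A sketch of Example 1.8: sums and products of random variables are
# defined pointwise, (xi1 + xi2)(w) = xi1(w) + xi2(w) and
# (xi1 x xi2)(w) = xi1(w) * xi2(w).
pr = {"w1": 0.5, "w2": 0.5}
xi1 = {"w1": 0.0, "w2": 1.0}            # the random variable of Example 1.5
xi2 = {"w1": 3.0, "w2": -2.0}           # a second, hypothetical random variable

xi_sum = {w: xi1[w] + xi2[w] for w in pr}     # (xi1 + xi2)(w)
xi_prod = {w: xi1[w] * xi2[w] for w in pr}    # (xi1 x xi2)(w)
assert xi_sum == {"w1": 3.0, "w2": -1.0}
assert xi_prod == {"w1": 0.0, "w2": -2.0}
```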
1.3 Probability Distribution
Definition 1.10 The probability distribution Φ : ℜ → [0, 1] of a random variable ξ is defined by

Φ(x) = Pr{ω ∈ Ω | ξ(ω) ≤ x}.   (1.11)

That is, Φ(x) is the probability that the random variable ξ takes a value less than or equal to x.

Example 1.9: Assume that the random variables ξ and η have the same probability distribution. One question is whether ξ = η or not. Generally speaking, it is not true. Take (Ω, A, Pr) to be {ω1, ω2} with Pr{ω1} = Pr{ω2} = 0.5. We now define two random variables as follows,

ξ(ω) = −1 if ω = ω1;  1 if ω = ω2,    η(ω) = 1 if ω = ω1;  −1 if ω = ω2.

Then ξ and η have the same probability distribution,

Φ(x) = 0 if x < −1;  0.5 if −1 ≤ x < 1;  1 if x ≥ 1.

However, it is clear that ξ ≠ η in the sense of Definition 1.7.

Theorem 1.5 (Sufficient and Necessary Condition for Probability Distribution) A function Φ : ℜ → [0, 1] is a probability distribution if and only if it is an increasing and right-continuous function with

lim_{x→−∞} Φ(x) = 0,   lim_{x→+∞} Φ(x) = 1.   (1.12)

Proof: For any x, y ∈ ℜ with x < y, we have

Φ(y) − Φ(x) = Pr{x < ξ ≤ y} ≥ 0.

Thus the probability distribution Φ is increasing. Next, let {εi} be a sequence of positive numbers such that εi → 0 as i → ∞. Then, for every i ≥ 1, we have

Φ(x + εi) − Φ(x) = Pr{x < ξ ≤ x + εi}.

It follows from the probability continuity theorem that

lim_{i→∞} Φ(x + εi) − Φ(x) = Pr{∅} = 0.

Hence Φ is a right-continuous function. Finally,

lim_{x→−∞} Φ(x) = lim_{x→−∞} Pr{ξ ≤ x} = Pr{∅} = 0,
lim_{x→+∞} Φ(x) = lim_{x→+∞} Pr{ξ ≤ x} = Pr{Ω} = 1.

Conversely, it is known that there is a unique probability measure Pr on the Borel algebra over ℜ such that Pr{(−∞, x]} = Φ(x) for all x ∈ ℜ. Furthermore, it is easy to verify that the random variable defined by ξ(x) = x from the probability space (ℜ, A, Pr) to ℜ has the probability distribution Φ.

Remark 1.1: Theorem 1.5 states that the identity function is a universal function for any probability distribution by defining an appropriate probability space. In fact, there is a universal probability space for any probability distribution by defining an appropriate function.

Theorem 1.6 Let Φ be a probability distribution. Then there is a random variable on the Lebesgue unit interval ([0, 1], A, Pr) whose probability distribution is just Φ.

Proof: Define ξ(ω) = sup{x | Φ(x) ≤ ω} on the Lebesgue unit interval. Then ξ is a random variable whose probability distribution is just Φ because

Pr{ξ ≤ y} = Pr{ω | sup{x | Φ(x) ≤ ω} ≤ y} = Pr{ω | ω ≤ Φ(y)} = Φ(y)

for any y ∈ ℜ.

Theorem 1.7 A random variable ξ with probability distribution Φ is
(a) nonnegative if and only if Φ(x) = 0 for all x < 0;
(b) positive if and only if Φ(x) = 0 for all x ≤ 0;
(c) simple if and only if Φ is a simple function;
(d) discrete if and only if Φ is a step function;
(e) continuous if and only if Φ is a continuous function.
Proof: The parts (a), (b), (c) and (d) follow immediately from the definition. Next we prove the part (e). If ξ is a continuous random variable, then Pr{ξ = x} = 0. It follows from the probability continuity theorem that

lim_{y↑x} (Φ(x) − Φ(y)) = lim_{y↑x} Pr{y < ξ ≤ x} = Pr{ξ = x} = 0

which proves the left-continuity of Φ. Since a probability distribution is always right-continuous, Φ is continuous. Conversely, if Φ is continuous, then we immediately have Pr{ξ = x} = 0 for each x ∈ ℜ.

Definition 1.11 A continuous random variable is said to be
(a) singular if its probability distribution is a singular function;
(b) absolutely continuous if its probability distribution is an absolutely continuous function.

Probability Density Function

Definition 1.12 The probability density function φ : ℜ → [0, +∞) of a random variable ξ is a function such that

Φ(x) = ∫_{−∞}^x φ(y) dy   (1.13)

holds for all x ∈ ℜ, where Φ is the probability distribution of the random variable ξ.

Let φ : ℜ → [0, +∞) be a measurable function such that ∫_{−∞}^{+∞} φ(x) dx = 1. Then φ is the probability density function of some random variable because Φ(x) = ∫_{−∞}^x φ(y) dy is an increasing and continuous function satisfying (1.12).

Theorem 1.8 Let ξ be a random variable whose probability density function φ exists. Then for any Borel set B of ℜ, we have

Pr{ξ ∈ B} = ∫_B φ(y) dy.   (1.14)
Proof: Let C be the class of all subsets C of ℜ for which the relation

Pr{ξ ∈ C} = ∫_C φ(y) dy   (1.15)

holds. We will show that C contains all Borel sets of ℜ. It follows from the probability continuity theorem and relation (1.15) that C is a monotone class. It is also clear that C contains all intervals of the form (−∞, a], (a, b], (b, ∞) and ∅ since

Pr{ξ ∈ (−∞, a]} = Φ(a) = ∫_{−∞}^a φ(y) dy,
Pr{ξ ∈ (b, +∞)} = Φ(+∞) − Φ(b) = ∫_b^{+∞} φ(y) dy,
Pr{ξ ∈ (a, b]} = Φ(b) − Φ(a) = ∫_a^b φ(y) dy,
Pr{ξ ∈ ∅} = 0 = ∫_∅ φ(y) dy

where Φ is the probability distribution of ξ. Let F be the algebra consisting of all finite unions of disjoint sets of the form (−∞, a], (a, b], (b, ∞) and ∅. Note that for any disjoint sets C1, C2, · · · , Cm of F and C = C1 ∪ C2 ∪ · · · ∪ Cm, we have

Pr{ξ ∈ C} = ∑_{j=1}^m Pr{ξ ∈ Cj} = ∑_{j=1}^m ∫_{Cj} φ(y) dy = ∫_C φ(y) dy.
That is, C ∈ C. Hence we have F ⊂ C. Since the smallest σ-algebra containing F is just the Borel algebra of ℜ, the monotone class theorem implies that C contains all Borel sets of ℜ.

Some Special Distributions

Uniform Distribution: A random variable ξ has a uniform distribution if its probability density function is defined by

φ(x) = 1/(b − a) if a ≤ x ≤ b,  and 0 otherwise,   (1.16)

denoted by U(a, b), where a and b are given real numbers with a < b.

Exponential Distribution: A random variable ξ has an exponential distribution if its probability density function is defined by

φ(x) = (1/β) exp(−x/β) if x ≥ 0,  and 0 if x < 0,   (1.17)

denoted by EXP(β), where β is a positive number.

Normal Distribution: A random variable ξ has a normal distribution if its probability density function is defined by

φ(x) = (1/(σ√(2π))) exp(−(x − µ)²/(2σ²)),  x ∈ ℜ,   (1.18)

denoted by N(µ, σ²), where µ and σ are real numbers.
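Theorem 1.6 is the basis of inverse transform sampling. The sketch below assumes the exponential distribution EXP(β) of (1.17) with a hypothetical β = 2, for which sup{x | Φ(x) ≤ ω} has the closed form −β log(1 − ω):

```python
import math
import random

# A sketch of Theorem 1.6 by inverse transform, assuming EXP(beta) with a
# hypothetical beta = 2: xi(w) = sup{x | Phi(x) <= w} on the Lebesgue unit
# interval has probability distribution Phi.
random.seed(1)
beta = 2.0

def phi_cdf(x):
    """Phi(x) = 1 - exp(-x/beta) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-x / beta) if x >= 0 else 0.0

def xi(w):
    """Closed-form sup{x | Phi(x) <= w} for the exponential distribution."""
    return -beta * math.log(1.0 - w)

n = 100_000
samples = [xi(random.random()) for _ in range(n)]
empirical = sum(s <= 1.0 for s in samples) / n   # estimate of Pr{xi <= 1}
assert abs(empirical - phi_cdf(1.0)) < 0.01      # matches Phi(1) = 1 - e^{-1/2}
```

The tolerance is a statistical one; with 100,000 samples the Monte Carlo error is a few thousandths.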
Joint Probability Distribution

Definition 1.13 The joint probability distribution Φ : ℜⁿ → [0, 1] of a random vector (ξ1, ξ2, · · · , ξn) is defined by

Φ(x1, x2, · · · , xn) = Pr{ξ1 ≤ x1, ξ2 ≤ x2, · · · , ξn ≤ xn}.

Definition 1.14 The joint probability density function φ : ℜⁿ → [0, +∞) of a random vector (ξ1, ξ2, · · · , ξn) is a function such that

Φ(x1, x2, · · · , xn) = ∫_{−∞}^{x1} ∫_{−∞}^{x2} · · · ∫_{−∞}^{xn} φ(y1, y2, · · · , yn) dy1 dy2 · · · dyn

holds for all (x1, x2, · · · , xn) ∈ ℜⁿ, where Φ is the joint probability distribution of the random vector (ξ1, ξ2, · · · , ξn).
1.4 Independence
Definition 1.15 The random variables ξ1, ξ2, · · · , ξm are said to be independent if

Pr{ ⋂_{i=1}^m {ξi ∈ Bi} } = ∏_{i=1}^m Pr{ξi ∈ Bi}   (1.19)

for any Borel sets B1, B2, · · · , Bm of ℜ.

Theorem 1.9 Let ξ1, ξ2, · · · , ξm be independent random variables, and f1, f2, · · · , fm measurable functions. Then f1(ξ1), f2(ξ2), · · · , fm(ξm) are independent random variables.

Proof: For any Borel sets B1, B2, · · · , Bm of ℜ, we have

Pr{ ⋂_{i=1}^m {fi(ξi) ∈ Bi} } = Pr{ ⋂_{i=1}^m {ξi ∈ fi⁻¹(Bi)} }
= ∏_{i=1}^m Pr{ξi ∈ fi⁻¹(Bi)} = ∏_{i=1}^m Pr{fi(ξi) ∈ Bi}.

Thus f1(ξ1), f2(ξ2), · · · , fm(ξm) are independent random variables.

Theorem 1.10 Let ξi be random variables with probability distributions Φi, i = 1, 2, · · · , m, respectively, and Φ the probability distribution of the random vector (ξ1, ξ2, · · · , ξm). Then ξ1, ξ2, · · · , ξm are independent if and only if

Φ(x1, x2, · · · , xm) = Φ1(x1)Φ2(x2) · · · Φm(xm)   (1.20)

for all (x1, x2, · · · , xm) ∈ ℜᵐ.
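Definition 1.15 can be verified exhaustively on a finite product space. The sketch below uses hypothetical two-valued variables and exact rational arithmetic; the "Borel sets" here are simply all subsets of the finite ranges:

```python
from fractions import Fraction as F
from itertools import product

# A sketch of Definition 1.15 (hypothetical two-valued variables):
# independence means Pr{xi1 in B1, xi2 in B2} factors into a product.
p1 = {0: F(1, 4), 1: F(3, 4)}              # law of xi1
p2 = {0: F(1, 2), 1: F(1, 2)}              # law of xi2
joint = {(x, y): p1[x] * p2[y] for x, y in product(p1, p2)}

checked = 0
for B1, B2 in product([{0}, {1}, {0, 1}], repeat=2):
    lhs = sum(joint[(x, y)] for x in B1 for y in B2)
    rhs = sum(p1[x] for x in B1) * sum(p2[y] for y in B2)
    assert lhs == rhs
    checked += 1
assert checked == 9
```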
Proof: If ξ1, ξ2, · · · , ξm are independent random variables, then we have

Φ(x1, x2, · · · , xm) = Pr{ξ1 ≤ x1, ξ2 ≤ x2, · · · , ξm ≤ xm}
= Pr{ξ1 ≤ x1} Pr{ξ2 ≤ x2} · · · Pr{ξm ≤ xm}
= Φ1(x1)Φ2(x2) · · · Φm(xm)

for all (x1, x2, · · · , xm) ∈ ℜᵐ. Conversely, assume that (1.20) holds. Let x2, x3, · · · , xm be fixed real numbers, and C the class of all subsets C of ℜ for which the relation

Pr{ξ1 ∈ C, ξ2 ≤ x2, · · · , ξm ≤ xm} = Pr{ξ1 ∈ C} ∏_{i=2}^m Pr{ξi ≤ xi}   (1.21)

holds. We will show that C contains all Borel sets of ℜ. It follows from the probability continuity theorem and relation (1.21) that C is a monotone class. It is also clear that C contains all intervals of the form (−∞, a], (a, b], (b, ∞) and ∅. Let F be the algebra consisting of all finite unions of disjoint sets of the form (−∞, a], (a, b], (b, ∞) and ∅. Note that for any disjoint sets C1, C2, · · · , Ck of F and C = C1 ∪ C2 ∪ · · · ∪ Ck, we have

Pr{ξ1 ∈ C, ξ2 ≤ x2, · · · , ξm ≤ xm} = ∑_{j=1}^k Pr{ξ1 ∈ Cj, ξ2 ≤ x2, · · · , ξm ≤ xm}
= Pr{ξ1 ∈ C} Pr{ξ2 ≤ x2} · · · Pr{ξm ≤ xm}.
That is, C ∈ C. Hence we have F ⊂ C. Since the smallest σ-algebra containing F is just the Borel algebra of ℜ, the monotone class theorem implies that C contains all Borel sets of ℜ. Applying the same reasoning to each ξi in turn, we obtain the independence of the random variables.

Theorem 1.11 Let ξi be random variables with probability density functions φi, i = 1, 2, · · · , m, respectively, and φ the probability density function of the random vector (ξ1, ξ2, · · · , ξm). Then ξ1, ξ2, · · · , ξm are independent if and only if

φ(x1, x2, · · · , xm) = φ1(x1)φ2(x2) · · · φm(xm)   (1.22)

for almost all (x1, x2, · · · , xm) ∈ ℜᵐ.

Proof: If φ(x1, x2, · · · , xm) = φ1(x1)φ2(x2) · · · φm(xm) a.e., then we have

Φ(x1, x2, · · · , xm) = ∫_{−∞}^{x1} ∫_{−∞}^{x2} · · · ∫_{−∞}^{xm} φ(t1, t2, · · · , tm) dt1 dt2 · · · dtm
= ∫_{−∞}^{x1} ∫_{−∞}^{x2} · · · ∫_{−∞}^{xm} φ1(t1)φ2(t2) · · · φm(tm) dt1 dt2 · · · dtm
= ∫_{−∞}^{x1} φ1(t1) dt1 ∫_{−∞}^{x2} φ2(t2) dt2 · · · ∫_{−∞}^{xm} φm(tm) dtm
= Φ1(x1)Φ2(x2) · · · Φm(xm)
for all (x1, x2, · · · , xm) ∈ ℜᵐ. Thus ξ1, ξ2, · · · , ξm are independent. Conversely, if ξ1, ξ2, · · · , ξm are independent, then for any (x1, x2, · · · , xm) ∈ ℜᵐ, we have Φ(x1, x2, · · · , xm) = Φ1(x1)Φ2(x2) · · · Φm(xm). Hence

Φ(x1, x2, · · · , xm) = ∫_{−∞}^{x1} ∫_{−∞}^{x2} · · · ∫_{−∞}^{xm} φ1(t1)φ2(t2) · · · φm(tm) dt1 dt2 · · · dtm
which implies that φ(x1, x2, · · · , xm) = φ1(x1)φ2(x2) · · · φm(xm) a.e.

Example 1.10: Let ξ1, ξ2, · · · , ξm be independent random variables with probability density functions φ1, φ2, · · · , φm, respectively, and f : ℜᵐ → ℜ a measurable function. Then for any Borel set B of real numbers, the probability Pr{f(ξ1, ξ2, · · · , ξm) ∈ B} is

∫∫ · · · ∫_{f(x1, x2, ··· , xm) ∈ B} φ1(x1)φ2(x2) · · · φm(xm) dx1 dx2 · · · dxm.
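Example 1.10 can be checked by Monte Carlo. The sketch below assumes independent EXP(1) variables, for which the integral over {x1 + x2 ≤ 1} has the closed form 1 − 2/e, and estimates the probability by sampling:

```python
import math
import random

# A Monte Carlo sketch of Example 1.10, assuming beta = 1 exponentials:
# for independent xi1, xi2 ~ EXP(1), Pr{xi1 + xi2 <= 1} equals the integral
# of phi1(x1)*phi2(x2) over {x1 + x2 <= 1}, which is 1 - 2/e in closed form.
random.seed(7)
n = 100_000
count = 0
for _ in range(n):
    x1 = -math.log(1.0 - random.random())   # inverse-transform EXP(1) sample
    x2 = -math.log(1.0 - random.random())
    count += (x1 + x2 <= 1.0)
exact = 1.0 - 2.0 / math.e                  # = 1 - e^{-1}(1 + 1)
assert abs(count / n - exact) < 0.01
```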
1.5 Identical Distribution
Definition 1.16 The random variables ξ and η are said to be identically distributed if

Pr{ξ ∈ B} = Pr{η ∈ B}   (1.23)

for any Borel set B of ℜ.

Theorem 1.12 The random variables ξ and η are identically distributed if and only if they have the same probability distribution.

Proof: Let Φ and Ψ be the probability distributions of ξ and η, respectively. If ξ and η are identically distributed random variables, then, for any x ∈ ℜ, we have

Φ(x) = Pr{ξ ∈ (−∞, x]} = Pr{η ∈ (−∞, x]} = Ψ(x).

Thus ξ and η have the same probability distribution. Conversely, assume that ξ and η have the same probability distribution. Let C be the class of all subsets C of ℜ for which the relation

Pr{ξ ∈ C} = Pr{η ∈ C}   (1.24)

holds. We will show that C contains all Borel sets of ℜ. It follows from the probability continuity theorem and relation (1.24) that C is a monotone class. It is also clear that C contains all intervals of the form (−∞, a], (a, b], (b, ∞) and ∅ since ξ and η have the same probability distribution. Let F be the algebra consisting of all finite unions of disjoint sets of the form (−∞, a], (a, b], (b, ∞) and ∅. Note that for any disjoint sets C1, C2, · · · , Ck of F and C = C1 ∪ C2 ∪ · · · ∪ Ck, we have

Pr{ξ ∈ C} = ∑_{j=1}^k Pr{ξ ∈ Cj} = ∑_{j=1}^k Pr{η ∈ Cj} = Pr{η ∈ C}.
That is, C ∈ C. Hence we have F ⊂ C. Since the smallest σ-algebra containing F is just the Borel algebra of ℜ, the monotone class theorem implies that C contains all Borel sets of ℜ.

Theorem 1.13 Let ξ and η be two random variables whose probability density functions exist. Then ξ and η are identically distributed if and only if they have the same probability density function.

Proof: It follows from Theorem 1.12 that the random variables ξ and η are identically distributed if and only if they have the same probability distribution, which in turn holds if and only if they have the same probability density function.
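Definition 1.16 can be replayed numerically in the spirit of Example 1.9: the sketch below builds two pointwise different random variables on a two-point space and checks that their distributions coincide at several test points:

```python
# A sketch of identical distribution (Example 1.9 / Theorem 1.12): two
# different random variables on a two-point space with the same distribution.
pr = {"w1": 0.5, "w2": 0.5}
xi = {"w1": -1, "w2": 1}
eta = {"w1": 1, "w2": -1}

def cdf(rv, x):
    """Phi(x) = Pr{rv <= x}."""
    return sum(pr[w] for w in pr if rv[w] <= x)

assert any(xi[w] != eta[w] for w in pr)         # xi != eta pointwise
for x in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0):
    assert cdf(xi, x) == cdf(eta, x)            # yet identically distributed
```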
1.6 Expected Value

Definition 1.17 Let ξ be a random variable. Then the expected value of ξ is defined by

E[ξ] = ∫_0^{+∞} Pr{ξ ≥ r} dr − ∫_{−∞}^0 Pr{ξ ≤ r} dr   (1.25)

provided that at least one of the two integrals is finite.

Example 1.11: Let ξ be a uniformly distributed random variable U(a, b). If a ≥ 0, then Pr{ξ ≤ r} ≡ 0 when r ≤ 0, and

Pr{ξ ≥ r} = 1 if r ≤ a;  (b − r)/(b − a) if a ≤ r ≤ b;  0 if r ≥ b,

E[ξ] = ( ∫_0^a 1 dr + ∫_a^b (b − r)/(b − a) dr + ∫_b^{+∞} 0 dr ) − ∫_{−∞}^0 0 dr = (a + b)/2.

If b ≤ 0, then Pr{ξ ≥ r} ≡ 0 when r ≥ 0, and

Pr{ξ ≤ r} = 1 if r ≥ b;  (r − a)/(b − a) if a ≤ r ≤ b;  0 if r ≤ a,

E[ξ] = ∫_0^{+∞} 0 dr − ( ∫_{−∞}^a 0 dr + ∫_a^b (r − a)/(b − a) dr + ∫_b^0 1 dr ) = (a + b)/2.

If a < 0 < b, then

Pr{ξ ≥ r} = (b − r)/(b − a) if 0 ≤ r ≤ b;  0 if r ≥ b,
Pr{ξ ≤ r} = 0 if r ≤ a;  (r − a)/(b − a) if a ≤ r ≤ 0,

E[ξ] = ( ∫_0^b (b − r)/(b − a) dr + ∫_b^{+∞} 0 dr ) − ( ∫_{−∞}^a 0 dr + ∫_a^0 (r − a)/(b − a) dr ) = (a + b)/2.

Thus we always have the expected value (a + b)/2.

Example 1.12: Let ξ be an exponentially distributed random variable EXP(β). Then its expected value is β.

Example 1.13: Let ξ be a normally distributed random variable N(µ, σ²). Then its expected value is µ.

Example 1.14: Assume that ξ is a discrete random variable taking values xi with probabilities pi, i = 1, 2, · · · , m, respectively. It follows from the definition of the expected value operator that

E[ξ] = ∑_{i=1}^m p_i x_i.
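Definition 1.17 can also be evaluated numerically. The sketch below integrates Pr{ξ ≥ r} for ξ ~ U(a, b) with hypothetical endpoints a = 2, b = 6 using the midpoint rule, and recovers (a + b)/2 = 4:

```python
# A numeric sketch of Definition 1.17 for xi ~ U(2, 6): since a >= 0,
# E[xi] = int_0^inf Pr{xi >= r} dr, which should equal (a + b)/2 = 4.
a, b = 2.0, 6.0

def survival(r):
    """Pr{xi >= r} for the uniform distribution U(a, b)."""
    if r <= a:
        return 1.0
    if r >= b:
        return 0.0
    return (b - r) / (b - a)

n = 100_000
h = b / n                                   # the integrand vanishes beyond b
integral = sum(survival((i + 0.5) * h) for i in range(n)) * h  # midpoint rule
assert abs(integral - (a + b) / 2) < 1e-5
```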
Theorem 1.14 Let ξ be a random variable whose probability density function φ exists. If the Lebesgue integral

∫_{−∞}^{+∞} x φ(x) dx

is finite, then we have

E[ξ] = ∫_{−∞}^{+∞} x φ(x) dx.   (1.26)

Proof: It follows from Definition 1.17 and the Fubini theorem that

E[ξ] = ∫_0^{+∞} Pr{ξ ≥ r} dr − ∫_{−∞}^0 Pr{ξ ≤ r} dr
     = ∫_0^{+∞} [ ∫_r^{+∞} φ(x) dx ] dr − ∫_{−∞}^0 [ ∫_{−∞}^r φ(x) dx ] dr
     = ∫_0^{+∞} [ ∫_0^x φ(x) dr ] dx − ∫_{−∞}^0 [ ∫_x^0 φ(x) dr ] dx
     = ∫_0^{+∞} x φ(x) dx + ∫_{−∞}^0 x φ(x) dx
     = ∫_{−∞}^{+∞} x φ(x) dx.

The theorem is proved.

Theorem 1.15 Let ξ be a random variable with probability distribution Φ. If the Lebesgue-Stieltjes integral

∫_{−∞}^{+∞} x dΦ(x)

is finite, then we have

E[ξ] = ∫_{−∞}^{+∞} x dΦ(x).   (1.27)
Proof: Since the Lebesgue-Stieltjes integral ∫_{−∞}^{+∞} x dΦ(x) is finite, we immediately have

lim_{y→+∞} ∫_0^y x dΦ(x) = ∫_0^{+∞} x dΦ(x),   lim_{y→−∞} ∫_y^0 x dΦ(x) = ∫_{−∞}^0 x dΦ(x)

and

lim_{y→+∞} ∫_y^{+∞} x dΦ(x) = 0,   lim_{y→−∞} ∫_{−∞}^y x dΦ(x) = 0.

It follows from

∫_y^{+∞} x dΦ(x) ≥ y ( lim_{z→+∞} Φ(z) − Φ(y) ) = y(1 − Φ(y)) ≥ 0,  if y > 0,

∫_{−∞}^y x dΦ(x) ≤ y ( Φ(y) − lim_{z→−∞} Φ(z) ) = yΦ(y) ≤ 0,  if y < 0

that

lim_{y→+∞} y(1 − Φ(y)) = 0,   lim_{y→−∞} yΦ(y) = 0.

Let 0 = x0 < x1 < x2 < · · · < xn = y be a partition of [0, y]. Then we have

∑_{i=0}^{n−1} x_i (Φ(x_{i+1}) − Φ(x_i)) → ∫_0^y x dΦ(x)

and

∑_{i=0}^{n−1} (1 − Φ(x_{i+1}))(x_{i+1} − x_i) → ∫_0^y Pr{ξ ≥ r} dr

as max{|x_{i+1} − x_i| : i = 0, 1, · · · , n − 1} → 0. Since

∑_{i=0}^{n−1} x_i (Φ(x_{i+1}) − Φ(x_i)) − ∑_{i=0}^{n−1} (1 − Φ(x_{i+1}))(x_{i+1} − x_i) = y(Φ(y) − 1) → 0

as y → +∞, this fact implies that

∫_0^{+∞} Pr{ξ ≥ r} dr = ∫_0^{+∞} x dΦ(x).

A similar way may prove that

− ∫_{−∞}^0 Pr{ξ ≤ r} dr = ∫_{−∞}^0 x dΦ(x).
Thus (1.27) is verified by the above two equations. Linearity of Expected Value Operator Theorem 1.16 Let ξ and η be random variables with finite expected values. Then for any numbers a and b, we have E[aξ + bη] = aE[ξ] + bE[η].
(1.28)
Proof: Step 1: We first prove that $E[\xi + b] = E[\xi] + b$ for any real number $b$. When $b \ge 0$, we have
\[
\begin{aligned}
E[\xi + b] &= \int_0^{\infty} \Pr\{\xi + b \ge r\}\,dr - \int_{-\infty}^{0} \Pr\{\xi + b \le r\}\,dr\\
&= \int_0^{\infty} \Pr\{\xi \ge r - b\}\,dr - \int_{-\infty}^{0} \Pr\{\xi \le r - b\}\,dr\\
&= E[\xi] + \int_0^{b} \left( \Pr\{\xi \ge r - b\} + \Pr\{\xi < r - b\} \right) dr\\
&= E[\xi] + b.
\end{aligned}
\]
If $b < 0$, then we have
\[
E[\xi + b] = E[\xi] - \int_b^{0} \left( \Pr\{\xi \ge r - b\} + \Pr\{\xi < r - b\} \right) dr = E[\xi] + b.
\]
Step 2: We prove that $E[a\xi] = aE[\xi]$ for any real number $a$. If $a = 0$, then the equation $E[a\xi] = aE[\xi]$ holds trivially. If $a > 0$, we have
\[
\begin{aligned}
E[a\xi] &= \int_0^{\infty} \Pr\{a\xi \ge r\}\,dr - \int_{-\infty}^{0} \Pr\{a\xi \le r\}\,dr\\
&= \int_0^{\infty} \Pr\left\{\xi \ge \frac{r}{a}\right\} dr - \int_{-\infty}^{0} \Pr\left\{\xi \le \frac{r}{a}\right\} dr\\
&= a\int_0^{\infty} \Pr\left\{\xi \ge \frac{r}{a}\right\} d\!\left(\frac{r}{a}\right) - a\int_{-\infty}^{0} \Pr\left\{\xi \le \frac{r}{a}\right\} d\!\left(\frac{r}{a}\right)\\
&= aE[\xi].
\end{aligned}
\]
If $a < 0$, we have
\[
\begin{aligned}
E[a\xi] &= \int_0^{\infty} \Pr\{a\xi \ge r\}\,dr - \int_{-\infty}^{0} \Pr\{a\xi \le r\}\,dr\\
&= \int_0^{\infty} \Pr\left\{\xi \le \frac{r}{a}\right\} dr - \int_{-\infty}^{0} \Pr\left\{\xi \ge \frac{r}{a}\right\} dr\\
&= a\int_0^{\infty} \Pr\left\{\xi \ge \frac{r}{a}\right\} d\!\left(\frac{r}{a}\right) - a\int_{-\infty}^{0} \Pr\left\{\xi \le \frac{r}{a}\right\} d\!\left(\frac{r}{a}\right)
\end{aligned}
\]
$= aE[\xi]$.

Step 3: We prove that $E[\xi + \eta] = E[\xi] + E[\eta]$ when both $\xi$ and $\eta$ are nonnegative simple random variables taking values $a_1, a_2, \cdots, a_m$ and $b_1, b_2, \cdots, b_n$, respectively. Then $\xi + \eta$ is also a nonnegative simple random variable taking values $a_i + b_j$, $i = 1, 2, \cdots, m$, $j = 1, 2, \cdots, n$. Thus we have
\[
\begin{aligned}
E[\xi + \eta] &= \sum_{i=1}^{m} \sum_{j=1}^{n} (a_i + b_j) \Pr\{\xi = a_i, \eta = b_j\}\\
&= \sum_{i=1}^{m} \sum_{j=1}^{n} a_i \Pr\{\xi = a_i, \eta = b_j\} + \sum_{i=1}^{m} \sum_{j=1}^{n} b_j \Pr\{\xi = a_i, \eta = b_j\}\\
&= \sum_{i=1}^{m} a_i \Pr\{\xi = a_i\} + \sum_{j=1}^{n} b_j \Pr\{\eta = b_j\}
\end{aligned}
\]
$= E[\xi] + E[\eta]$.

Step 4: We prove that $E[\xi + \eta] = E[\xi] + E[\eta]$ when both $\xi$ and $\eta$ are nonnegative random variables. For every $i \ge 1$ and every $\omega \in \Omega$, we define
\[
\xi_i(\omega) =
\begin{cases}
\dfrac{k-1}{2^i}, & \text{if } \dfrac{k-1}{2^i} \le \xi(\omega) < \dfrac{k}{2^i},\ k = 1, 2, \cdots, i2^i\\[1ex]
i, & \text{if } i \le \xi(\omega),
\end{cases}
\]
\[
\eta_i(\omega) =
\begin{cases}
\dfrac{k-1}{2^i}, & \text{if } \dfrac{k-1}{2^i} \le \eta(\omega) < \dfrac{k}{2^i},\ k = 1, 2, \cdots, i2^i\\[1ex]
i, & \text{if } i \le \eta(\omega).
\end{cases}
\]
Then $\{\xi_i\}$, $\{\eta_i\}$ and $\{\xi_i + \eta_i\}$ are three sequences of nonnegative simple random variables such that $\xi_i \uparrow \xi$, $\eta_i \uparrow \eta$ and $\xi_i + \eta_i \uparrow \xi + \eta$ as $i \to \infty$. Note that the functions $\Pr\{\xi_i > r\}$, $\Pr\{\eta_i > r\}$, $\Pr\{\xi_i + \eta_i > r\}$, $i = 1, 2, \cdots$ are also simple. It follows from the probability continuity theorem that
\[
\Pr\{\xi_i > r\} \uparrow \Pr\{\xi > r\}, \quad \forall r \ge 0
\]
as $i \to \infty$. Since the expected value $E[\xi]$ exists, we have
\[
E[\xi_i] = \int_0^{+\infty} \Pr\{\xi_i > r\}\,dr \to \int_0^{+\infty} \Pr\{\xi > r\}\,dr = E[\xi]
\]
as $i \to \infty$. Similarly, we may prove that $E[\eta_i] \to E[\eta]$ and $E[\xi_i + \eta_i] \to E[\xi + \eta]$ as $i \to \infty$. It follows from Step 3 that $E[\xi + \eta] = E[\xi] + E[\eta]$.

Step 5: We prove that $E[\xi + \eta] = E[\xi] + E[\eta]$ when $\xi$ and $\eta$ are arbitrary random variables. Define
\[
\xi_i(\omega) =
\begin{cases}
\xi(\omega), & \text{if } \xi(\omega) \ge -i\\
-i, & \text{otherwise},
\end{cases}
\qquad
\eta_i(\omega) =
\begin{cases}
\eta(\omega), & \text{if } \eta(\omega) \ge -i\\
-i, & \text{otherwise}.
\end{cases}
\]
Since the expected values $E[\xi]$ and $E[\eta]$ are finite, we have
\[
\lim_{i \to \infty} E[\xi_i] = E[\xi], \qquad \lim_{i \to \infty} E[\eta_i] = E[\eta], \qquad \lim_{i \to \infty} E[\xi_i + \eta_i] = E[\xi + \eta].
\]
Note that $(\xi_i + i)$ and $(\eta_i + i)$ are nonnegative random variables. It follows from Steps 1 and 4 that
\[
\begin{aligned}
E[\xi + \eta] &= \lim_{i \to \infty} E[\xi_i + \eta_i]\\
&= \lim_{i \to \infty} \left( E[(\xi_i + i) + (\eta_i + i)] - 2i \right)\\
&= \lim_{i \to \infty} \left( E[\xi_i + i] + E[\eta_i + i] - 2i \right)\\
&= \lim_{i \to \infty} \left( E[\xi_i] + i + E[\eta_i] + i - 2i \right)\\
&= \lim_{i \to \infty} E[\xi_i] + \lim_{i \to \infty} E[\eta_i]\\
&= E[\xi] + E[\eta].
\end{aligned}
\]

Step 6: The linearity $E[a\xi + b\eta] = aE[\xi] + bE[\eta]$ follows immediately from Steps 2 and 5. The theorem is proved.
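Theorem 1.16 invites a quick numerical sanity check. The sketch below is our own illustration in plain Python (not from the book); $\eta$ is deliberately constructed to depend on $\xi$, to stress that linearity requires no independence assumption:

```python
import random

random.seed(2)
n = 200_000
a, b = 3.0, -2.0

total = 0.0
for _ in range(n):
    xi = random.expovariate(1.0)         # E[xi] = 1
    eta = xi + random.uniform(0.0, 1.0)  # dependent on xi; E[eta] = 1.5
    total += a * xi + b * eta

estimate = total / n
theory = a * 1.0 + b * 1.5  # aE[xi] + bE[eta] = 0.0
print(estimate, theory)     # estimate is close to 0.0
```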
Product of Independent Random Variables

Theorem 1.17 Let $\xi$ and $\eta$ be independent random variables with finite expected values. Then the expected value of $\xi\eta$ exists and
\[
E[\xi\eta] = E[\xi]E[\eta]. \tag{1.29}
\]

Proof: Step 1: We first prove the case where both $\xi$ and $\eta$ are nonnegative simple random variables taking values $a_1, a_2, \cdots, a_m$ and $b_1, b_2, \cdots, b_n$, respectively. Then $\xi\eta$ is also a nonnegative simple random variable taking values $a_i b_j$, $i = 1, 2, \cdots, m$, $j = 1, 2, \cdots, n$. It follows from the independence of $\xi$ and $\eta$ that
\[
\begin{aligned}
E[\xi\eta] &= \sum_{i=1}^{m} \sum_{j=1}^{n} a_i b_j \Pr\{\xi = a_i, \eta = b_j\}\\
&= \sum_{i=1}^{m} \sum_{j=1}^{n} a_i b_j \Pr\{\xi = a_i\} \Pr\{\eta = b_j\}\\
&= \left( \sum_{i=1}^{m} a_i \Pr\{\xi = a_i\} \right) \left( \sum_{j=1}^{n} b_j \Pr\{\eta = b_j\} \right)
\end{aligned}
\]
$= E[\xi]E[\eta]$.

Step 2: Next we prove the case where $\xi$ and $\eta$ are nonnegative random variables. For every $i \ge 1$ and every $\omega \in \Omega$, we define
\[
\xi_i(\omega) =
\begin{cases}
\dfrac{k-1}{2^i}, & \text{if } \dfrac{k-1}{2^i} \le \xi(\omega) < \dfrac{k}{2^i},\ k = 1, 2, \cdots, i2^i\\[1ex]
i, & \text{if } i \le \xi(\omega),
\end{cases}
\qquad
\eta_i(\omega) =
\begin{cases}
\dfrac{k-1}{2^i}, & \text{if } \dfrac{k-1}{2^i} \le \eta(\omega) < \dfrac{k}{2^i},\ k = 1, 2, \cdots, i2^i\\[1ex]
i, & \text{if } i \le \eta(\omega).
\end{cases}
\]
Then $\{\xi_i\}$, $\{\eta_i\}$ and $\{\xi_i\eta_i\}$ are three sequences of nonnegative simple random variables such that $\xi_i \uparrow \xi$, $\eta_i \uparrow \eta$ and $\xi_i\eta_i \uparrow \xi\eta$ as $i \to \infty$. It follows from the independence of $\xi$ and $\eta$ that $\xi_i$ and $\eta_i$ are independent. Hence we have $E[\xi_i\eta_i] = E[\xi_i]E[\eta_i]$ for $i = 1, 2, \cdots$. It follows from the probability continuity theorem that $\Pr\{\xi_i > r\}$, $i = 1, 2, \cdots$ are simple functions such that $\Pr\{\xi_i > r\} \uparrow \Pr\{\xi > r\}$ for all $r \ge 0$ as $i \to \infty$. Since the expected value $E[\xi]$ exists, we have
\[
E[\xi_i] = \int_0^{+\infty} \Pr\{\xi_i > r\}\,dr \to \int_0^{+\infty} \Pr\{\xi > r\}\,dr = E[\xi]
\]
as $i \to \infty$. Similarly, we may prove that $E[\eta_i] \to E[\eta]$ and $E[\xi_i\eta_i] \to E[\xi\eta]$ as $i \to \infty$. Therefore $E[\xi\eta] = E[\xi]E[\eta]$.
Step 3: Finally, if $\xi$ and $\eta$ are arbitrary independent random variables, then the nonnegative random variables $\xi^+$ and $\eta^+$ are independent, and so are $\xi^+$ and $\eta^-$, $\xi^-$ and $\eta^+$, $\xi^-$ and $\eta^-$. Thus we have
\[
E[\xi^+\eta^+] = E[\xi^+]E[\eta^+], \quad E[\xi^+\eta^-] = E[\xi^+]E[\eta^-], \quad E[\xi^-\eta^+] = E[\xi^-]E[\eta^+], \quad E[\xi^-\eta^-] = E[\xi^-]E[\eta^-].
\]
It follows that
\[
\begin{aligned}
E[\xi\eta] &= E[(\xi^+ - \xi^-)(\eta^+ - \eta^-)]\\
&= E[\xi^+\eta^+] - E[\xi^+\eta^-] - E[\xi^-\eta^+] + E[\xi^-\eta^-]\\
&= E[\xi^+]E[\eta^+] - E[\xi^+]E[\eta^-] - E[\xi^-]E[\eta^+] + E[\xi^-]E[\eta^-]\\
&= \left( E[\xi^+] - E[\xi^-] \right) \left( E[\eta^+] - E[\eta^-] \right)\\
&= E[\xi^+ - \xi^-]E[\eta^+ - \eta^-] = E[\xi]E[\eta]
\end{aligned}
\]
which proves the theorem.

Expected Value of Function of Random Variable

Theorem 1.18 Let $\xi$ be a random variable with probability distribution $\Phi$, and $f: \Re \to \Re$ a measurable function. If the Lebesgue-Stieltjes integral
\[
\int_{-\infty}^{+\infty} f(x)\,d\Phi(x)
\]
is finite, then we have
\[
E[f(\xi)] = \int_{-\infty}^{+\infty} f(x)\,d\Phi(x). \tag{1.30}
\]
Proof: It follows from the definition of expected value operator that
\[
E[f(\xi)] = \int_0^{+\infty} \Pr\{f(\xi) \ge r\}\,dr - \int_{-\infty}^{0} \Pr\{f(\xi) \le r\}\,dr. \tag{1.31}
\]
If $f$ is a nonnegative simple measurable function, i.e.,
\[
f(x) =
\begin{cases}
a_1, & \text{if } x \in B_1\\
a_2, & \text{if } x \in B_2\\
\;\vdots\\
a_m, & \text{if } x \in B_m
\end{cases}
\]
where $B_1, B_2, \cdots, B_m$ are mutually disjoint Borel sets, then we have
\[
E[f(\xi)] = \int_0^{+\infty} \Pr\{f(\xi) \ge r\}\,dr = \sum_{i=1}^{m} a_i \Pr\{\xi \in B_i\} = \sum_{i=1}^{m} a_i \int_{B_i} d\Phi(x) = \int_{-\infty}^{+\infty} f(x)\,d\Phi(x).
\]
We next prove the case where $f$ is a nonnegative measurable function. Let $f_1, f_2, \cdots$ be a sequence of nonnegative simple functions such that $f_i \uparrow f$ as $i \to \infty$. We have proved that
\[
E[f_i(\xi)] = \int_0^{+\infty} \Pr\{f_i(\xi) \ge r\}\,dr = \int_{-\infty}^{+\infty} f_i(x)\,d\Phi(x).
\]
In addition, it is also known that $\Pr\{f_i(\xi) > r\} \uparrow \Pr\{f(\xi) > r\}$ as $i \to \infty$ for each $r \ge 0$. It follows from the monotone convergence theorem that
\[
\begin{aligned}
E[f(\xi)] &= \int_0^{+\infty} \Pr\{f(\xi) > r\}\,dr\\
&= \lim_{i \to \infty} \int_0^{+\infty} \Pr\{f_i(\xi) > r\}\,dr\\
&= \lim_{i \to \infty} \int_{-\infty}^{+\infty} f_i(x)\,d\Phi(x)\\
&= \int_{-\infty}^{+\infty} f(x)\,d\Phi(x).
\end{aligned}
\]
Finally, if $f$ is an arbitrary measurable function, then we have $f = f^+ - f^-$ and
\[
\begin{aligned}
E[f(\xi)] &= E[f^+(\xi) - f^-(\xi)] = E[f^+(\xi)] - E[f^-(\xi)]\\
&= \int_{-\infty}^{+\infty} f^+(x)\,d\Phi(x) - \int_{-\infty}^{+\infty} f^-(x)\,d\Phi(x)\\
&= \int_{-\infty}^{+\infty} f(x)\,d\Phi(x).
\end{aligned}
\]
The theorem is proved.

Sum of a Random Number of Random Variables

Theorem 1.19 (Wald Identity) Assume that $\{\xi_i\}$ is a sequence of iid random variables, and $\eta$ is a positive random integer (i.e., a random variable taking "positive integer" values) that is independent of the sequence $\{\xi_i\}$. Then we have
\[
E\left[ \sum_{i=1}^{\eta} \xi_i \right] = E[\eta]E[\xi_1]. \tag{1.32}
\]

Proof: Since $\eta$ is independent of the sequence $\{\xi_i\}$, we have
\[
\Pr\left\{ \sum_{i=1}^{\eta} \xi_i \ge r \right\} = \sum_{k=1}^{\infty} \Pr\{\eta = k\} \Pr\{\xi_1 + \xi_2 + \cdots + \xi_k \ge r\}.
\]
If $\xi_i$ are nonnegative random variables, then we have
\[
\begin{aligned}
E\left[ \sum_{i=1}^{\eta} \xi_i \right] &= \int_0^{+\infty} \Pr\left\{ \sum_{i=1}^{\eta} \xi_i \ge r \right\} dr\\
&= \int_0^{+\infty} \sum_{k=1}^{\infty} \Pr\{\eta = k\} \Pr\{\xi_1 + \xi_2 + \cdots + \xi_k \ge r\}\,dr\\
&= \sum_{k=1}^{\infty} \Pr\{\eta = k\} \int_0^{+\infty} \Pr\{\xi_1 + \xi_2 + \cdots + \xi_k \ge r\}\,dr\\
&= \sum_{k=1}^{\infty} \Pr\{\eta = k\} \left( E[\xi_1] + E[\xi_2] + \cdots + E[\xi_k] \right)\\
&= \sum_{k=1}^{\infty} \Pr\{\eta = k\}\, kE[\xi_1] \qquad \text{(by the iid hypothesis)}\\
&= E[\eta]E[\xi_1].
\end{aligned}
\]
If $\xi_i$ are arbitrary random variables, then $\xi_i = \xi_i^+ - \xi_i^-$, and
\[
\begin{aligned}
E\left[ \sum_{i=1}^{\eta} \xi_i \right] &= E\left[ \sum_{i=1}^{\eta} (\xi_i^+ - \xi_i^-) \right] = E\left[ \sum_{i=1}^{\eta} \xi_i^+ - \sum_{i=1}^{\eta} \xi_i^- \right]\\
&= E\left[ \sum_{i=1}^{\eta} \xi_i^+ \right] - E\left[ \sum_{i=1}^{\eta} \xi_i^- \right] = E[\eta]E[\xi_1^+] - E[\eta]E[\xi_1^-]\\
&= E[\eta]\left( E[\xi_1^+] - E[\xi_1^-] \right) = E[\eta]E[\xi_1^+ - \xi_1^-] = E[\eta]E[\xi_1].
\end{aligned}
\]
The theorem is thus proved.
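The Wald identity can also be illustrated by simulation. In the sketch below (our own choice of distributions, not the book's), $\eta$ is geometric on $\{1, 2, \cdots\}$ with $E[\eta] = 1/p$ and the $\xi_i$ are exponential with mean $\beta$, drawn independently of $\eta$:

```python
import random

random.seed(3)
p, beta = 0.25, 2.0  # E[eta] = 1/p = 4, E[xi_1] = beta = 2
n = 100_000

total = 0.0
for _ in range(n):
    # eta: number of Bernoulli(p) trials until the first success (support 1, 2, ...)
    eta = 1
    while random.random() >= p:
        eta += 1
    # sum of eta iid exponential terms, drawn independently of eta
    total += sum(random.expovariate(1.0 / beta) for _ in range(eta))

estimate = total / n
wald = (1.0 / p) * beta  # E[eta] * E[xi_1] = 8
print(estimate, wald)    # estimate is close to 8.0
```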
1.7 Variance

Definition 1.18 Let $\xi$ be a random variable with finite expected value $e$. Then the variance of $\xi$ is defined by $V[\xi] = E[(\xi - e)^2]$.

The variance of a random variable provides a measure of the spread of the distribution around its expected value. A small value of variance indicates that the random variable is tightly concentrated around its expected value; a large value of variance indicates that the random variable has a wide spread around its expected value.

Example 1.15: Let $\xi$ be a uniformly distributed random variable $\mathcal{U}(a, b)$. Then its expected value is $(a+b)/2$ and variance is $(b-a)^2/12$.

Example 1.16: Let $\xi$ be an exponentially distributed random variable $\mathcal{EXP}(\beta)$. Then its expected value is $\beta$ and variance is $\beta^2$.
Example 1.17: Let $\xi$ be a normally distributed random variable $\mathcal{N}(\mu, \sigma^2)$. Then its expected value is $\mu$ and variance is $\sigma^2$.

Theorem 1.20 If $\xi$ is a random variable whose variance exists, and $a$ and $b$ are real numbers, then $V[a\xi + b] = a^2 V[\xi]$.

Proof: It follows from the definition of variance that
\[
V[a\xi + b] = E\left[ (a\xi + b - aE[\xi] - b)^2 \right] = a^2 E[(\xi - E[\xi])^2] = a^2 V[\xi].
\]

Theorem 1.21 Let $\xi$ be a random variable with expected value $e$. Then $V[\xi] = 0$ if and only if $\Pr\{\xi = e\} = 1$.

Proof: If $V[\xi] = 0$, then $E[(\xi - e)^2] = 0$. Thus we have
\[
\int_0^{+\infty} \Pr\{(\xi - e)^2 \ge r\}\,dr = 0
\]
which implies $\Pr\{(\xi - e)^2 \ge r\} = 0$ for any $r > 0$. Hence we have $\Pr\{(\xi - e)^2 = 0\} = 1$, i.e., $\Pr\{\xi = e\} = 1$. Conversely, if $\Pr\{\xi = e\} = 1$, then we have $\Pr\{(\xi - e)^2 = 0\} = 1$ and $\Pr\{(\xi - e)^2 \ge r\} = 0$ for any $r > 0$. Thus
\[
V[\xi] = \int_0^{+\infty} \Pr\{(\xi - e)^2 \ge r\}\,dr = 0.
\]
Theorem 1.22 If $\xi_1, \xi_2, \cdots, \xi_n$ are independent random variables with finite expected values, then
\[
V[\xi_1 + \xi_2 + \cdots + \xi_n] = V[\xi_1] + V[\xi_2] + \cdots + V[\xi_n]. \tag{1.33}
\]

Proof: It follows from the definition of variance that
\[
\begin{aligned}
V\left[ \sum_{i=1}^{n} \xi_i \right] &= E\left[ \left( \xi_1 + \xi_2 + \cdots + \xi_n - E[\xi_1] - E[\xi_2] - \cdots - E[\xi_n] \right)^2 \right]\\
&= \sum_{i=1}^{n} E\left[ (\xi_i - E[\xi_i])^2 \right] + 2\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E\left[ (\xi_i - E[\xi_i])(\xi_j - E[\xi_j]) \right].
\end{aligned}
\]
Since $\xi_1, \xi_2, \cdots, \xi_n$ are independent, Theorem 1.17 gives $E[(\xi_i - E[\xi_i])(\xi_j - E[\xi_j])] = 0$ for all $i, j$ with $i \ne j$. Thus (1.33) holds.

Maximum Variance Theorem

Let $\xi$ be a random variable that takes values in $[a, b]$, but whose probability distribution is otherwise arbitrary. If its expected value is given, what is the possible maximum variance? The maximum variance theorem will answer this question, thus playing an important role in treating games against nature.
Theorem 1.23 (Edmundson-Madansky Inequality) Let $f$ be a convex function on $[a, b]$, and $\xi$ a random variable that takes values in $[a, b]$ and has expected value $e$. Then
\[
E[f(\xi)] \le \frac{b-e}{b-a} f(a) + \frac{e-a}{b-a} f(b). \tag{1.34}
\]

Proof: For each $\omega \in \Omega$, we have $a \le \xi(\omega) \le b$ and
\[
\xi(\omega) = \frac{b - \xi(\omega)}{b-a}\, a + \frac{\xi(\omega) - a}{b-a}\, b.
\]
It follows from the convexity of $f$ that
\[
f(\xi(\omega)) \le \frac{b - \xi(\omega)}{b-a} f(a) + \frac{\xi(\omega) - a}{b-a} f(b).
\]
Taking expected values on both sides, we obtain (1.34).

Theorem 1.24 (Maximum Variance Theorem) Let $\xi$ be a random variable that takes values in $[a, b]$ and has expected value $e$. Then
\[
V[\xi] \le (e-a)(b-e) \tag{1.35}
\]
and equality holds if the random variable $\xi$ is determined by
\[
\Pr\{\xi = x\} =
\begin{cases}
\dfrac{b-e}{b-a}, & \text{if } x = a\\[1ex]
\dfrac{e-a}{b-a}, & \text{if } x = b.
\end{cases} \tag{1.36}
\]
Proof: It follows from Theorem 1.23 immediately by defining f (x) = (x−e)2 . It is also easy to verify that the random variable determined by (1.36) has variance (e − a)(b − e). The theorem is proved.
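Theorem 1.24 can be probed numerically (our own sketch, not the book's): the two-point distribution (1.36) attains the bound $(e-a)(b-e)$, while another distribution on $[a, b]$ with the same mean stays below it.

```python
def mean_var(values, probs):
    # mean and variance of a discrete distribution
    e = sum(p * x for p, x in zip(probs, values))
    v = sum(p * (x - e) ** 2 for p, x in zip(probs, values))
    return e, v

a, b, e = 0.0, 10.0, 3.0
bound = (e - a) * (b - e)  # 21.0

# Extremal two-point distribution (1.36): mass only at the endpoints
_, v_extremal = mean_var([a, b], [(b - e) / (b - a), (e - a) / (b - a)])

# Some other distribution on [a, b] with the same mean e
_, v_other = mean_var([1.0, 3.0, 5.0], [0.25, 0.5, 0.25])

print(v_extremal, bound)  # 21.0 attains the bound
print(v_other)            # 2.0, well below the bound
```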
1.8 Moments

Definition 1.19 Let $\xi$ be a random variable, and $k$ a positive number. Then
(a) the expected value $E[\xi^k]$ is called the $k$th moment;
(b) the expected value $E[|\xi|^k]$ is called the $k$th absolute moment;
(c) the expected value $E[(\xi - E[\xi])^k]$ is called the $k$th central moment;
(d) the expected value $E[|\xi - E[\xi]|^k]$ is called the $k$th absolute central moment.

Note that the first central moment is always 0, the first moment is just the expected value, and the second central moment is just the variance.

Theorem 1.25 Let $\xi$ be a nonnegative random variable, and $k$ a positive number. Then the $k$th moment is
\[
E[\xi^k] = k \int_0^{+\infty} r^{k-1} \Pr\{\xi \ge r\}\,dr. \tag{1.37}
\]
Proof: It follows from the nonnegativity of $\xi$ that
\[
E[\xi^k] = \int_0^{\infty} \Pr\{\xi^k \ge x\}\,dx = \int_0^{\infty} \Pr\{\xi \ge r\}\,dr^k = k \int_0^{\infty} r^{k-1} \Pr\{\xi \ge r\}\,dr.
\]
The theorem is proved.

Theorem 1.26 Let $\xi$ be a random variable that takes values in $[a, b]$ and has expected value $e$. Then for any positive integer $k$, the $k$th absolute moment and $k$th absolute central moment satisfy the following inequalities,
\[
E[|\xi|^k] \le \frac{b-e}{b-a} |a|^k + \frac{e-a}{b-a} |b|^k, \tag{1.38}
\]
\[
E[|\xi - e|^k] \le \frac{b-e}{b-a} (e-a)^k + \frac{e-a}{b-a} (b-e)^k. \tag{1.39}
\]

Proof: It follows from Theorem 1.23 immediately by defining $f(x) = |x|^k$ and $f(x) = |x - e|^k$.
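Formula (1.37) can be verified by numerical integration. The sketch below (plain Python; the helper name and truncation point are our choices) checks the exponential case with mean $\beta$, where $\Pr\{\xi \ge r\} = e^{-r/\beta}$ and the second moment should be $2\beta^2$:

```python
import math

def kth_moment_via_survival(survival, k, upper, steps=200_000):
    # Right-hand side of (1.37): k * integral of r^(k-1) * Pr{xi >= r} dr,
    # approximated by a midpoint Riemann sum on the truncated range [0, upper]
    h = upper / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** (k - 1) * survival(r)
    return k * total * h

beta = 2.0
survival = lambda r: math.exp(-r / beta)  # Pr{xi >= r} for EXP(beta)

m2 = kth_moment_via_survival(survival, 2, upper=80.0)
print(m2, 2 * beta ** 2)  # both close to 8.0
```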
1.9 Critical Values

Let $\xi$ be a random variable. In order to measure it, we may use its expected value. Alternatively, we may employ the $\alpha$-optimistic value and $\alpha$-pessimistic value as a ranking measure.

Definition 1.20 Let $\xi$ be a random variable, and $\alpha \in (0, 1]$. Then
\[
\xi_{\sup}(\alpha) = \sup\left\{ r \mid \Pr\{\xi \ge r\} \ge \alpha \right\} \tag{1.40}
\]
is called the $\alpha$-optimistic value of $\xi$, and
\[
\xi_{\inf}(\alpha) = \inf\left\{ r \mid \Pr\{\xi \le r\} \ge \alpha \right\} \tag{1.41}
\]
is called the $\alpha$-pessimistic value of $\xi$.

This means that the random variable $\xi$ will reach upwards of the $\alpha$-optimistic value $\xi_{\sup}(\alpha)$ with probability at least $\alpha$, and will be below the $\alpha$-pessimistic value $\xi_{\inf}(\alpha)$ with probability at least $\alpha$. The optimistic value is also called the percentile.

Theorem 1.27 Let $\xi$ be a random variable. Then we have
\[
\Pr\{\xi \ge \xi_{\sup}(\alpha)\} \ge \alpha, \qquad \Pr\{\xi \le \xi_{\inf}(\alpha)\} \ge \alpha \tag{1.42}
\]
where $\xi_{\sup}(\alpha)$ and $\xi_{\inf}(\alpha)$ are the $\alpha$-optimistic and $\alpha$-pessimistic values of the random variable $\xi$, respectively.
Proof: It follows from the definition of the optimistic value that there exists an increasing sequence $\{r_i\}$ such that $\Pr\{\xi \ge r_i\} \ge \alpha$ and $r_i \uparrow \xi_{\sup}(\alpha)$ as $i \to \infty$. Since $\{\omega \mid \xi(\omega) \ge r_i\} \downarrow \{\omega \mid \xi(\omega) \ge \xi_{\sup}(\alpha)\}$, it follows from the probability continuity theorem that
\[
\Pr\{\xi \ge \xi_{\sup}(\alpha)\} = \lim_{i \to \infty} \Pr\{\xi \ge r_i\} \ge \alpha.
\]
The inequality $\Pr\{\xi \le \xi_{\inf}(\alpha)\} \ge \alpha$ may be proved similarly.

Example 1.18: Note that $\Pr\{\xi \ge \xi_{\sup}(\alpha)\} > \alpha$ and $\Pr\{\xi \le \xi_{\inf}(\alpha)\} > \alpha$ may hold. For example,
\[
\xi =
\begin{cases}
0 & \text{with probability } 0.4\\
1 & \text{with probability } 0.6.
\end{cases}
\]
If $\alpha = 0.8$, then $\xi_{\sup}(0.8) = 0$, which makes $\Pr\{\xi \ge \xi_{\sup}(0.8)\} = 1 > 0.8$. In addition, $\xi_{\inf}(0.8) = 1$ and $\Pr\{\xi \le \xi_{\inf}(0.8)\} = 1 > 0.8$.

Theorem 1.28 Let $\xi$ be a random variable. Then we have
(a) $\xi_{\inf}(\alpha)$ is an increasing and left-continuous function of $\alpha$;
(b) $\xi_{\sup}(\alpha)$ is a decreasing and left-continuous function of $\alpha$.

Proof: (a) It is easy to prove that $\xi_{\inf}(\alpha)$ is an increasing function of $\alpha$. Next, we prove the left-continuity of $\xi_{\inf}(\alpha)$ with respect to $\alpha$. Let $\{\alpha_i\}$ be an arbitrary sequence of positive numbers such that $\alpha_i \uparrow \alpha$. Then $\{\xi_{\inf}(\alpha_i)\}$ is an increasing sequence. If its limit is equal to $\xi_{\inf}(\alpha)$, then the left-continuity is proved. Otherwise, there exists a number $z^*$ such that
\[
\lim_{i \to \infty} \xi_{\inf}(\alpha_i) < z^* < \xi_{\inf}(\alpha).
\]
Thus $\Pr\{\xi \le z^*\} \ge \alpha_i$ for each $i$. Letting $i \to \infty$, we get $\Pr\{\xi \le z^*\} \ge \alpha$. Hence $z^* \ge \xi_{\inf}(\alpha)$. A contradiction proves the left-continuity of $\xi_{\inf}(\alpha)$ with respect to $\alpha$. Part (b) may be proved similarly.

Theorem 1.29 Let $\xi$ be a random variable. Then we have
(a) if $\alpha > 0.5$, then $\xi_{\inf}(\alpha) \ge \xi_{\sup}(\alpha)$;
(b) if $\alpha \le 0.5$, then $\xi_{\inf}(\alpha) \le \xi_{\sup}(\alpha)$.

Proof: Part (a): Write $\xi(\alpha) = (\xi_{\inf}(\alpha) + \xi_{\sup}(\alpha))/2$. If $\xi_{\inf}(\alpha) < \xi_{\sup}(\alpha)$, then we have
\[
1 \ge \Pr\{\xi < \xi(\alpha)\} + \Pr\{\xi > \xi(\alpha)\} \ge \alpha + \alpha > 1.
\]
A contradiction proves $\xi_{\inf}(\alpha) \ge \xi_{\sup}(\alpha)$. Part (b): Assume that $\xi_{\inf}(\alpha) > \xi_{\sup}(\alpha)$. It follows from the definition of $\xi_{\inf}(\alpha)$ that $\Pr\{\xi \le \xi(\alpha)\} < \alpha$. Similarly, it follows from the definition of $\xi_{\sup}(\alpha)$ that $\Pr\{\xi \ge \xi(\alpha)\} < \alpha$. Thus
\[
1 \le \Pr\{\xi \le \xi(\alpha)\} + \Pr\{\xi \ge \xi(\alpha)\} < \alpha + \alpha \le 1.
\]
A contradiction proves $\xi_{\inf}(\alpha) \le \xi_{\sup}(\alpha)$. The theorem is proved.
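The critical values of Example 1.18 can be reproduced by a direct search over the support of a discrete random variable (our own sketch; the function names are ours, and for a discrete variable the supremum in (1.40) is attained on the support):

```python
def optimistic(values, probs, alpha):
    # xi_sup(alpha) = sup{ r | Pr{xi >= r} >= alpha }, searched on the support
    return max(r for r in values
               if sum(p for x, p in zip(values, probs) if x >= r) >= alpha)

def pessimistic(values, probs, alpha):
    # xi_inf(alpha) = inf{ r | Pr{xi <= r} >= alpha }, searched on the support
    return min(r for r in values
               if sum(p for x, p in zip(values, probs) if x <= r) >= alpha)

values, probs = [0, 1], [0.4, 0.6]
print(optimistic(values, probs, 0.8))   # 0, as in Example 1.18
print(pessimistic(values, probs, 0.8))  # 1, as in Example 1.18
```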
Theorem 1.30 Let $\xi$ be a random variable. Then we have
(a) if $c \ge 0$, then $(c\xi)_{\sup}(\alpha) = c\xi_{\sup}(\alpha)$ and $(c\xi)_{\inf}(\alpha) = c\xi_{\inf}(\alpha)$;
(b) if $c < 0$, then $(c\xi)_{\sup}(\alpha) = c\xi_{\inf}(\alpha)$ and $(c\xi)_{\inf}(\alpha) = c\xi_{\sup}(\alpha)$.

Proof: (a) If $c = 0$, then the result is obviously valid. When $c > 0$, we have
\[
(c\xi)_{\sup}(\alpha) = \sup\{r \mid \Pr\{c\xi \ge r\} \ge \alpha\} = c \sup\{r/c \mid \Pr\{\xi \ge r/c\} \ge \alpha\} = c\xi_{\sup}(\alpha).
\]
A similar way may prove that $(c\xi)_{\inf}(\alpha) = c\xi_{\inf}(\alpha)$.
(b) In order to prove this part, it suffices to verify that $(-\xi)_{\sup}(\alpha) = -\xi_{\inf}(\alpha)$ and $(-\xi)_{\inf}(\alpha) = -\xi_{\sup}(\alpha)$. In fact, for any $\alpha \in (0, 1]$, we have
\[
(-\xi)_{\sup}(\alpha) = \sup\{r \mid \Pr\{-\xi \ge r\} \ge \alpha\} = -\inf\{-r \mid \Pr\{\xi \le -r\} \ge \alpha\} = -\xi_{\inf}(\alpha).
\]
Similarly, we may prove that $(-\xi)_{\inf}(\alpha) = -\xi_{\sup}(\alpha)$. The theorem is proved.
1.10 Entropy

Given a random variable, what is the degree of difficulty of predicting the value that the random variable will take? In order to answer this question, Shannon [207] defined a concept of entropy as a measure of uncertainty.

Entropy of Discrete Random Variables

Definition 1.21 Let $\xi$ be a discrete random variable taking values $x_i$ with probabilities $p_i$, $i = 1, 2, \cdots$, respectively. Then its entropy is defined by
\[
H[\xi] = -\sum_{i=1}^{\infty} p_i \ln p_i. \tag{1.43}
\]

It should be noticed that the entropy depends only on the number of values and their probabilities; it does not depend on the actual values that the random variable takes.

Theorem 1.31 Let $\xi$ be a discrete random variable taking values $x_i$ with probabilities $p_i$, $i = 1, 2, \cdots$, respectively. Then
\[
H[\xi] \ge 0 \tag{1.44}
\]
and equality holds if and only if there exists an index $k$ such that $p_k = 1$, i.e., $\xi$ is essentially a deterministic number.
Figure 1.1: Function $S(t) = -t\ln t$ is concave

Proof: The nonnegativity is clear. In addition, $H[\xi] = 0$ if and only if $p_i = 0$ or $1$ for each $i$. That is, there exists one and only one index $k$ such that $p_k = 1$. The theorem is proved.

This theorem states that the entropy of a discrete random variable reaches its minimum 0 when the random variable degenerates to a deterministic number. In this case, there is no uncertainty.

Theorem 1.32 Let $\xi$ be a simple random variable taking values $x_i$ with probabilities $p_i$, $i = 1, 2, \cdots, n$, respectively. Then
\[
H[\xi] \le \ln n \tag{1.45}
\]
and equality holds if and only if $p_i \equiv 1/n$ for all $i = 1, 2, \cdots, n$.

Proof: Since the function $S(t) = -t\ln t$ is a concave function of $t$ and $p_1 + p_2 + \cdots + p_n = 1$, we have
\[
-\sum_{i=1}^{n} p_i \ln p_i \le -n \left( \frac{1}{n} \sum_{i=1}^{n} p_i \right) \ln \left( \frac{1}{n} \sum_{i=1}^{n} p_i \right) = \ln n
\]
which implies that $H[\xi] \le \ln n$, and equality holds if and only if $p_1 = p_2 = \cdots = p_n$, i.e., $p_i \equiv 1/n$ for all $i = 1, 2, \cdots, n$.

This theorem states that the entropy of a simple random variable reaches its maximum $\ln n$ when all outcomes are equiprobable. In this case, there is no preference among all the values that the random variable will take.

Entropy of Absolutely Continuous Random Variables

Definition 1.22 Let $\xi$ be a random variable with probability density function $\phi$. Then its entropy is defined by
\[
H[\xi] = -\int_{-\infty}^{+\infty} \phi(x) \ln \phi(x)\,dx. \tag{1.46}
\]
Example 1.19: Let $\xi$ be a uniformly distributed random variable on $[a, b]$. Then its entropy is $H[\xi] = \ln(b-a)$. This example shows that the entropy of an absolutely continuous random variable may assume both positive and negative values, since $\ln(b-a) < 0$ if $b-a < 1$ and $\ln(b-a) > 0$ if $b-a > 1$.

Example 1.20: Let $\xi$ be an exponentially distributed random variable with expected value $\beta$. Then its entropy is $H[\xi] = 1 + \ln\beta$.

Example 1.21: Let $\xi$ be a normally distributed random variable with expected value $e$ and variance $\sigma^2$. Then its entropy is $H[\xi] = 1/2 + \ln\sqrt{2\pi}\sigma$.

Theorem 1.33 Let $\xi$ be an absolutely continuous random variable. Then $H[a\xi + b] = H[\xi] + \ln|a|$ for any real numbers $a \ne 0$ and $b$.

Proof: It follows immediately from the definition, since the probability density function of $a\xi + b$ is $\phi((x-b)/a)/|a|$.

Maximum Entropy Principle

Given some constraints, for example, expected value and variance, there are usually multiple compatible probability distributions. In this case, we would like to select the distribution that maximizes the value of entropy and satisfies the prescribed constraints. This method is often referred to as the maximum entropy principle (Jaynes [67]).

Example 1.22: Let $\xi$ be an absolutely continuous random variable on $[a, b]$. The maximum entropy principle attempts to find the probability density function $\phi(x)$ that maximizes the entropy
\[
-\int_a^b \phi(x) \ln \phi(x)\,dx
\]
subject to the natural constraint $\int_a^b \phi(x)\,dx = 1$. The Lagrangian is
\[
L = -\int_a^b \phi(x) \ln \phi(x)\,dx - \lambda \left( \int_a^b \phi(x)\,dx - 1 \right).
\]
It follows from the Euler-Lagrange equation that the maximum entropy probability density function meets $\ln \phi(x) + 1 + \lambda = 0$ and has the form $\phi(x) = \exp(-1 - \lambda)$. Substituting it into the natural constraint, we get
\[
\phi^*(x) = \frac{1}{b-a}, \quad a \le x \le b,
\]
which is just the uniformly distributed random variable, and the maximum entropy is $H[\xi^*] = \ln(b-a)$.
Example 1.23: Let $\xi$ be an absolutely continuous random variable on $[0, \infty)$. Assume that the expected value of $\xi$ is prescribed to be $\beta$. The maximum entropy probability density function $\phi(x)$ should maximize the entropy
\[
-\int_0^{+\infty} \phi(x) \ln \phi(x)\,dx
\]
subject to the constraints
\[
\int_0^{+\infty} \phi(x)\,dx = 1, \qquad \int_0^{+\infty} x\phi(x)\,dx = \beta.
\]
The Lagrangian is
\[
L = -\int_0^{\infty} \phi(x) \ln \phi(x)\,dx - \lambda_1 \left( \int_0^{\infty} \phi(x)\,dx - 1 \right) - \lambda_2 \left( \int_0^{\infty} x\phi(x)\,dx - \beta \right).
\]
The maximum entropy probability density function meets the Euler-Lagrange equation $\ln \phi(x) + 1 + \lambda_1 + \lambda_2 x = 0$ and has the form $\phi(x) = \exp(-1 - \lambda_1 - \lambda_2 x)$. Substituting it into the constraints, we get
\[
\phi^*(x) = \frac{1}{\beta} \exp\left( -\frac{x}{\beta} \right), \quad x \ge 0,
\]
which is just the exponentially distributed random variable, and the maximum entropy is $H[\xi^*] = 1 + \ln\beta$.

Example 1.24: Let $\xi$ be an absolutely continuous random variable on $(-\infty, +\infty)$. Assume that the expected value and variance of $\xi$ are prescribed to be $\mu$ and $\sigma^2$, respectively. The maximum entropy probability density function $\phi(x)$ should maximize the entropy
\[
-\int_{-\infty}^{+\infty} \phi(x) \ln \phi(x)\,dx
\]
subject to the constraints
\[
\int_{-\infty}^{+\infty} \phi(x)\,dx = 1, \qquad \int_{-\infty}^{+\infty} x\phi(x)\,dx = \mu, \qquad \int_{-\infty}^{+\infty} (x-\mu)^2 \phi(x)\,dx = \sigma^2.
\]
The Lagrangian is
\[
L = -\int_{-\infty}^{+\infty} \phi(x) \ln \phi(x)\,dx - \lambda_1 \left( \int_{-\infty}^{+\infty} \phi(x)\,dx - 1 \right) - \lambda_2 \left( \int_{-\infty}^{+\infty} x\phi(x)\,dx - \mu \right) - \lambda_3 \left( \int_{-\infty}^{+\infty} (x-\mu)^2 \phi(x)\,dx - \sigma^2 \right).
\]
The maximum entropy probability density function meets the Euler-Lagrange equation $\ln \phi(x) + 1 + \lambda_1 + \lambda_2 x + \lambda_3 (x-\mu)^2 = 0$ and has the form $\phi(x) = \exp(-1 - \lambda_1 - \lambda_2 x - \lambda_3 (x-\mu)^2)$. Substituting it into the constraints, we get
\[
\phi^*(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right), \quad x \in \Re,
\]
which is just the normally distributed random variable, and the maximum entropy is $H[\xi^*] = 1/2 + \ln\sqrt{2\pi}\sigma$.
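The closed-form entropies above can be cross-checked by discretizing (1.46). The sketch below (plain Python; the grid limits and step count are our choices, not the book's) compares a midpoint Riemann sum for the normal density against $1/2 + \ln\sqrt{2\pi}\sigma$:

```python
import math

def entropy(phi, lo, hi, steps=200_000):
    # - integral of phi * ln(phi), midpoint Riemann sum on [lo, hi]
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        p = phi(x)
        if p > 0.0:
            total -= p * math.log(p)
    return total * h

mu, sigma = 1.0, 2.0
normal = lambda x: math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

h_numeric = entropy(normal, mu - 12 * sigma, mu + 12 * sigma)
h_closed = 0.5 + math.log(math.sqrt(2 * math.pi) * sigma)
print(h_numeric, h_closed)  # both about 2.112
```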
1.11 Distance

Distance is a powerful concept in many disciplines of science and engineering. This section introduces the distance between random variables.

Definition 1.23 The distance between random variables $\xi$ and $\eta$ is defined as
\[
d(\xi, \eta) = E[|\xi - \eta|]. \tag{1.47}
\]

Theorem 1.34 Let $\xi, \eta, \tau$ be random variables, and let $d(\cdot, \cdot)$ be the distance. Then we have
(a) (Nonnegativity) $d(\xi, \eta) \ge 0$;
(b) (Identification) $d(\xi, \eta) = 0$ if and only if $\xi = \eta$;
(c) (Symmetry) $d(\xi, \eta) = d(\eta, \xi)$;
(d) (Triangle Inequality) $d(\xi, \eta) \le d(\xi, \tau) + d(\eta, \tau)$.

Proof: Parts (a), (b) and (c) follow immediately from the definition. Part (d) is proved by the following relation,
\[
E[|\xi - \eta|] \le E[|\xi - \tau| + |\eta - \tau|] = E[|\xi - \tau|] + E[|\eta - \tau|].
\]
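Definition 1.23 and the triangle inequality can be exercised on simulated samples (our own sketch; the three distributions are arbitrary choices). Note that the empirical triangle inequality holds exactly, since $|a-b| \le |a-c| + |b-c|$ pointwise:

```python
import random

random.seed(4)
n = 100_000

# Three random variables on a shared sample; eta is dependent on xi
xi = [random.gauss(0.0, 1.0) for _ in range(n)]
eta = [x + random.uniform(-1.0, 1.0) for x in xi]
tau = [random.expovariate(1.0) for _ in range(n)]

def dist(u, v):
    # d(u, v) = E[|u - v|], estimated as a sample mean
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)

d_xe = dist(xi, eta)
print(d_xe)  # close to E|U(-1,1)| = 0.5
print(dist(xi, eta) <= dist(xi, tau) + dist(eta, tau))  # True
```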
1.12 Inequalities

It is well-known that there are several important inequalities in probability theory, such as the Markov inequality, Chebyshev inequality, Hölder's inequality, Minkowski inequality, and Jensen's inequality. They play an important role in both theory and applications.

Theorem 1.35 Let $\xi$ be a random variable, and $f$ a nonnegative measurable function. If $f$ is even (i.e., $f(x) = f(-x)$ for any $x \in \Re$) and increasing on $[0, \infty)$, then for any given number $t > 0$, we have
\[
\Pr\{|\xi| \ge t\} \le \frac{E[f(\xi)]}{f(t)}. \tag{1.48}
\]
Proof: It is clear that $\Pr\{|\xi| \ge f^{-1}(r)\}$ is a monotone decreasing function of $r$ on $[0, \infty)$. It follows from the nonnegativity of $f(\xi)$ that
\[
\begin{aligned}
E[f(\xi)] &= \int_0^{+\infty} \Pr\{f(\xi) \ge r\}\,dr\\
&= \int_0^{+\infty} \Pr\{|\xi| \ge f^{-1}(r)\}\,dr\\
&\ge \int_0^{f(t)} \Pr\{|\xi| \ge f^{-1}(r)\}\,dr\\
&\ge \int_0^{f(t)} dr \cdot \Pr\{|\xi| \ge f^{-1}(f(t))\}\\
&= f(t) \cdot \Pr\{|\xi| \ge t\}
\end{aligned}
\]
which proves the inequality.

Theorem 1.36 (Markov Inequality) Let $\xi$ be a random variable. Then for any given numbers $t > 0$ and $p > 0$, we have
\[
\Pr\{|\xi| \ge t\} \le \frac{E[|\xi|^p]}{t^p}. \tag{1.49}
\]
Proof: It is a special case of Theorem 1.35 when $f(x) = |x|^p$.

Example 1.25: For any given positive number $t$, we define a random variable as follows,
\[
\xi =
\begin{cases}
0 & \text{with probability } 1/2\\
t & \text{with probability } 1/2.
\end{cases}
\]
Then $\Pr\{\xi \ge t\} = 1/2 = E[|\xi|^p]/t^p$.

Theorem 1.37 (Chebyshev Inequality) Let $\xi$ be a random variable whose variance $V[\xi]$ exists. Then for any given number $t > 0$, we have
\[
\Pr\{|\xi - E[\xi]| \ge t\} \le \frac{V[\xi]}{t^2}. \tag{1.50}
\]

Proof: It is a special case of Theorem 1.35 when the random variable $\xi$ is replaced with $\xi - E[\xi]$ and $f(x) = x^2$.

Example 1.26: For any given positive number $t$, we define a random variable as follows,
\[
\xi =
\begin{cases}
-t & \text{with probability } 1/2\\
t & \text{with probability } 1/2.
\end{cases}
\]
Then $\Pr\{|\xi - E[\xi]| \ge t\} = 1 = V[\xi]/t^2$.
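Both bounds can be compared against empirical frequencies. The sketch below (plain Python, our own choice of an exponential sample with $\beta = 1$) checks Markov with $p = 1$ and Chebyshev around the true mean:

```python
import random

random.seed(5)
n = 200_000
beta = 1.0
sample = [random.expovariate(1.0 / beta) for _ in range(n)]

t = 3.0
freq = sum(1 for x in sample if x >= t) / n  # Pr{xi >= 3} = e^-3, about 0.0498
markov = beta / t                            # E[|xi|]/t, Markov with p = 1

freq_dev = sum(1 for x in sample if abs(x - beta) >= t) / n
chebyshev = beta ** 2 / t ** 2               # V[xi]/t^2, the Chebyshev bound

print(freq, markov)         # about 0.0498 <= 0.333...
print(freq_dev, chebyshev)  # about 0.0183 <= 0.111...
```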
Theorem 1.38 (Hölder's Inequality) Let $p$ and $q$ be two positive real numbers with $1/p + 1/q = 1$, and let $\xi$ and $\eta$ be random variables with $E[|\xi|^p] < \infty$ and $E[|\eta|^q] < \infty$. Then we have
\[
E[|\xi\eta|] \le \sqrt[p]{E[|\xi|^p]}\, \sqrt[q]{E[|\eta|^q]}. \tag{1.51}
\]

Proof: The inequality holds trivially if at least one of $\xi$ and $\eta$ is zero a.s. Now we assume $E[|\xi|^p] > 0$ and $E[|\eta|^q] > 0$. It is easy to prove that the function $f(x, y) = \sqrt[p]{x}\sqrt[q]{y}$ is a concave function on $D = \{(x, y) : x \ge 0, y \ge 0\}$. Thus for any point $(x_0, y_0)$ with $x_0 > 0$ and $y_0 > 0$, there exist two real numbers $a$ and $b$ such that
\[
f(x, y) - f(x_0, y_0) \le a(x - x_0) + b(y - y_0), \quad \forall (x, y) \in D.
\]
Letting $x_0 = E[|\xi|^p]$, $y_0 = E[|\eta|^q]$, $x = |\xi|^p$ and $y = |\eta|^q$, we have
\[
f(|\xi|^p, |\eta|^q) - f(E[|\xi|^p], E[|\eta|^q]) \le a(|\xi|^p - E[|\xi|^p]) + b(|\eta|^q - E[|\eta|^q]).
\]
Taking the expected values on both sides, we obtain $E[f(|\xi|^p, |\eta|^q)] \le f(E[|\xi|^p], E[|\eta|^q])$. Hence the inequality (1.51) holds.

Theorem 1.39 (Minkowski Inequality) Let $p$ be a real number with $p \ge 1$, and let $\xi$ and $\eta$ be random variables with $E[|\xi|^p] < \infty$ and $E[|\eta|^p] < \infty$. Then we have
\[
\sqrt[p]{E[|\xi + \eta|^p]} \le \sqrt[p]{E[|\xi|^p]} + \sqrt[p]{E[|\eta|^p]}. \tag{1.52}
\]

Proof: The inequality holds trivially if at least one of $\xi$ and $\eta$ is zero a.s. Now we assume $E[|\xi|^p] > 0$ and $E[|\eta|^p] > 0$. It is easy to prove that the function $f(x, y) = (\sqrt[p]{x} + \sqrt[p]{y})^p$ is a concave function on $D = \{(x, y) : x \ge 0, y \ge 0\}$. Thus for any point $(x_0, y_0)$ with $x_0 > 0$ and $y_0 > 0$, there exist two real numbers $a$ and $b$ such that
\[
f(x, y) - f(x_0, y_0) \le a(x - x_0) + b(y - y_0), \quad \forall (x, y) \in D.
\]
Letting $x_0 = E[|\xi|^p]$, $y_0 = E[|\eta|^p]$, $x = |\xi|^p$ and $y = |\eta|^p$, we have
\[
f(|\xi|^p, |\eta|^p) - f(E[|\xi|^p], E[|\eta|^p]) \le a(|\xi|^p - E[|\xi|^p]) + b(|\eta|^p - E[|\eta|^p]).
\]
Taking the expected values on both sides, we obtain $E[f(|\xi|^p, |\eta|^p)] \le f(E[|\xi|^p], E[|\eta|^p])$. Hence the inequality (1.52) holds.
Theorem 1.40 (Jensen's Inequality) Let $\xi$ be a random variable, and $f: \Re \to \Re$ a convex function. If $E[\xi]$ and $E[f(\xi)]$ are finite, then
\[
f(E[\xi]) \le E[f(\xi)]. \tag{1.53}
\]
Especially, when $f(x) = |x|^p$ and $p \ge 1$, we have $|E[\xi]|^p \le E[|\xi|^p]$.

Proof: Since $f$ is a convex function, for each $y$, there exists a number $k$ such that $f(x) - f(y) \ge k \cdot (x - y)$. Replacing $x$ with $\xi$ and $y$ with $E[\xi]$, we obtain
\[
f(\xi) - f(E[\xi]) \ge k \cdot (\xi - E[\xi]).
\]
Taking the expected values on both sides, we have
\[
E[f(\xi)] - f(E[\xi]) \ge k \cdot (E[\xi] - E[\xi]) = 0
\]
which proves the inequality.
1.13 Convergence Concepts

There are four main types of convergence concepts for random sequences: convergence almost surely (a.s.), convergence in probability, convergence in mean, and convergence in distribution.

Table 1.1: Relations among Convergence Concepts

    Convergence Almost Surely ↘
                                Convergence in Probability → Convergence in Distribution
    Convergence in Mean       ↗
Definition 1.24 Suppose that $\xi, \xi_1, \xi_2, \cdots$ are random variables defined on the probability space $(\Omega, \mathcal{A}, \Pr)$. The sequence $\{\xi_i\}$ is said to be convergent a.s. to $\xi$ if and only if there exists a set $A \in \mathcal{A}$ with $\Pr\{A\} = 1$ such that
\[
\lim_{i \to \infty} |\xi_i(\omega) - \xi(\omega)| = 0 \tag{1.54}
\]
for every $\omega \in A$. In that case we write $\xi_i \to \xi$, a.s.

Definition 1.25 Suppose that $\xi, \xi_1, \xi_2, \cdots$ are random variables defined on the probability space $(\Omega, \mathcal{A}, \Pr)$. We say that the sequence $\{\xi_i\}$ converges in probability to $\xi$ if
\[
\lim_{i \to \infty} \Pr\{|\xi_i - \xi| \ge \varepsilon\} = 0 \tag{1.55}
\]
for every $\varepsilon > 0$.
Definition 1.26 Suppose that $\xi, \xi_1, \xi_2, \cdots$ are random variables with finite expected values on the probability space $(\Omega, \mathcal{A}, \Pr)$. We say that the sequence $\{\xi_i\}$ converges in mean to $\xi$ if
\[
\lim_{i \to \infty} E[|\xi_i - \xi|] = 0. \tag{1.56}
\]
In addition, the sequence $\{\xi_i\}$ is said to converge in mean square to $\xi$ if
\[
\lim_{i \to \infty} E[|\xi_i - \xi|^2] = 0. \tag{1.57}
\]

Definition 1.27 Suppose that $\Phi, \Phi_1, \Phi_2, \cdots$ are the probability distributions of random variables $\xi, \xi_1, \xi_2, \cdots$, respectively. We say that $\{\xi_i\}$ converges in distribution to $\xi$ if $\Phi_i \to \Phi$ at any continuity point of $\Phi$.

Convergence Almost Surely vs. Convergence in Probability

Theorem 1.41 Suppose that $\xi, \xi_1, \xi_2, \cdots$ are random variables defined on the probability space $(\Omega, \mathcal{A}, \Pr)$. Then $\{\xi_i\}$ converges a.s. to $\xi$ if and only if, for every $\varepsilon > 0$, we have
\[
\lim_{n \to \infty} \Pr\left\{ \bigcup_{i=n}^{\infty} \{|\xi_i - \xi| \ge \varepsilon\} \right\} = 0. \tag{1.58}
\]
Proof: For every $i \ge 1$ and $\varepsilon > 0$, we define
\[
X = \left\{ \omega \in \Omega \,\Big|\, \lim_{i \to \infty} \xi_i(\omega) \ne \xi(\omega) \right\}, \qquad X_i(\varepsilon) = \left\{ \omega \in \Omega \,\big|\, |\xi_i(\omega) - \xi(\omega)| \ge \varepsilon \right\}.
\]
It is clear that
\[
X = \bigcup_{\varepsilon > 0} \left( \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} X_i(\varepsilon) \right).
\]
Note that $\xi_i \to \xi$, a.s. if and only if $\Pr\{X\} = 0$. That is, $\xi_i \to \xi$, a.s. if and only if
\[
\Pr\left\{ \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} X_i(\varepsilon) \right\} = 0
\]
for every $\varepsilon > 0$. Since
\[
\bigcup_{i=n}^{\infty} X_i(\varepsilon) \downarrow \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} X_i(\varepsilon),
\]
it follows from the probability continuity theorem that
\[
\lim_{n \to \infty} \Pr\left\{ \bigcup_{i=n}^{\infty} X_i(\varepsilon) \right\} = \Pr\left\{ \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} X_i(\varepsilon) \right\} = 0.
\]
The theorem is proved.
Theorem 1.42 Suppose that $\xi, \xi_1, \xi_2, \cdots$ are random variables defined on the probability space $(\Omega, \mathcal{A}, \Pr)$. If $\{\xi_i\}$ converges a.s. to $\xi$, then $\{\xi_i\}$ converges in probability to $\xi$.

Proof: It follows from convergence a.s. and Theorem 1.41 that
\[
\lim_{n \to \infty} \Pr\left\{ \bigcup_{i=n}^{\infty} \{|\xi_i - \xi| \ge \varepsilon\} \right\} = 0
\]
for each $\varepsilon > 0$. For every $n \ge 1$, since
\[
\{|\xi_n - \xi| \ge \varepsilon\} \subset \bigcup_{i=n}^{\infty} \{|\xi_i - \xi| \ge \varepsilon\},
\]
we have $\Pr\{|\xi_n - \xi| \ge \varepsilon\} \to 0$ as $n \to \infty$. Hence the theorem holds.

Example 1.27: Convergence in probability does not imply convergence a.s. For example, take $(\Omega, \mathcal{A}, \Pr)$ to be the interval $[0, 1]$ with Borel algebra and Lebesgue measure. For any positive integer $i$, there is an integer $j$ such that $i = 2^j + k$, where $k$ is an integer between 0 and $2^j - 1$. We define a random variable on $\Omega$ by
\[
\xi_i(\omega) =
\begin{cases}
1, & \text{if } k/2^j \le \omega \le (k+1)/2^j\\
0, & \text{otherwise}
\end{cases}
\]
for $i = 1, 2, \cdots$, and $\xi = 0$. For any small number $\varepsilon > 0$, we have
\[
\Pr\{|\xi_i - \xi| \ge \varepsilon\} = \frac{1}{2^j} \to 0
\]
as $i \to \infty$. That is, the sequence $\{\xi_i\}$ converges in probability to $\xi$. However, for any $\omega \in [0, 1]$, there is an infinite number of intervals of the form $[k/2^j, (k+1)/2^j]$ containing $\omega$. Thus $\xi_i(\omega) \not\to 0$ as $i \to \infty$. In other words, the sequence $\{\xi_i\}$ does not converge a.s. to $\xi$.

Convergence in Probability vs. Convergence in Mean

Theorem 1.43 Suppose that $\xi, \xi_1, \xi_2, \cdots$ are random variables defined on the probability space $(\Omega, \mathcal{A}, \Pr)$. If the sequence $\{\xi_i\}$ converges in mean to $\xi$, then $\{\xi_i\}$ converges in probability to $\xi$.

Proof: It follows from the Markov inequality that, for any given number $\varepsilon > 0$,
\[
\Pr\{|\xi_i - \xi| \ge \varepsilon\} \le \frac{E[|\xi_i - \xi|]}{\varepsilon} \to 0
\]
as $i \to \infty$. Thus $\{\xi_i\}$ converges in probability to $\xi$.
Example 1.28: Convergence in probability does not imply convergence in mean. For example, take (Ω, A, Pr) to be {ω1, ω2, · · · } with Pr{ωj} = 1/2^j for j = 1, 2, · · · The random variables are defined by

ξi(ωj) = 2^i if j = i; 0 otherwise

for i = 1, 2, · · · and ξ = 0. For any small number ε > 0, we have

Pr{|ξi − ξ| ≥ ε} = 1/2^i → 0.

That is, the sequence {ξi} converges in probability to ξ. However, we have

E[|ξi − ξ|] = 2^i · (1/2^i) = 1.
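The two computations in Example 1.28 can be made concrete with exact arithmetic: the probability of the single atom carrying the spike vanishes, while the expected value stays at 1. A sketch, using exact fractions:

```python
# Example 1.28 made concrete: on Omega = {omega_1, omega_2, ...} with
# Pr{omega_j} = 1/2^j, the variable xi_i equals 2^i on omega_i only.

from fractions import Fraction

def pr_nonzero(i):
    # Pr{|xi_i| >= eps} = Pr{omega_i} = 1/2^i
    return Fraction(1, 2**i)

def expected_abs(i):
    # E[|xi_i|] = 2^i * Pr{omega_i}
    return 2**i * pr_nonzero(i)

print([float(pr_nonzero(i)) for i in (1, 5, 10)])  # tends to 0
print([expected_abs(i) for i in (1, 5, 10)])       # always 1
```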
That is, the sequence {ξi} does not converge in mean to ξ.

Convergence Almost Surely vs. Convergence in Mean

Example 1.29: Convergence a.s. does not imply convergence in mean. For example, take (Ω, A, Pr) to be {ω1, ω2, · · · } with Pr{ωj} = 1/2^j for j = 1, 2, · · · The random variables are defined by

ξi(ωj) = 2^i if j = i; 0 otherwise

for i = 1, 2, · · · and ξ = 0. Then {ξi} converges a.s. to ξ. However, the sequence {ξi} does not converge in mean to ξ.

Example 1.30: Convergence in mean does not imply convergence a.s. For example, take (Ω, A, Pr) to be the interval [0, 1] with Borel algebra and Lebesgue measure. For any positive integer i, there is an integer j such that i = 2^j + k, where k is an integer between 0 and 2^j − 1. We define a random variable on Ω by

ξi(ω) = 1 if k/2^j ≤ ω ≤ (k + 1)/2^j; 0 otherwise

for i = 1, 2, · · · and ξ = 0. Then

E[|ξi − ξ|] = 1/2^j → 0.

That is, the sequence {ξi} converges in mean to ξ. However, {ξi} does not converge a.s. to ξ.
Convergence in Probability vs. Convergence in Distribution

Theorem 1.44 Suppose that ξ, ξ1, ξ2, · · · are random variables defined on the probability space (Ω, A, Pr). If the sequence {ξi} converges in probability to ξ, then {ξi} converges in distribution to ξ.

Proof: Let x be any given continuity point of the distribution Φ. On the one hand, for any y > x, we have

{ξi ≤ x} = {ξi ≤ x, ξ ≤ y} ∪ {ξi ≤ x, ξ > y} ⊂ {ξ ≤ y} ∪ {|ξi − ξ| ≥ y − x}

which implies that Φi(x) ≤ Φ(y) + Pr{|ξi − ξ| ≥ y − x}. Since {ξi} converges in probability to ξ, we have Pr{|ξi − ξ| ≥ y − x} → 0. Thus we obtain lim sup_{i→∞} Φi(x) ≤ Φ(y) for any y > x. Letting y → x, we get

lim sup_{i→∞} Φi(x) ≤ Φ(x).  (1.59)

On the other hand, for any z < x, we have

{ξ ≤ z} = {ξ ≤ z, ξi ≤ x} ∪ {ξ ≤ z, ξi > x} ⊂ {ξi ≤ x} ∪ {|ξi − ξ| ≥ x − z}

which implies that Φ(z) ≤ Φi(x) + Pr{|ξi − ξ| ≥ x − z}. Since Pr{|ξi − ξ| ≥ x − z} → 0, we obtain Φ(z) ≤ lim inf_{i→∞} Φi(x) for any z < x. Letting z → x, we get

Φ(x) ≤ lim inf_{i→∞} Φi(x).  (1.60)

It follows from (1.59) and (1.60) that Φi(x) → Φ(x). The theorem is proved.

Example 1.31: Convergence in distribution does not imply convergence in probability. For example, take (Ω, A, Pr) to be {ω1, ω2} with Pr{ω1} = Pr{ω2} = 0.5, and

ξ(ω) = −1 if ω = ω1; 1 if ω = ω2.

We also define ξi = −ξ for all i. Then ξi and ξ are identically distributed. Thus {ξi} converges in distribution to ξ. But, for any small number ε > 0, we have Pr{|ξi − ξ| > ε} = Pr{Ω} = 1. That is, the sequence {ξi} does not converge in probability to ξ.
1.14 Conditional Probability
We consider the probability of an event A after it has been learned that some other event B has occurred. This new probability of A is called the conditional probability of A given B.

Definition 1.28 Let (Ω, A, Pr) be a probability space, and A, B ∈ A. Then the conditional probability of A given B is defined by

Pr{A|B} = Pr{A ∩ B} / Pr{B}  (1.61)

provided that Pr{B} > 0.

Theorem 1.45 Let (Ω, A, Pr) be a probability space, and B an event with Pr{B} > 0. Then Pr{·|B} defined by (1.61) is a probability measure, and (Ω, A, Pr{·|B}) is a probability space.

Proof: It is sufficient to prove that Pr{·|B} satisfies the normality, nonnegativity and countable additivity axioms. At first, we have

Pr{Ω|B} = Pr{Ω ∩ B} / Pr{B} = Pr{B} / Pr{B} = 1.
Secondly, for any A ∈ A, the set function Pr{A|B} is nonnegative. Finally, for any countable sequence {Ai} of mutually disjoint events, we have

Pr{ ⋃_{i=1}^∞ Ai | B } = Pr{ ⋃_{i=1}^∞ Ai ∩ B } / Pr{B} = Σ_{i=1}^∞ Pr{Ai ∩ B} / Pr{B} = Σ_{i=1}^∞ Pr{Ai|B}.

Thus Pr{·|B} is a probability measure. Furthermore, (Ω, A, Pr{·|B}) is a probability space.

Theorem 1.46 (Bayes Formula) Let the events A1, A2, · · · , An form a partition of the space Ω such that Pr{Ai} > 0 for i = 1, 2, · · · , n, and let B be an event with Pr{B} > 0. Then we have

Pr{Ak|B} = Pr{Ak} Pr{B|Ak} / Σ_{i=1}^n Pr{Ai} Pr{B|Ai}  (1.62)

for k = 1, 2, · · · , n.

Proof: Since A1, A2, · · · , An form a partition of the space Ω, we have

Pr{B} = Σ_{i=1}^n Pr{Ai ∩ B} = Σ_{i=1}^n Pr{Ai} Pr{B|Ai}

which is also called the formula for total probability. Thus, for any k, we have

Pr{Ak|B} = Pr{Ak ∩ B} / Pr{B} = Pr{Ak} Pr{B|Ak} / Σ_{i=1}^n Pr{Ai} Pr{B|Ai}.
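As a quick numerical illustration of formula (1.62), the following sketch computes posteriors from priors and likelihoods; the three hypotheses and their numbers are invented for the example, not taken from the book.

```python
# A minimal sketch of the Bayes formula (1.62).

def bayes(priors, likelihoods, k):
    """Posterior Pr{A_k | B} from priors Pr{A_i} and likelihoods Pr{B | A_i}."""
    # Denominator is the formula for total probability.
    total = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[k] * likelihoods[k] / total

priors = [0.5, 0.3, 0.2]        # Pr{A_1}, Pr{A_2}, Pr{A_3}: a partition of Omega
likelihoods = [0.9, 0.5, 0.1]   # Pr{B | A_i}

posteriors = [bayes(priors, likelihoods, k) for k in range(3)]
print(posteriors)               # the posteriors sum to 1
```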
The theorem is proved.

Remark 1.2: Especially, let A and B be two events with Pr{A} > 0 and Pr{B} > 0. Then A and A^c form a partition of the space Ω, and the Bayes formula is

Pr{A|B} = Pr{A} Pr{B|A} / Pr{B}.  (1.63)

Remark 1.3: In statistical applications, the events A1, A2, · · · , An are often called hypotheses. Furthermore, for each i, the Pr{Ai} is called the prior probability of Ai, and Pr{Ai|B} is called the posterior probability of Ai after the occurrence of event B.

Example 1.32: Let ξ be an exponentially distributed random variable with expected value β. Then for any real numbers a > 0 and x > 0, the conditional probability of ξ ≥ a + x given ξ ≥ a is

Pr{ξ ≥ a + x | ξ ≥ a} = exp(−x/β) = Pr{ξ ≥ x}

which means that the conditional probability is identical to the original probability. This is the so-called memoryless property of the exponential distribution. In other words, it is as good as new if it is functioning on inspection.

Definition 1.29 The conditional probability distribution Φ: ℜ → [0, 1] of a random variable ξ given B is defined by

Φ(x|B) = Pr{ξ ≤ x|B}  (1.64)

provided that Pr{B} > 0.

Example 1.33: Let ξ and η be random variables. Then the conditional probability distribution of ξ given η = y is

Φ(x|η = y) = Pr{ξ ≤ x|η = y} = Pr{ξ ≤ x, η = y} / Pr{η = y}

provided that Pr{η = y} > 0.

Definition 1.30 The conditional probability density function φ of a random variable ξ given B is a nonnegative function such that

Φ(x|B) = ∫_{−∞}^x φ(y|B)dy, ∀x ∈ ℜ  (1.65)

where Φ(x|B) is the conditional probability distribution of ξ given B.
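The memoryless property of Example 1.32 can be verified numerically from the survival function Pr{ξ ≥ t} = exp(−t/β); the parameter values below are arbitrary illustrations.

```python
import math

# Numerical check of Example 1.32: for xi ~ Exp with mean beta,
# Pr{xi >= a + x | xi >= a} equals Pr{xi >= x}.

def survival(t, beta):
    return math.exp(-t / beta)

beta, a, x = 2.0, 1.5, 0.7
conditional = survival(a + x, beta) / survival(a, beta)  # Pr{xi >= a+x | xi >= a}
unconditional = survival(x, beta)                        # Pr{xi >= x}
print(conditional, unconditional)  # the two probabilities coincide
```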
Example 1.34: Let (ξ, η) be a random vector with joint probability density function ψ. Then the marginal probability density functions of ξ and η are

f(x) = ∫_{−∞}^{+∞} ψ(x, y)dy,  g(y) = ∫_{−∞}^{+∞} ψ(x, y)dx,

respectively. Furthermore, we have

Pr{ξ ≤ x, η ≤ y} = ∫_{−∞}^x ∫_{−∞}^y ψ(r, t)dt dr = ∫_{−∞}^y [ ∫_{−∞}^x (ψ(r, t)/g(t)) dr ] g(t)dt

which implies that the conditional probability distribution of ξ given η = y is

Φ(x|η = y) = ∫_{−∞}^x (ψ(r, y)/g(y)) dr, a.s.  (1.66)

and the conditional probability density function of ξ given η = y is

φ(x|η = y) = ψ(x, y)/g(y) = ψ(x, y) / ∫_{−∞}^{+∞} ψ(x, y)dx, a.s.  (1.67)
Note that (1.66) and (1.67) are defined only for g(y) ≠ 0. In fact, the set {y | g(y) = 0} has probability 0. Especially, if ξ and η are independent random variables, then ψ(x, y) = f(x)g(y) and φ(x|η = y) = f(x).

Definition 1.31 Let ξ be a random variable. Then the conditional expected value of ξ given B is defined by

E[ξ|B] = ∫_0^{+∞} Pr{ξ ≥ r|B}dr − ∫_{−∞}^0 Pr{ξ ≤ r|B}dr  (1.68)

provided that at least one of the two integrals is finite. Following conditional probability and conditional expected value, we also have conditional variance, conditional moments, conditional critical values, conditional entropy as well as conditional convergence.

Definition 1.32 Let ξ be a nonnegative random variable representing lifetime. Then the hazard rate (or failure rate) is

h(x) = lim_{∆↓0} Pr{ξ ≤ x + ∆ | ξ > x} / ∆.  (1.69)

The hazard rate tells us the probability of a failure just after time x when it is functioning at time x. If ξ has probability distribution Φ and probability density function φ, then the hazard rate

h(x) = φ(x) / (1 − Φ(x)).

Example 1.35: Let ξ be an exponentially distributed random variable with expected value β. Then its hazard rate h(x) ≡ 1/β.
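Example 1.35 can be confirmed directly from the formula h(x) = φ(x)/(1 − Φ(x)): for the exponential distribution the ratio is constant. A sketch with an arbitrary β:

```python
import math

# Check of Example 1.35: for the exponential distribution with mean beta,
# h(x) = phi(x) / (1 - Phi(x)) is the constant 1/beta.

def pdf(x, beta):
    return math.exp(-x / beta) / beta

def cdf(x, beta):
    return 1.0 - math.exp(-x / beta)

def hazard(x, beta):
    return pdf(x, beta) / (1.0 - cdf(x, beta))

beta = 2.0
rates = [hazard(x, beta) for x in (0.1, 1.0, 5.0)]
print(rates)  # each equals 1/beta = 0.5
```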
1.15 Stochastic Process
Definition 1.33 Let T be an index set and (Ω, A, Pr) be a probability space. A stochastic process is a measurable function from T × (Ω, A, Pr) to the set of real numbers, i.e., for each t ∈ T and any Borel set B of real numbers, the set

{ω ∈ Ω | X(t, ω) ∈ B}  (1.70)

is an event. That is, a stochastic process Xt(ω) is a function of two variables such that the function X_{t*}(ω) is a random variable for each t*. For each fixed ω*, the function Xt(ω*) is said to be a sample path of the stochastic process. A stochastic process Xt(ω) is said to be sample-continuous if the sample path is continuous for almost all ω.

Definition 1.34 A stochastic process Xt is said to have independent increments if

X_{t1} − X_{t0}, X_{t2} − X_{t1}, · · · , X_{tk} − X_{tk−1}  (1.71)

are independent random variables for any times t0 < t1 < · · · < tk. A stochastic process Xt is said to have stationary increments if, for any given t > 0, the increments X_{s+t} − X_s are identically distributed random variables for all s > 0.

Renewal Process

Definition 1.35 Let ξ1, ξ2, · · · be iid positive random variables. Define S0 = 0 and Sn = ξ1 + ξ2 + · · · + ξn for n ≥ 1. Then the stochastic process

Nt = max_{n≥0} {n | Sn ≤ t}  (1.72)

is called a renewal process. If ξ1, ξ2, · · · denote the interarrival times of successive events, then Sn can be regarded as the waiting time until the occurrence of the nth event, and Nt is the number of renewals in (0, t]. Each sample path of Nt is a right-continuous and increasing step function taking only nonnegative integer values. Furthermore, the size of each jump of Nt is always 1. In other words, Nt has at most one renewal at each time. In particular, Nt does not jump at time 0. Since Nt ≥ n if and only if Sn ≤ t, we have

Pr{Nt ≥ n} = Pr{Sn ≤ t}.  (1.73)

Theorem 1.47 Let Nt be a renewal process. Then we have

E[Nt] = Σ_{n=1}^∞ Pr{Sn ≤ t}.  (1.74)
Figure 1.2: A Sample Path of Renewal Process

Proof: Since Nt takes only nonnegative integer values, we have

E[Nt] = ∫_0^∞ Pr{Nt ≥ r}dr = Σ_{n=1}^∞ ∫_{n−1}^n Pr{Nt ≥ r}dr = Σ_{n=1}^∞ Pr{Nt ≥ n} = Σ_{n=1}^∞ Pr{Sn ≤ t}.
The theorem is proved.

Example 1.36: A renewal process Nt is called a Poisson process with intensity λ if ξ1, ξ2, · · · are iid exponentially distributed random variables with expected value 1/λ. It has been proved that

Pr{Nt = n} = exp(−λt) (λt)^n / n!,  n = 0, 1, 2, · · ·  (1.75)

E[Nt] = λt.  (1.76)
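Property (1.76) can be illustrated by simulation: build the renewal process from exponential interarrival times and average the renewal counts. A sketch with arbitrary λ and t; the seed and tolerance are chosen only for reproducibility.

```python
import math
import random

# Simulation sketch of Example 1.36: a renewal process with Exp(1/lambda)
# interarrival times is a Poisson process, and E[N_t] = lambda * t.

random.seed(0)

def renewals(lam, t):
    """Count renewals in (0, t] by summing exponential interarrival times."""
    count, s = 0, random.expovariate(lam)
    while s <= t:
        count += 1
        s += random.expovariate(lam)
    return count

lam, t, runs = 3.0, 2.0, 20000
mean_count = sum(renewals(lam, t) for _ in range(runs)) / runs
print(mean_count)  # close to lambda * t = 6
```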
Theorem 1.48 (Renewal Theorem) Let Nt be a renewal process with interarrival times ξ1, ξ2, · · · Then we have

lim_{t→∞} E[Nt]/t = 1/E[ξ1].  (1.77)
Brownian Motion In 1828 the botanist Brown observed irregular movement of pollen suspended in liquid. This movement is now known as Brownian motion. Bachelier used Brownian motion as a model of stock prices in 1900. An equation for Brownian motion was obtained by Einstein in 1905, and a rigorous mathematical definition of Brownian motion was given by Wiener in 1931. For this reason, Brownian motion is also called Wiener process.
Definition 1.36 A stochastic process Bt is said to be a Brownian motion (also called Wiener process) if
(i) B0 = 0 and Bt is sample-continuous,
(ii) Bt has stationary and independent increments,
(iii) every increment B_{s+t} − B_s is a normally distributed random variable with expected value et and variance σ²t.

The parameters e and σ are called the drift and diffusion coefficients, respectively. The Brownian motion is said to be standard if e = 0 and σ = 1. Any Brownian motion may be represented by et + σBt where Bt is a standard Brownian motion.

Figure 1.3: A Sample Path of Standard Brownian Motion
Theorem 1.49 (Existence Theorem) There is a Brownian motion.

Proof: Without loss of generality, we only prove that there is a standard Brownian motion Bt on the range of t ∈ [0, 1]. First, let {ξ(r) | r represents rational numbers in [0, 1]} be a countable sequence of independently and normally distributed random variables with expected value zero and variance one. For each integer n, we define a stochastic process

Xn(t) = (1/√n) Σ_{i=1}^k ξ(i/n) if t = k/n (k = 0, 1, · · · , n); linear otherwise.

Since the limit

lim_{n→∞} Xn(t)

exists almost surely, we may verify that the limit meets the conditions of standard Brownian motion. Hence there is a standard Brownian motion.
Remark 1.4: Suppose that Bt is a standard Brownian motion. It has been proved that

X1(t) = −Bt,  (1.78)
X2(t) = aB_{t/a²},  (1.79)
X3(t) = B_{t+s} − B_s  (1.80)

are each a version of standard Brownian motion. Almost all Brownian paths have an infinite variation and are differentiable nowhere. Furthermore, the squared variation of Brownian motion on [0, t] is equal to t both in mean square and almost surely. On the one hand, almost all Brownian paths are infinitely long within any time interval. On the other hand, it is impossible that a pollen moves at a speed of infinity. Is it a dilemma?

Example 1.37: Let Bt be a Brownian motion with drift 0. Then for any level x > 0 and any time t > 0, we have

Pr{ max_{0≤s≤t} Bs ≥ x } = 2 Pr{Bt ≥ x}  (1.81)

which is the so-called reflection principle. For any level x < 0 and any time t > 0, we have

Pr{ min_{0≤s≤t} Bs ≤ x } = 2 Pr{Bt ≤ x}.  (1.82)
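The reflection argument behind (1.81) also holds exactly, in a discrete form, for a symmetric simple random walk: Pr{max_{k≤n} S_k ≥ x} = 2 Pr{S_n > x} + Pr{S_n = x} for integer x > 0. The sketch below verifies this by brute-force enumeration of every path; it is a finite analogue, not the continuous-time statement itself.

```python
from itertools import product

# Exhaustive check of the discrete reflection principle for a symmetric
# simple random walk S_n with steps +-1 and integer level x > 0:
#   #{paths with max >= x} = 2 * #{S_n > x} + #{S_n = x}.

n, x = 10, 3
hit = end_above = end_at = 0
for steps in product((-1, 1), repeat=n):  # all 2^n equally likely paths
    s, m = 0, 0
    for step in steps:
        s += step
        m = max(m, s)
    if m >= x:
        hit += 1
    if s > x:
        end_above += 1
    elif s == x:
        end_at += 1

print(hit, 2 * end_above + end_at)  # the two counts agree exactly
```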
Example 1.38: Let Bt be a Brownian motion with drift e > 0 and diffusion coefficient σ. Then the first passage time τ that the Brownian motion reaches the barrier x > 0 has the probability density function

φ(t) = (x / (σ√(2πt³))) exp(−(x − et)² / (2σ²t)),  t > 0  (1.83)

whose expected value and variance are

E[τ] = x/e,  V[τ] = xσ²/e³.  (1.84)

However, if the drift e = 0, then the expected value E[τ] is infinite.

Definition 1.37 Let Bt be a standard Brownian motion. Then et + σBt is a Brownian motion, and the stochastic process

Gt = exp(et + σBt)  (1.85)

is called a geometric Brownian motion.
Geometric Brownian motion Gt is an important model for stock prices. For each t > 0, the Gt has a lognormal distribution whose probability density function is

φ(z) = (1 / (σz√(2πt))) exp(−(ln z − et)² / (2σ²t)),  z > 0  (1.86)

and has expected value and variance as follows,

E[Gt] = exp(et + σ²t/2),  V[Gt] = exp(2et + 2σ²t) − exp(2et + σ²t).

In addition, the first passage time that a geometric Brownian motion Gt reaches the barrier x > 1 is just the time that the Brownian motion with drift e and diffusion σ reaches ln x.
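The formula E[Gt] = exp(et + σ²t/2) can be checked by Monte Carlo, since et + σBt is N(et, σ²t). A sketch with invented parameter values; the seed fixes the run for reproducibility.

```python
import math
import random

# Monte Carlo sketch of E[G_t] = exp(e*t + sigma^2*t/2) for the geometric
# Brownian motion G_t = exp(e*t + sigma*B_t).

random.seed(1)
e, sigma, t, n = 0.1, 0.2, 1.0, 200000

# B_t ~ N(0, t), so e*t + sigma*B_t ~ N(e*t, sigma^2 * t)
samples = (math.exp(e * t + sigma * random.gauss(0.0, math.sqrt(t))) for _ in range(n))
estimate = sum(samples) / n

exact = math.exp(e * t + sigma**2 * t / 2)
print(estimate, exact)  # the estimate is close to the exact mean
```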
1.16 Stochastic Calculus

Let Bt be a standard Brownian motion, and dt an infinitesimal time interval. Then dBt = B_{t+dt} − Bt is a stochastic process such that, for each t, the dBt is a normally distributed random variable with

E[dBt] = 0,  V[dBt] = dt,  E[dBt²] = dt,  V[dBt²] = 2dt².
Definition 1.38 Let Xt be a stochastic process and let Bt be a standard Brownian motion. For any partition of closed interval [a, b] with a = t1 < t2 < · · · < t_{k+1} = b, the mesh is written as

∆ = max_{1≤i≤k} |t_{i+1} − t_i|.

Then the Ito integral of Xt with respect to Bt is

∫_a^b Xt dBt = lim_{∆→0} Σ_{i=1}^k X_{ti} · (B_{ti+1} − B_{ti})  (1.87)

provided that the limit exists in mean square and is a random variable.

Example 1.39: Let Bt be a standard Brownian motion. Then for any partition 0 = t1 < t2 < · · · < t_{k+1} = s, we have

∫_0^s dBt = lim_{∆→0} Σ_{i=1}^k (B_{ti+1} − B_{ti}) = Bs − B0 = Bs.
Example 1.40: Let Bt be a standard Brownian motion. Then for any partition 0 = t1 < t2 < · · · < t_{k+1} = s, we have

sBs = Σ_{i=1}^k (t_{i+1}B_{ti+1} − t_i B_{ti}) = Σ_{i=1}^k t_i(B_{ti+1} − B_{ti}) + Σ_{i=1}^k B_{ti+1}(t_{i+1} − t_i) → ∫_0^s t dBt + ∫_0^s Bt dt

as ∆ → 0. It follows that

∫_0^s t dBt = sBs − ∫_0^s Bt dt.
Example 1.41: Let Bt be a standard Brownian motion. Then for any partition 0 = t1 < t2 < · · · < t_{k+1} = s, we have

Bs² = Σ_{i=1}^k (B²_{ti+1} − B²_{ti}) = Σ_{i=1}^k (B_{ti+1} − B_{ti})² + 2 Σ_{i=1}^k B_{ti}(B_{ti+1} − B_{ti}) → s + 2 ∫_0^s Bt dBt

as ∆ → 0. That is,

∫_0^s Bt dBt = (1/2)Bs² − (1/2)s.
Theorem 1.50 (Ito Formula) Let Bt be a standard Brownian motion, and let h(t, b) be a twice continuously differentiable function. Define Xt = h(t, Bt). Then we have the following chain rule

dXt = (∂h/∂t)(t, Bt)dt + (∂h/∂b)(t, Bt)dBt + (1/2)(∂²h/∂b²)(t, Bt)dt.  (1.88)

Proof: Since the function h is twice continuously differentiable, by using Taylor series expansion, the infinitesimal increment of Xt has a second-order approximation

∆Xt = (∂h/∂t)(t, Bt)∆t + (∂h/∂b)(t, Bt)∆Bt + (1/2)(∂²h/∂b²)(t, Bt)(∆Bt)² + (1/2)(∂²h/∂t²)(t, Bt)(∆t)² + (∂²h/∂t∂b)(t, Bt)∆t∆Bt.
Since we can ignore the terms (∆t)² and ∆t∆Bt and replace (∆Bt)² with ∆t, the Ito formula is obtained because it makes

Xs = X0 + ∫_0^s (∂h/∂t)(t, Bt)dt + ∫_0^s (∂h/∂b)(t, Bt)dBt + (1/2) ∫_0^s (∂²h/∂b²)(t, Bt)dt

for any s ≥ 0.

Remark 1.5: The infinitesimal increment dBt in (1.88) may be replaced with the Ito process

dYt = ut dt + vt dBt  (1.89)

where ut is an absolutely integrable stochastic process, and vt is a square integrable stochastic process, thus producing

dh(t, Yt) = (∂h/∂t)(t, Yt)dt + (∂h/∂b)(t, Yt)dYt + (1/2)(∂²h/∂b²)(t, Yt)vt² dt.  (1.90)

Remark 1.6: Assume that B_{1t}, B_{2t}, · · · , B_{mt} are standard Brownian motions, and h(t, b1, b2, · · · , bm) is a twice continuously differentiable function. Define Xt = h(t, B_{1t}, B_{2t}, · · · , B_{mt}). Then we have the following multi-dimensional Ito formula

dXt = (∂h/∂t)dt + Σ_{i=1}^m (∂h/∂b_i)dB_{it} + (1/2) Σ_{i=1}^m (∂²h/∂b_i²)dt.  (1.91)
Z sBs =
Z
s
d(tBt ) = 0
That is, Z
Z
s
Bt dt + 0
s
tdBt . 0
s
Z tdBt = sBs −
Bt dt.
0
0
Example 1.43: Let Bt be a standard Brownian motion. By using Ito formula

d(Bt²) = 2Bt dBt + dt,

we obtain

Bs² = ∫_0^s d(Bt²) = 2 ∫_0^s Bt dBt + ∫_0^s dt = 2 ∫_0^s Bt dBt + s.

It follows that

∫_0^s Bt dBt = (1/2)Bs² − (1/2)s.
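The identity ∫_0^s Bt dBt = ½Bs² − ½s can be illustrated on a simulated path: the left-point Riemann sum of definition (1.87) lands close to ½Bs² − ½s because the sum of squared increments concentrates near s. A sketch with a fixed seed:

```python
import math
import random

# Numerical sketch of Example 1.43: along a simulated Brownian path, the
# Ito (left-point) Riemann sum of B dB is close to B_s^2/2 - s/2.

random.seed(2)
s, n = 1.0, 100000
dt = s / n

b = 0.0
ito_sum = 0.0
for _ in range(n):
    db = random.gauss(0.0, math.sqrt(dt))  # Brownian increment
    ito_sum += b * db                      # left endpoint, as in (1.87)
    b += db

print(ito_sum, 0.5 * b**2 - 0.5 * s)  # the two values nearly agree
```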
Example 1.44: Let Bt be a standard Brownian motion. By using Ito formula

d(Bt³) = 3Bt² dBt + 3Bt dt,

we have

Bs³ = ∫_0^s d(Bt³) = 3 ∫_0^s Bt² dBt + 3 ∫_0^s Bt dt.

That is,

∫_0^s Bt² dBt = (1/3)Bs³ − ∫_0^s Bt dt.
Theorem 1.51 (Integration by Parts) Suppose that Bt is a standard Brownian motion and F(t) is an absolutely continuous function. Then

∫_0^s F(t)dBt = F(s)Bs − ∫_0^s Bt dF(t).  (1.92)

Proof: By defining h(t, Bt) = F(t)Bt and using the Ito formula, we get

d(F(t)Bt) = Bt dF(t) + F(t)dBt.

Thus

F(s)Bs = ∫_0^s d(F(t)Bt) = ∫_0^s Bt dF(t) + ∫_0^s F(t)dBt

which is just (1.92).
1.17 Stochastic Differential Equation

This section introduces a type of stochastic differential equations driven by Brownian motion.

Definition 1.39 Suppose Bt is a standard Brownian motion, and f and g are some given functions. Then

dXt = f(t, Xt)dt + g(t, Xt)dBt  (1.93)

is called a stochastic differential equation. A solution is a stochastic process Xt that satisfies (1.93) identically in t.

Remark 1.7: Note that there is no precise definition for the terms dXt, dt and dBt in the stochastic differential equation (1.93). The mathematically meaningful form is the stochastic integral equation

Xs = X0 + ∫_0^s f(t, Xt)dt + ∫_0^s g(t, Xt)dBt.  (1.94)
However, the differential form is convenient for us. This is the main reason why we accept the differential form.

Example 1.45: Let Bt be a standard Brownian motion. Then the stochastic differential equation

dXt = a dt + b dBt

has a solution Xt = at + bBt, which is just a Brownian motion with drift coefficient a and diffusion coefficient b.

Example 1.46: Let Bt be a standard Brownian motion. Then the stochastic differential equation

dXt = aXt dt + bXt dBt

has a solution

Xt = exp( (a − b²/2)t + bBt )

which is just a geometric Brownian motion.

Example 1.47: Let Bt be a standard Brownian motion. Then the stochastic differential equations

dXt = −(1/2)Xt dt − Yt dBt,
dYt = −(1/2)Yt dt + Xt dBt

have a solution

(Xt, Yt) = (cos Bt, sin Bt)

which is called a Brownian motion on the unit circle since Xt² + Yt² ≡ 1.
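The closed-form solution of Example 1.46 can be compared against a direct Euler-Maruyama discretization of the equation, driven by the same Brownian increments. A sketch with invented parameters and a fixed seed; the discretization is an approximation, so the two endpoints agree only up to a small error.

```python
import math
import random

# Sketch: Euler-Maruyama discretization of dX = a*X dt + b*X dB from
# Example 1.46, compared with the closed-form solution
# X_t = exp((a - b^2/2) t + b B_t) on the same simulated path.

random.seed(3)
a, b, t, n = 0.05, 0.2, 1.0, 100000
dt = t / n

x, bm = 1.0, 0.0
for _ in range(n):
    db = random.gauss(0.0, math.sqrt(dt))
    x += a * x * dt + b * x * db  # Euler-Maruyama step
    bm += db                      # accumulate the driving Brownian path

closed_form = math.exp((a - b**2 / 2) * t + b * bm)
print(x, closed_form)  # the two endpoints nearly coincide
```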
Chapter 2

Credibility Theory

The concept of fuzzy set was initiated by Zadeh [245] via membership function in 1965. In order to measure a fuzzy event, Zadeh [248] proposed the concept of possibility measure. Although possibility measure has been widely used, it has no self-duality property. However, a self-dual measure is absolutely needed in both theory and practice. In order to define a self-dual measure, Liu and Liu [126] presented the concept of credibility measure. In addition, a sufficient and necessary condition for credibility measure was given by Li and Liu [100]. Credibility theory, founded by Liu [129] in 2004 and refined by Liu [132] in 2007, is a branch of mathematics for studying the behavior of fuzzy phenomena.

The emphasis in this chapter is mainly on credibility measure, credibility space, fuzzy variable, membership function, credibility distribution, independence, identical distribution, expected value, variance, moments, critical values, entropy, distance, convergence almost surely, convergence in credibility, convergence in mean, convergence in distribution, conditional credibility, fuzzy process, fuzzy calculus, and fuzzy differential equation.
2.1 Credibility Space

Let Θ be a nonempty set, and P the power set of Θ (i.e., the largest σ-algebra over Θ). Each element in P is called an event. In order to present an axiomatic definition of credibility, it is necessary to assign to each event A a number Cr{A} which indicates the credibility that A will occur. In order to ensure that the number Cr{A} has certain mathematical properties which we intuitively expect a credibility to have, we accept the following four axioms:

Axiom 1. (Normality) Cr{Θ} = 1.
Axiom 2. (Monotonicity) Cr{A} ≤ Cr{B} whenever A ⊂ B.
Axiom 3. (Self-Duality) Cr{A} + Cr{A^c} = 1 for any event A.
Axiom 4. (Maximality) Cr{∪i Ai} = sup_i Cr{Ai} for any events {Ai} with sup_i Cr{Ai} < 0.5.

Definition 2.1 (Liu and Liu [126]) The set function Cr is called a credibility measure if it satisfies the normality, monotonicity, self-duality, and maximality axioms.

Example 2.1: Let Θ = {θ1, θ2}. For this case, there are only four events: ∅, {θ1}, {θ2}, Θ. Define Cr{∅} = 0, Cr{θ1} = 0.7, Cr{θ2} = 0.3, and Cr{Θ} = 1. Then the set function Cr is a credibility measure because it satisfies the four axioms.

Example 2.2: Let Θ be a nonempty set. Define Cr{∅} = 0, Cr{Θ} = 1 and Cr{A} = 1/2 for any subset A (excluding ∅ and Θ). Then the set function Cr is a credibility measure.

Theorem 2.1 Let Θ be a nonempty set, P the power set of Θ, and Cr the credibility measure. Then Cr{∅} = 0 and 0 ≤ Cr{A} ≤ 1 for any A ∈ P.

Proof: It follows from Axioms 1 and 3 that Cr{∅} = 1 − Cr{Θ} = 1 − 1 = 0. Since ∅ ⊂ A ⊂ Θ, we have 0 ≤ Cr{A} ≤ 1 by using Axiom 2.

Theorem 2.2 Let Θ be a nonempty set, P the power set of Θ, and Cr the credibility measure. Then for any A, B ∈ P, we have

Cr{A ∪ B} = Cr{A} ∨ Cr{B} if Cr{A ∪ B} ≤ 0.5,  (2.1)
Cr{A ∩ B} = Cr{A} ∧ Cr{B} if Cr{A ∩ B} ≥ 0.5.  (2.2)

The above equations hold not only for finitely many events but also for infinitely many events.

Proof: If Cr{A ∪ B} < 0.5, then Cr{A} ∨ Cr{B} < 0.5 by using Axiom 2. Thus the equation (2.1) follows immediately from Axiom 4. If Cr{A ∪ B} = 0.5 and (2.1) does not hold, then we have Cr{A} ∨ Cr{B} < 0.5. It follows from Axiom 4 that Cr{A ∪ B} = Cr{A} ∨ Cr{B} < 0.5. A contradiction proves (2.1). Next we prove (2.2). Since Cr{A ∩ B} ≥ 0.5, we have Cr{A^c ∪ B^c} ≤ 0.5 by the self-duality. Thus

Cr{A ∩ B} = 1 − Cr{A^c ∪ B^c} = 1 − Cr{A^c} ∨ Cr{B^c} = Cr{A} ∧ Cr{B}.

The theorem is proved.
Credibility Subadditivity Theorem

Theorem 2.3 (Liu [129], Credibility Subadditivity Theorem) The credibility measure is subadditive. That is,

Cr{A ∪ B} ≤ Cr{A} + Cr{B}  (2.3)

for any events A and B. In fact, credibility measure is not only finitely subadditive but also countably subadditive.

Proof: The argument breaks down into three cases.
Case 1: Cr{A} < 0.5 and Cr{B} < 0.5. It follows from Axiom 4 that

Cr{A ∪ B} = Cr{A} ∨ Cr{B} ≤ Cr{A} + Cr{B}.

Case 2: Cr{A} ≥ 0.5. For this case, by using Axioms 2 and 3, we have Cr{A^c} ≤ 0.5 and Cr{A ∪ B} ≥ Cr{A} ≥ 0.5. Then

Cr{A^c} = Cr{A^c ∩ B} ∨ Cr{A^c ∩ B^c} ≤ Cr{A^c ∩ B} + Cr{A^c ∩ B^c} ≤ Cr{B} + Cr{A^c ∩ B^c}.

Applying this inequality, we obtain

Cr{A} + Cr{B} = 1 − Cr{A^c} + Cr{B} ≥ 1 − Cr{B} − Cr{A^c ∩ B^c} + Cr{B} = 1 − Cr{A^c ∩ B^c} = Cr{A ∪ B}.

Case 3: Cr{B} ≥ 0.5. This case may be proved by a similar process of Case 2. The theorem is proved.

Remark 2.1: For any events A and B, it follows from the credibility subadditivity theorem that the credibility measure is null-additive, i.e., Cr{A ∪ B} = Cr{A} + Cr{B} if either Cr{A} = 0 or Cr{B} = 0.

Theorem 2.4 Let {Bi} be a decreasing sequence of events with Cr{Bi} → 0 as i → ∞. Then for any event A, we have

lim_{i→∞} Cr{A ∪ Bi} = lim_{i→∞} Cr{A\Bi} = Cr{A}.  (2.4)

Proof: It follows from the monotonicity axiom and credibility subadditivity theorem that

Cr{A} ≤ Cr{A ∪ Bi} ≤ Cr{A} + Cr{Bi}
for each i. Thus we get Cr{A ∪ Bi} → Cr{A} by using Cr{Bi} → 0. Since (A\Bi) ⊂ A ⊂ ((A\Bi) ∪ Bi), we have

Cr{A\Bi} ≤ Cr{A} ≤ Cr{A\Bi} + Cr{Bi}.

Hence Cr{A\Bi} → Cr{A} by using Cr{Bi} → 0.

Theorem 2.5 A credibility measure on Θ is additive if and only if there are at most two elements in Θ taking nonzero credibility values.

Proof: Suppose that the credibility measure is additive. If there are more than two elements taking nonzero credibility values, then we may choose three elements θ1, θ2, θ3 such that Cr{θ1} ≥ Cr{θ2} ≥ Cr{θ3} > 0. If Cr{θ1} ≥ 0.5, it follows from Axioms 2 and 3 that

Cr{θ2, θ3} ≤ Cr{Θ \ {θ1}} = 1 − Cr{θ1} ≤ 0.5.

By using Axiom 4, we obtain

Cr{θ2, θ3} = Cr{θ2} ∨ Cr{θ3} < Cr{θ2} + Cr{θ3}.

This is in contradiction with the additivity assumption. If Cr{θ1} < 0.5, then Cr{θ3} ≤ Cr{θ2} < 0.5. It follows from Axiom 4 that

Cr{θ2, θ3} ∧ 0.5 = Cr{θ2} ∨ Cr{θ3} < 0.5

which implies that

Cr{θ2, θ3} = Cr{θ2} ∨ Cr{θ3} < Cr{θ2} + Cr{θ3}.

This is also in contradiction with the additivity assumption. Hence there are at most two elements taking nonzero credibility values. Conversely, suppose that there are at most two elements, say θ1 and θ2, taking nonzero credibility values. Let A and B be two disjoint events. The argument breaks down into two cases.
Case 1: Either Cr{A} = 0 or Cr{B} = 0. For this case, we have Cr{A ∪ B} = Cr{A} + Cr{B} by using the credibility subadditivity theorem.
Case 2: Cr{A} > 0 and Cr{B} > 0. For this case, without loss of generality, we suppose that θ1 ∈ A and θ2 ∈ B. Note that Cr{(A ∪ B)^c} = 0. It follows from Axiom 3 and the credibility subadditivity theorem that

Cr{A ∪ B} = Cr{A ∪ B ∪ (A ∪ B)^c} = Cr{Θ} = 1,
Cr{A} + Cr{B} = Cr{A ∪ (A ∪ B)^c} + Cr{B} = 1.

Hence Cr{A ∪ B} = Cr{A} + Cr{B}. The additivity is proved.

Remark 2.2: Theorem 2.5 states that a credibility measure is identical with probability measure if there are effectively two elements in the universal set.
Credibility Semicontinuity Law

Generally speaking, the credibility measure is neither lower semicontinuous nor upper semicontinuous. However, we have the following credibility semicontinuity law.

Theorem 2.6 (Liu [129], Credibility Semicontinuity Law) For any events A1, A2, · · · , we have

lim_{i→∞} Cr{Ai} = Cr{ lim_{i→∞} Ai }  (2.5)

if one of the following conditions is satisfied:
(a) Cr{A} ≤ 0.5 and Ai ↑ A;
(b) lim_{i→∞} Cr{Ai} < 0.5 and Ai ↑ A;
(c) Cr{A} ≥ 0.5 and Ai ↓ A;
(d) lim_{i→∞} Cr{Ai} > 0.5 and Ai ↓ A.

Proof: (a) Since Cr{A} ≤ 0.5, we have Cr{Ai} ≤ 0.5 for each i. It follows from Axiom 4 that

Cr{A} = Cr{∪i Ai} = sup_i Cr{Ai} = lim_{i→∞} Cr{Ai}.

(b) Since lim_{i→∞} Cr{Ai} < 0.5, we have sup_i Cr{Ai} < 0.5. It follows from Axiom 4 that

Cr{A} = Cr{∪i Ai} = sup_i Cr{Ai} = lim_{i→∞} Cr{Ai}.

(c) Since Cr{A} ≥ 0.5 and Ai ↓ A, it follows from the self-duality of credibility measure that Cr{A^c} ≤ 0.5 and A^c_i ↑ A^c. Thus Cr{Ai} = 1 − Cr{A^c_i} → 1 − Cr{A^c} = Cr{A} as i → ∞.

(d) Since lim_{i→∞} Cr{Ai} > 0.5 and Ai ↓ A, it follows from the self-duality of credibility measure that

lim_{i→∞} Cr{A^c_i} = lim_{i→∞} (1 − Cr{Ai}) < 0.5

and A^c_i ↑ A^c. Thus Cr{Ai} = 1 − Cr{A^c_i} → 1 − Cr{A^c} = Cr{A} as i → ∞. The theorem is proved.

Credibility Asymptotic Theorem

Theorem 2.7 (Credibility Asymptotic Theorem) For any events A1, A2, · · · , we have

lim_{i→∞} Cr{Ai} ≥ 0.5, if Ai ↑ Θ,  (2.6)
lim_{i→∞} Cr{Ai} ≤ 0.5, if Ai ↓ ∅.  (2.7)
Proof: Assume Ai ↑ Θ. If lim_{i→∞} Cr{Ai} < 0.5, it follows from the credibility semicontinuity law that

Cr{Θ} = lim_{i→∞} Cr{Ai} < 0.5

which is in contradiction with Cr{Θ} = 1. The first inequality is proved. The second one may be verified similarly.

Credibility Extension Theorem

Suppose that the credibility of each singleton is given. Is the credibility measure fully and uniquely determined? This subsection will answer the question.

Theorem 2.8 Suppose that Θ is a nonempty set. If Cr is a credibility measure, then we have

sup_{θ∈Θ} Cr{θ} ≥ 0.5,  Cr{θ*} + sup_{θ≠θ*} Cr{θ} = 1 if Cr{θ*} ≥ 0.5.  (2.8)

We will call (2.8) the credibility extension condition.

Proof: If sup Cr{θ} < 0.5, then by using Axiom 4, we have

1 = Cr{Θ} = sup_{θ∈Θ} Cr{θ} < 0.5.

This contradiction proves sup Cr{θ} ≥ 0.5. We suppose that θ* ∈ Θ is a point with Cr{θ*} ≥ 0.5. It follows from Axioms 3 and 4 that Cr{Θ \ {θ*}} ≤ 0.5, and

Cr{Θ \ {θ*}} = sup_{θ≠θ*} Cr{θ}.

Hence the second formula of (2.8) is true by the self-duality of credibility measure.

Theorem 2.9 (Li and Liu [100], Credibility Extension Theorem) Suppose that Θ is a nonempty set, and Cr{θ} is a nonnegative function on Θ satisfying the credibility extension condition (2.8). Then Cr{θ} has a unique extension to a credibility measure as follows,

Cr{A} = sup_{θ∈A} Cr{θ} if sup_{θ∈A} Cr{θ} < 0.5;  Cr{A} = 1 − sup_{θ∈A^c} Cr{θ} if sup_{θ∈A} Cr{θ} ≥ 0.5.  (2.9)
Proof: We first prove that the set function Cr{A} defined by (2.9) is a credibility measure.

Step 1: By the credibility extension condition sup_{θ∈Θ} Cr{θ} ≥ 0.5, we have

Cr{Θ} = 1 − sup_{θ∈∅} Cr{θ} = 1 − 0 = 1.

Step 2: If A ⊂ B, then B^c ⊂ A^c. The proof breaks down into two cases.
Case 1: sup_{θ∈A} Cr{θ} < 0.5. For this case, we have

Cr{A} = sup_{θ∈A} Cr{θ} ≤ sup_{θ∈B} Cr{θ} ≤ Cr{B}.

Case 2: sup_{θ∈A} Cr{θ} ≥ 0.5. For this case, we have sup_{θ∈B} Cr{θ} ≥ 0.5, and

Cr{A} = 1 − sup_{θ∈A^c} Cr{θ} ≤ 1 − sup_{θ∈B^c} Cr{θ} = Cr{B}.

Step 3: In order to prove Cr{A} + Cr{A^c} = 1, the argument breaks down into two cases.
Case 1: sup_{θ∈A} Cr{θ} < 0.5. For this case, we have sup_{θ∈A^c} Cr{θ} ≥ 0.5. Thus

Cr{A} + Cr{A^c} = sup_{θ∈A} Cr{θ} + 1 − sup_{θ∈A} Cr{θ} = 1.

Case 2: sup_{θ∈A} Cr{θ} ≥ 0.5. For this case, we have sup_{θ∈A^c} Cr{θ} ≤ 0.5, and

Cr{A} + Cr{A^c} = 1 − sup_{θ∈A^c} Cr{θ} + sup_{θ∈A^c} Cr{θ} = 1.

Step 4: For any collection {Ai} with sup_i Cr{Ai} < 0.5, we have

Cr{∪_i Ai} = sup_{θ∈∪_i Ai} Cr{θ} = sup_i sup_{θ∈Ai} Cr{θ} = sup_i Cr{Ai}.

Thus Cr is a credibility measure because it satisfies the four axioms.

Finally, let us prove the uniqueness. Assume that Cr1 and Cr2 are two credibility measures such that Cr1{θ} = Cr2{θ} for each θ ∈ Θ. Let us prove that Cr1{A} = Cr2{A} for any event A. The argument breaks down into three cases.
Case 1: Cr1{A} < 0.5. For this case, it follows from Axiom 4 that

Cr1{A} = sup_{θ∈A} Cr1{θ} = sup_{θ∈A} Cr2{θ} = Cr2{A}.

Case 2: Cr1{A} > 0.5. For this case, we have Cr1{A^c} < 0.5. It follows from the first case that Cr1{A^c} = Cr2{A^c}, which implies Cr1{A} = Cr2{A}.
Case 3: Cr1{A} = 0.5. For this case, we have Cr1{A^c} = 0.5, and

Cr2{A} ≥ sup_{θ∈A} Cr2{θ} = sup_{θ∈A} Cr1{θ} = Cr1{A} = 0.5,
Cr2{A^c} ≥ sup_{θ∈A^c} Cr2{θ} = sup_{θ∈A^c} Cr1{θ} = Cr1{A^c} = 0.5.

Hence Cr2{A} = 0.5 = Cr1{A}. The uniqueness is proved.
Credibility Space Definition 2.2 Let Θ be a nonempty set, P the power set of Θ, and Cr a credibility measure. Then the triplet (Θ, P, Cr) is called a credibility space. Example 2.3: The triplet (Θ, P, Cr) is a credibility space if Θ = {θ1 , θ2 , · · · }, Cr{θi } ≡ 1/2 for i = 1, 2, · · ·
(2.10)
Note that the credibility measure is produced by the credibility extension theorem as follows, 0, if A = ∅ 1, if A = Θ Cr{A} = 1/2, otherwise. Example 2.4: The triplet (Θ, P, Cr) is a credibility space if Θ = {θ1 , θ2 , · · · }, Cr{θi } = i/(2i + 1) for i = 1, 2, · · ·
(2.11)
By using the credibility extension theorem, we obtain the following credibility measure, i sup , if A is finite 2i + 1 θi ∈A Cr{A} = i 1 − sup , if A is infinite. c 2i + 1 θi ∈A Example 2.5: The triplet (Θ, P, Cr) is a credibility space if Θ = {θ1 , θ2 , · · · }, Cr{θ1 } = 1/2, Cr{θi } = 1/i for i = 2, 3, · · ·
(2.12)
For this case, the credibility measure is sup 1/i, if A contains neither θ1 nor θ2 θi ∈A 1/2, if A contains only one of θ1 and θ2 Cr{A} = 1 − sup 1/i, if A contains both θ1 and θ2 . θi ∈Ac
Example 2.6: The triplet (Θ, P, Cr) is a credibility space if Θ = [0, 1],
Cr{θ} = θ/2 for θ ∈ Θ.
For this case, the credibility measure is 1 sup θ, if sup θ < 1 2 θ∈A θ∈A Cr{A} = 1 1 − sup θ, if sup θ = 1. 2 θ∈Ac θ∈A
(2.13)
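The extension formula (2.9) lends itself to direct computation. The following sketch (the `extend` helper and the finite truncation of Θ are our assumptions, not part of the book) reproduces the measure of Example 2.5:

```python
# Hedged sketch: Cr{A} built from singleton credibilities via the
# credibility extension theorem (2.9). Helper names are ours.
def extend(cr_singleton, A, theta):
    """cr_singleton: dict theta -> Cr{theta}; A: a subset of theta."""
    sup_A = max((cr_singleton[t] for t in A), default=0.0)
    if sup_A < 0.5:
        return sup_A                       # first branch of (2.9)
    sup_Ac = max((cr_singleton[t] for t in theta - A), default=0.0)
    return 1.0 - sup_Ac                    # second branch of (2.9)

# Example 2.5: Cr{theta_1} = 1/2, Cr{theta_i} = 1/i for i >= 2,
# truncated to 100 points for the demonstration.
theta = set(range(1, 101))
cr = {1: 0.5}
cr.update({i: 1.0 / i for i in range(2, 101)})

print(extend(cr, {3, 4}, theta))   # neither theta_1 nor theta_2: sup 1/i = 1/3
print(extend(cr, {1, 3}, theta))   # only one of theta_1, theta_2: 0.5
print(extend(cr, {1, 2}, theta))   # both: 1 - sup_{A^c} 1/i = 1 - 1/3 = 2/3
```

Note that the result is automatically self-dual: `extend` of A and of its complement always sum to 1, as Step 3 of the proof above guarantees.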
Product Credibility Measure

Product credibility measure may be defined in multiple ways. This book accepts the following axiom.

Axiom 5. (Product Credibility Axiom) Let Θk be nonempty sets on which Crk are credibility measures, k = 1, 2, · · · , n, respectively, and Θ = Θ1 × Θ2 × · · · × Θn. Then

Cr{(θ1, θ2, · · · , θn)} = Cr1{θ1} ∧ Cr2{θ2} ∧ · · · ∧ Crn{θn}    (2.14)

for each (θ1, θ2, · · · , θn) ∈ Θ.

Theorem 2.10 (Product Credibility Theorem) Let Θk be nonempty sets on which Crk are the credibility measures, k = 1, 2, · · · , n, respectively, and Θ = Θ1 × Θ2 × · · · × Θn. Then Cr = Cr1 ∧ Cr2 ∧ · · · ∧ Crn defined by Axiom 5 has a unique extension to a credibility measure on Θ as follows:

Cr{A} = sup_{(θ1,θ2,··· ,θn)∈A} min_{1≤k≤n} Crk{θk},          if sup_{(θ1,θ2,··· ,θn)∈A} min_{1≤k≤n} Crk{θk} < 0.5;
Cr{A} = 1 − sup_{(θ1,θ2,··· ,θn)∈A^c} min_{1≤k≤n} Crk{θk},    if sup_{(θ1,θ2,··· ,θn)∈A} min_{1≤k≤n} Crk{θk} ≥ 0.5.    (2.15)
Proof: For each θ = (θ1, θ2, · · · , θn) ∈ Θ, we have Cr{θ} = Cr1{θ1} ∧ Cr2{θ2} ∧ · · · ∧ Crn{θn}. Let us prove that Cr{θ} satisfies the credibility extension condition. Since sup_{θk∈Θk} Crk{θk} ≥ 0.5 for each k, we have

sup_{θ∈Θ} Cr{θ} = sup_{(θ1,θ2,··· ,θn)∈Θ} min_{1≤k≤n} Crk{θk} ≥ 0.5.

Now we suppose that θ* = (θ1*, θ2*, · · · , θn*) is a point with Cr{θ*} ≥ 0.5. Without loss of generality, let i be the index such that

Cr{θ*} = min_{1≤k≤n} Crk{θk*} = Cri{θi*}.    (2.16)

We also immediately have

Crk{θk*} ≥ 0.5,  k = 1, 2, · · · , n;    (2.17)

Crk{θk*} + sup_{θk≠θk*} Crk{θk} = 1,  k = 1, 2, · · · , n;    (2.18)

sup_{θi≠θi*} Cri{θi} ≥ sup_{θk≠θk*} Crk{θk},  k = 1, 2, · · · , n;    (2.19)

sup_{θk≠θk*} Crk{θk} ≤ 0.5,  k = 1, 2, · · · , n.    (2.20)

It follows from (2.17) and (2.20) that

sup_{θ≠θ*} Cr{θ} = sup_{(θ1,θ2,··· ,θn)≠(θ1*,θ2*,··· ,θn*)} min_{1≤k≤n} Crk{θk}
  ≥ sup_{θi≠θi*} ( min_{1≤k≤i−1} Crk{θk*} ∧ Cri{θi} ∧ min_{i+1≤k≤n} Crk{θk*} )
  = sup_{θi≠θi*} Cri{θi}.

We next suppose that

sup_{θ≠θ*} Cr{θ} > sup_{θi≠θi*} Cri{θi}.

Then there is a point (θ1', θ2', · · · , θn') ≠ (θ1*, θ2*, · · · , θn*) such that

min_{1≤k≤n} Crk{θk'} > sup_{θi≠θi*} Cri{θi}.

Let j be one of the indexes such that θj' ≠ θj*. Then

Crj{θj'} > sup_{θi≠θi*} Cri{θi}.

That is,

sup_{θj≠θj*} Crj{θj} > sup_{θi≠θi*} Cri{θi},

which is in contradiction with (2.19). Thus

sup_{θ≠θ*} Cr{θ} = sup_{θi≠θi*} Cri{θi}.    (2.21)

It follows from (2.16), (2.18) and (2.21) that

Cr{θ*} + sup_{θ≠θ*} Cr{θ} = Cri{θi*} + sup_{θi≠θi*} Cri{θi} = 1.

Thus Cr satisfies the credibility extension condition. It follows from the credibility extension theorem that Cr{A} is just the unique extension of Cr{θ}. The theorem is proved.

Definition 2.3 Let (Θk, Pk, Crk), k = 1, 2, · · · , n be credibility spaces, Θ = Θ1 × Θ2 × · · · × Θn and Cr = Cr1 ∧ Cr2 ∧ · · · ∧ Crn. Then (Θ, P, Cr) is called the product credibility space of (Θk, Pk, Crk), k = 1, 2, · · · , n.
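On finite factor spaces, the product formula (2.15) can be evaluated by brute force. A minimal sketch (the `product_cr` helper and the sample spaces are our own constructions):

```python
from itertools import product

# Hedged sketch of (2.15) on finite factor spaces; helper names are ours.
def product_cr(cr_list, A):
    """Cr{A} on Theta_1 x ... x Theta_n, where A is a set of n-tuples and
    cr_list holds one dict theta -> Cr{theta} per factor space."""
    theta = set(product(*[list(d) for d in cr_list]))
    def m(point):                            # min_k Cr_k{theta_k}, Axiom 5
        return min(d[t] for d, t in zip(cr_list, point))
    sup_A = max((m(p) for p in A), default=0.0)
    if sup_A < 0.5:
        return sup_A
    return 1.0 - max((m(p) for p in theta - A), default=0.0)

# Both factor measures satisfy the extension condition (2.8).
cr1 = {"a": 0.5, "b": 0.5}
cr2 = {"x": 0.7, "y": 0.3}
A = {("a", "x")}
Ac = set(product("ab", "xy")) - A
print(product_cr([cr1, cr2], A))                              # min(0.5, 0.7) = 0.5
print(product_cr([cr1, cr2], A) + product_cr([cr1, cr2], Ac)) # self-duality: 1.0
```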
Theorem 2.11 (Infinite Product Credibility Theorem) Suppose that Θk are nonempty sets, Crk the credibility measures on Pk, k = 1, 2, · · · , respectively. Let Θ = Θ1 × Θ2 × · · ·. Then

Cr{A} = sup_{(θ1,θ2,··· )∈A} inf_{1≤k<∞} Crk{θk},          if sup_{(θ1,θ2,··· )∈A} inf_{1≤k<∞} Crk{θk} < 0.5;
Cr{A} = 1 − sup_{(θ1,θ2,··· )∈A^c} inf_{1≤k<∞} Crk{θk},    if sup_{(θ1,θ2,··· )∈A} inf_{1≤k<∞} Crk{θk} ≥ 0.5

is a credibility measure on P.

Proof: Like Theorem 2.10 except Cr{(θ1, θ2, · · · )} = inf_{1≤k<∞} Crk{θk}.

Definition 2.4 Let (Θk, Pk, Crk), k = 1, 2, · · · be credibility spaces. Define Θ = Θ1 × Θ2 × · · · and Cr = Cr1 ∧ Cr2 ∧ · · ·. Then (Θ, P, Cr) is called the infinite product credibility space of (Θk, Pk, Crk), k = 1, 2, · · ·
2.2 Fuzzy Variables
Definition 2.5 A fuzzy variable is a (measurable) function from a credibility space (Θ, P, Cr) to the set of real numbers.

Example 2.7: Take (Θ, P, Cr) to be {θ1, θ2} with Cr{θ1} = Cr{θ2} = 0.5. Then the function

ξ(θ) = 0 if θ = θ1;  ξ(θ) = 1 if θ = θ2

is a fuzzy variable.

Example 2.8: Take (Θ, P, Cr) to be the interval [0, 1] with Cr{θ} = θ/2 for each θ ∈ [0, 1]. Then the identity function ξ(θ) = θ is a fuzzy variable.

Example 2.9: A crisp number c may be regarded as a special fuzzy variable. In fact, it is the constant function ξ(θ) ≡ c on the credibility space (Θ, P, Cr).

Remark 2.3: Since a fuzzy variable ξ is a function on a credibility space, for any set B of real numbers, the set

{ξ ∈ B} = {θ ∈ Θ | ξ(θ) ∈ B}    (2.22)

is always an element in P. In other words, the fuzzy variable ξ is always a measurable function and {ξ ∈ B} is always an event.
Definition 2.6 A fuzzy variable ξ is said to be
(a) nonnegative if Cr{ξ < 0} = 0;
(b) positive if Cr{ξ ≤ 0} = 0;
(c) continuous if Cr{ξ = x} is a continuous function of x;
(d) simple if there exists a finite sequence {x1, x2, · · · , xm} such that

Cr{ξ ≠ x1, ξ ≠ x2, · · · , ξ ≠ xm} = 0;    (2.23)

(e) discrete if there exists a countable sequence {x1, x2, · · · } such that

Cr{ξ ≠ x1, ξ ≠ x2, · · · } = 0.    (2.24)

Definition 2.7 Let ξ1 and ξ2 be fuzzy variables defined on the credibility space (Θ, P, Cr). We say ξ1 = ξ2 if ξ1(θ) = ξ2(θ) for almost all θ ∈ Θ.

Fuzzy Vector

Definition 2.8 An n-dimensional fuzzy vector is defined as a function from a credibility space (Θ, P, Cr) to the set of n-dimensional real vectors.

Theorem 2.12 The vector (ξ1, ξ2, · · · , ξn) is a fuzzy vector if and only if ξ1, ξ2, · · · , ξn are fuzzy variables.

Proof: Write ξ = (ξ1, ξ2, · · · , ξn). Suppose that ξ is a fuzzy vector. Then ξ1, ξ2, · · · , ξn are functions from Θ to ℜ. Thus ξ1, ξ2, · · · , ξn are fuzzy variables. Conversely, suppose that ξi are fuzzy variables defined on the credibility spaces (Θi, Pi, Cri), i = 1, 2, · · · , n, respectively. It is clear that (ξ1, ξ2, · · · , ξn) is a function from the product credibility space (Θ, P, Cr) to ℜ^n, i.e.,

ξ(θ1, θ2, · · · , θn) = (ξ1(θ1), ξ2(θ2), · · · , ξn(θn)).    (2.25)

Thus ξ is a fuzzy vector. The theorem is proved.
Example 2.10: Let ξ1 and ξ2 be fuzzy variables on the credibility space (Θ, P, Cr). Then their sum is

(ξ1 + ξ2)(θ) = ξ1(θ) + ξ2(θ),  ∀θ ∈ Θ,

and their product is

(ξ1 × ξ2)(θ) = ξ1(θ) × ξ2(θ),  ∀θ ∈ Θ.

The reader may wonder whether ξ(θ1, θ2, · · · , θn) defined by (2.25) is a fuzzy variable. The following theorem answers this question.

Theorem 2.13 Let ξ be an n-dimensional fuzzy vector, and f : ℜ^n → ℜ a function. Then f(ξ) is a fuzzy variable.
2.3 Membership Function
Definition 2.10 Let ξ be a fuzzy variable defined on the credibility space (Θ, P, Cr). Then its membership function is derived from the credibility measure by

µ(x) = (2Cr{ξ = x}) ∧ 1,  x ∈ ℜ.    (2.26)

The membership function represents the degree to which the fuzzy variable ξ takes some prescribed value. How do we determine membership functions? There are several methods reported in the past literature. In any case, the membership degree µ(x) = 0 if x is an impossible point, and µ(x) = 1 if x is the most possible point that ξ takes.

Example 2.11: It is clear that a fuzzy variable has a unique membership function. However, a membership function may produce multiple fuzzy variables. For example, let Θ = {θ1, θ2} and Cr{θ1} = Cr{θ2} = 0.5. Then (Θ, P, Cr) is a credibility space. We define

ξ1(θ) = 0 if θ = θ1, 1 if θ = θ2;    ξ2(θ) = 1 if θ = θ1, 0 if θ = θ2.

It is clear that both of them are fuzzy variables and have the same membership function, µ(x) ≡ 1 on x = 0 or 1.

Theorem 2.14 (Credibility Inversion Theorem) Let ξ be a fuzzy variable with membership function µ. Then for any set B of real numbers, we have

Cr{ξ ∈ B} = (1/2) ( sup_{x∈B} µ(x) + 1 − sup_{x∈B^c} µ(x) ).    (2.27)
Proof: If Cr{ξ ∈ B} ≤ 0.5, then by Axiom 2, we have Cr{ξ = x} ≤ 0.5 for each x ∈ B. It follows from Axiom 4 that

Cr{ξ ∈ B} = (1/2) sup_{x∈B} (2Cr{ξ = x} ∧ 1) = (1/2) sup_{x∈B} µ(x).    (2.28)

The self-duality of credibility measure implies that Cr{ξ ∈ B^c} ≥ 0.5 and sup_{x∈B^c} Cr{ξ = x} ≥ 0.5, i.e.,

sup_{x∈B^c} µ(x) = sup_{x∈B^c} (2Cr{ξ = x} ∧ 1) = 1.    (2.29)

It follows from (2.28) and (2.29) that (2.27) holds.

If Cr{ξ ∈ B} ≥ 0.5, then Cr{ξ ∈ B^c} ≤ 0.5. It follows from the first case that

Cr{ξ ∈ B} = 1 − Cr{ξ ∈ B^c} = 1 − (1/2) ( sup_{x∈B^c} µ(x) + 1 − sup_{x∈B} µ(x) ) = (1/2) ( sup_{x∈B} µ(x) + 1 − sup_{x∈B^c} µ(x) ).

The theorem is proved.

Example 2.12: Let ξ be a fuzzy variable with membership function µ. Then the following equations follow immediately from Theorem 2.14:

Cr{ξ = x} = (1/2) ( µ(x) + 1 − sup_{y≠x} µ(y) ),  ∀x ∈ ℜ;    (2.30)

Cr{ξ ≤ x} = (1/2) ( sup_{y≤x} µ(y) + 1 − sup_{y>x} µ(y) ),  ∀x ∈ ℜ;    (2.31)

Cr{ξ ≥ x} = (1/2) ( sup_{y≥x} µ(y) + 1 − sup_{y<x} µ(y) ),  ∀x ∈ ℜ.    (2.32)

Especially, if µ is a continuous function, then

Cr{ξ = x} = µ(x)/2,  ∀x ∈ ℜ.    (2.33)
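For a fuzzy variable with discrete support, the inversion formula (2.27) is a one-liner. The sketch below (the `cr_in` helper is our own naming; restricting the complement supremum to the support is safe because µ vanishes elsewhere) evaluates the special cases (2.30)–(2.31):

```python
# Hedged numerical form of the credibility inversion theorem (2.27).
def cr_in(mu, B, support):
    """Cr{xi in B} = (sup_B mu + 1 - sup_{B^c} mu) / 2 for discrete mu."""
    sup_B  = max((mu[x] for x in B), default=0.0)
    sup_Bc = max((mu[x] for x in support - B), default=0.0)
    return 0.5 * (sup_B + 1.0 - sup_Bc)

mu = {0: 1.0, 1: 0.6, 2: 0.8}         # a discrete membership function
support = set(mu)
print(cr_in(mu, {1}, support))         # Cr{xi = 1} = (0.6 + 1 - 1)/2 = 0.3
print(cr_in(mu, {0, 1}, support))      # Cr{xi <= 1} = (1 + 1 - 0.8)/2 = 0.6
print(cr_in(mu, {0, 1, 2}, support))   # whole support: 1.0
```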
Theorem 2.15 (Sufficient and Necessary Condition for Membership Function) A function µ : ℜ → [0, 1] is a membership function if and only if sup µ(x) = 1.

Proof: If µ is a membership function, then there exists a fuzzy variable ξ whose membership function is just µ, and

sup_{x∈ℜ} µ(x) = sup_{x∈ℜ} (2Cr{ξ = x}) ∧ 1.

If there is some point x ∈ ℜ such that Cr{ξ = x} ≥ 0.5, then sup µ(x) = 1. Otherwise, we have Cr{ξ = x} < 0.5 for each x ∈ ℜ. It follows from Axiom 4 that

sup_{x∈ℜ} µ(x) = sup_{x∈ℜ} (2Cr{ξ = x}) ∧ 1 = 2 sup_{x∈ℜ} Cr{ξ = x} = 2 (Cr{Θ} ∧ 0.5) = 1.

Conversely, suppose that sup µ(x) = 1. For each x ∈ ℜ, we define

Cr{x} = (1/2) ( µ(x) + 1 − sup_{y≠x} µ(y) ).

It is clear that

sup_{x∈ℜ} Cr{x} ≥ (1/2)(1 + 1 − 1) = 0.5.

For any x* ∈ ℜ with Cr{x*} ≥ 0.5, we have µ(x*) = 1 and

Cr{x*} + sup_{y≠x*} Cr{y} = (1/2) ( µ(x*) + 1 − sup_{y≠x*} µ(y) ) + sup_{y≠x*} (1/2) ( µ(y) + 1 − sup_{z≠y} µ(z) )
  = 1 − (1/2) sup_{y≠x*} µ(y) + (1/2) sup_{y≠x*} µ(y) = 1.

Thus Cr{x} satisfies the credibility extension condition, and has a unique extension to a credibility measure on P(ℜ) by using the credibility extension theorem. Now we define a fuzzy variable ξ as the identity function from the credibility space (ℜ, P(ℜ), Cr) to ℜ. Then the membership function of the fuzzy variable ξ is

(2Cr{ξ = x}) ∧ 1 = ( µ(x) + 1 − sup_{y≠x} µ(y) ) ∧ 1 = µ(x)

for each x. The theorem is proved.

Remark 2.4: Theorem 2.15 states that the identity function is a universal function for any fuzzy variable by defining an appropriate credibility space.

Theorem 2.16 A fuzzy variable ξ with membership function µ is
(a) nonnegative if and only if µ(x) = 0 for all x < 0;
(b) positive if and only if µ(x) = 0 for all x ≤ 0;
(c) simple if and only if µ takes nonzero values at a finite number of points;
(d) discrete if and only if µ takes nonzero values at a countable set of points;
(e) continuous if and only if µ is a continuous function.

Proof: The theorem is obvious since the membership function is µ(x) = (2Cr{ξ = x}) ∧ 1 for each x ∈ ℜ.
Some Special Membership Functions

By an equipossible fuzzy variable we mean the fuzzy variable fully determined by the pair (a, b) of crisp numbers with a < b, whose membership function is given by

µ1(x) = 1 if a ≤ x ≤ b;  µ1(x) = 0 otherwise.

By a triangular fuzzy variable we mean the fuzzy variable fully determined by the triplet (a, b, c) of crisp numbers with a < b < c, whose membership function is given by

µ2(x) = (x − a)/(b − a) if a ≤ x ≤ b;  µ2(x) = (x − c)/(b − c) if b ≤ x ≤ c;  µ2(x) = 0 otherwise.

By a trapezoidal fuzzy variable we mean the fuzzy variable fully determined by the quadruplet (a, b, c, d) of crisp numbers with a < b < c < d, whose membership function is given by

µ3(x) = (x − a)/(b − a) if a ≤ x ≤ b;  µ3(x) = 1 if b ≤ x ≤ c;  µ3(x) = (x − d)/(c − d) if c ≤ x ≤ d;  µ3(x) = 0 otherwise.
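These three piecewise definitions translate directly into code; a minimal sketch (function names are ours):

```python
# The three special membership functions above, written as plain functions.
def mu_equipossible(x, a, b):
    return 1.0 if a <= x <= b else 0.0

def mu_triangular(x, a, b, c):
    if a <= x <= b:
        return (x - a) / (b - a)
    if b <= x <= c:
        return (x - c) / (b - c)       # note: slope written as in the book
    return 0.0

def mu_trapezoidal(x, a, b, c, d):
    if a <= x <= b:
        return (x - a) / (b - a)
    if b <= x <= c:
        return 1.0                     # plateau between b and c
    if c <= x <= d:
        return (x - d) / (c - d)
    return 0.0

print(mu_triangular(1.5, 1, 2, 4))      # (1.5 - 1)/(2 - 1) = 0.5
print(mu_triangular(3.0, 1, 2, 4))      # (3 - 4)/(2 - 4) = 0.5
print(mu_trapezoidal(2.5, 1, 2, 3, 4))  # on the plateau: 1.0
```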
Figure 2.1: Membership Functions µ1, µ2 and µ3

Joint Membership Function

Definition 2.11 Let ξ = (ξ1, ξ2, · · · , ξn) be a fuzzy vector on the credibility space (Θ, P, Cr). Then its joint membership function is derived from the credibility measure by

µ(x) = (2Cr{ξ = x}) ∧ 1,  ∀x ∈ ℜ^n.    (2.34)

Theorem 2.17 (Sufficient and Necessary Condition for Joint Membership Function) A function µ : ℜ^n → [0, 1] is a joint membership function if and only if sup µ(x) = 1.

Proof: Like Theorem 2.15.
2.4 Credibility Distribution
Definition 2.12 (Liu [124]) The credibility distribution Φ : ℜ → [0, 1] of a fuzzy variable ξ is defined by

Φ(x) = Cr{θ ∈ Θ | ξ(θ) ≤ x}.    (2.35)

That is, Φ(x) is the credibility that the fuzzy variable ξ takes a value less than or equal to x. Generally speaking, the credibility distribution Φ is neither left-continuous nor right-continuous.

Example 2.13: The credibility distribution of an equipossible fuzzy variable (a, b) is

Φ1(x) = 0 if x < a;  Φ1(x) = 1/2 if a ≤ x < b;  Φ1(x) = 1 if x ≥ b.

Especially, if ξ is an equipossible fuzzy variable on ℜ, then Φ1(x) ≡ 1/2.

Example 2.14: The credibility distribution of a triangular fuzzy variable (a, b, c) is

Φ2(x) = 0 if x ≤ a;  Φ2(x) = (x − a)/(2(b − a)) if a ≤ x ≤ b;  Φ2(x) = (x + c − 2b)/(2(c − b)) if b ≤ x ≤ c;  Φ2(x) = 1 if x ≥ c.
Example 2.15: The credibility distribution of a trapezoidal fuzzy variable (a, b, c, d) is

Φ3(x) = 0 if x ≤ a;  Φ3(x) = (x − a)/(2(b − a)) if a ≤ x ≤ b;  Φ3(x) = 1/2 if b ≤ x ≤ c;  Φ3(x) = (x + d − 2c)/(2(d − c)) if c ≤ x ≤ d;  Φ3(x) = 1 if x ≥ d.
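The trapezoidal distribution Φ3 is again a direct piecewise translation; a short sketch (names and sample parameters are ours):

```python
# Credibility distribution of the trapezoidal fuzzy variable (a, b, c, d),
# following Example 2.15.
def phi_trapezoidal(x, a, b, c, d):
    if x <= a:
        return 0.0
    if x <= b:
        return (x - a) / (2 * (b - a))
    if x <= c:
        return 0.5                          # flat at 1/2 between b and c
    if x <= d:
        return (x + d - 2 * c) / (2 * (d - c))
    return 1.0

print(phi_trapezoidal(1.5, 1, 2, 3, 4))   # (1.5 - 1)/(2*1) = 0.25
print(phi_trapezoidal(2.7, 1, 2, 3, 4))   # 0.5 on [b, c]
print(phi_trapezoidal(3.5, 1, 2, 3, 4))   # (3.5 + 4 - 6)/(2*1) = 0.75
```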
Figure 2.2: Credibility Distributions Φ1, Φ2 and Φ3

Theorem 2.18 Let ξ be a fuzzy variable with membership function µ. Then its credibility distribution is

Φ(x) = (1/2) ( sup_{y≤x} µ(y) + 1 − sup_{y>x} µ(y) ),  ∀x ∈ ℜ.    (2.36)

Proof: It follows from the credibility inversion theorem immediately.

Theorem 2.19 (Liu [129], Sufficient and Necessary Condition for Credibility Distribution) A function Φ : ℜ → [0, 1] is a credibility distribution if and only if it is an increasing function with

lim_{x→−∞} Φ(x) ≤ 0.5 ≤ lim_{x→∞} Φ(x),    (2.37)

lim_{y↓x} Φ(y) = Φ(x) if lim_{y↓x} Φ(y) > 0.5 or Φ(x) ≥ 0.5.    (2.38)

Proof: It is obvious that a credibility distribution Φ is an increasing function. The inequalities (2.37) follow from the credibility asymptotic theorem immediately. Assume that x is a point at which lim_{y↓x} Φ(y) > 0.5. That is,

lim_{y↓x} Cr{ξ ≤ y} > 0.5.
Since {ξ ≤ y} ↓ {ξ ≤ x} as y ↓ x, it follows from the credibility semicontinuity law that

Φ(y) = Cr{ξ ≤ y} ↓ Cr{ξ ≤ x} = Φ(x) as y ↓ x.

When x is a point at which Φ(x) ≥ 0.5, if lim_{y↓x} Φ(y) ≠ Φ(x), then we have

lim_{y↓x} Φ(y) > Φ(x) ≥ 0.5.

For this case, we have proved that lim_{y↓x} Φ(y) = Φ(x). Thus (2.37) and (2.38) are proved.

Conversely, if Φ : ℜ → [0, 1] is an increasing function satisfying (2.37) and (2.38), then

µ(x) = 2Φ(x) if Φ(x) < 0.5;  µ(x) = 1 if lim_{y↑x} Φ(y) < 0.5 ≤ Φ(x);  µ(x) = 2 − 2Φ(x) if 0.5 ≤ lim_{y↑x} Φ(y)    (2.39)

takes values in [0, 1] and sup µ(x) = 1. It follows from Theorem 2.15 that there is a fuzzy variable ξ whose membership function is just µ. Let us verify that Φ is the credibility distribution of ξ, i.e., Cr{ξ ≤ x} = Φ(x) for each x. The argument breaks down into two cases.
(i) If Φ(x) < 0.5, then we have sup_{y>x} µ(y) = 1, and µ(y) = 2Φ(y) for each y with y ≤ x. Thus

Cr{ξ ≤ x} = (1/2) ( sup_{y≤x} µ(y) + 1 − sup_{y>x} µ(y) ) = sup_{y≤x} Φ(y) = Φ(x).

(ii) If Φ(x) ≥ 0.5, then we have sup_{y≤x} µ(y) = 1 and Φ(y) ≥ Φ(x) ≥ 0.5 for each y with y > x. Thus µ(y) = 2 − 2Φ(y) and

Cr{ξ ≤ x} = (1/2) ( sup_{y≤x} µ(y) + 1 − sup_{y>x} µ(y) ) = (1/2) ( 1 + 1 − sup_{y>x} (2 − 2Φ(y)) ) = inf_{y>x} Φ(y) = lim_{y↓x} Φ(y) = Φ(x).

The theorem is proved.

Example 2.16: Let a and b be two numbers with 0 ≤ a ≤ 0.5 ≤ b ≤ 1. We define a fuzzy variable by the following membership function:

µ(x) = 2a if x < 0;  µ(x) = 1 if x = 0;  µ(x) = 2 − 2b if x > 0.
Then its credibility distribution is

Φ(x) = a if x < 0;  Φ(x) = b if x ≥ 0.

Thus we have

lim_{x→−∞} Φ(x) = a,  lim_{x→+∞} Φ(x) = b.
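The three-branch construction (2.39) used in the proof of Theorem 2.19 can also be sketched numerically. Below, the left limit is approximated by evaluating just below x, and the step distribution is an example of our own choosing that satisfies (2.37)–(2.38):

```python
# Hedged sketch of (2.39): recover a membership function mu from an
# increasing distribution Phi. The eps-based left limit is our approximation.
def mu_from_phi(phi, x, eps=1e-9):
    left = phi(x - eps)                # stand-in for lim_{y -> x-} Phi(y)
    if phi(x) < 0.5:
        return 2 * phi(x)
    if left < 0.5 <= phi(x):
        return 1.0                     # the jump through 0.5
    return 2 - 2 * phi(x)

# A step distribution: 0 on (-inf,0), 0.2 on [0,1), 0.9 on [1,+inf).
phi = lambda x: 0.0 if x < 0 else (0.2 if x < 1 else 0.9)
print(mu_from_phi(phi, -1))    # Phi = 0 < 0.5: mu = 0.0
print(mu_from_phi(phi, 0.5))   # Phi = 0.2 < 0.5: mu = 0.4
print(mu_from_phi(phi, 1))     # jump through 0.5: mu = 1.0
print(mu_from_phi(phi, 2))     # 0.5 <= left limit: 2 - 2*0.9 ≈ 0.2
```

As Theorem 2.15 requires, the recovered µ attains the supremum 1 (here at x = 1).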
Theorem 2.20 A fuzzy variable with credibility distribution Φ is
(a) nonnegative if and only if Φ(x) = 0 for all x < 0;
(b) positive if and only if Φ(x) = 0 for all x ≤ 0.

Proof: It follows immediately from the definition.

Theorem 2.21 Let ξ be a fuzzy variable. Then we have
(a) if ξ is simple, then its credibility distribution is a simple function;
(b) if ξ is discrete, then its credibility distribution is a step function;
(c) if ξ is continuous on the real line ℜ, then its credibility distribution is a continuous function.

Proof: Parts (a) and (b) follow immediately from the definition. Part (c) follows from Theorem 2.18 and the continuity of the membership function.

Example 2.17: However, the converse of Theorem 2.21 is not true. For example, let ξ be a fuzzy variable whose membership function is

µ(x) = x if 0 ≤ x ≤ 1;  µ(x) = 1 otherwise.

Then its credibility distribution is Φ(x) ≡ 0.5. It is clear that Φ(x) is simple and continuous. But the fuzzy variable ξ is neither simple nor continuous.

Definition 2.13 A continuous fuzzy variable is said to be
(a) singular if its credibility distribution is a singular function;
(b) absolutely continuous if its credibility distribution is absolutely continuous.

Definition 2.14 (Liu [124]) The credibility density function φ : ℜ → [0, +∞) of a fuzzy variable ξ is a function such that

Φ(x) = ∫_{−∞}^{x} φ(y)dy,  ∀x ∈ ℜ,    (2.40)

∫_{−∞}^{+∞} φ(y)dy = 1,    (2.41)

where Φ is the credibility distribution of the fuzzy variable ξ.
Example 2.18: The credibility density function of a triangular fuzzy variable (a, b, c) is

φ(x) = 1/(2(b − a)) if a ≤ x ≤ b;  φ(x) = 1/(2(c − b)) if b ≤ x ≤ c;  φ(x) = 0 otherwise.

Example 2.19: The credibility density function of a trapezoidal fuzzy variable (a, b, c, d) is

φ(x) = 1/(2(b − a)) if a ≤ x ≤ b;  φ(x) = 1/(2(d − c)) if c ≤ x ≤ d;  φ(x) = 0 otherwise.

Example 2.20: The credibility density function of an equipossible fuzzy variable (a, b) does not exist.

Example 2.21: The credibility density function does not necessarily exist even if the membership function is continuous and unimodal with a finite support. Let f be the Cantor function, and set

µ(x) = f(x) if 0 ≤ x ≤ 1;  µ(x) = f(2 − x) if 1 < x ≤ 2;  µ(x) = 0 otherwise.    (2.42)

Then µ is a continuous and unimodal function with µ(1) = 1. Hence µ is a membership function. However, its credibility distribution is not an absolutely continuous function. Thus the credibility density function does not exist.

Theorem 2.22 Let ξ be a fuzzy variable whose credibility density function φ exists. Then we have

Cr{ξ ≤ x} = ∫_{−∞}^{x} φ(y)dy,  Cr{ξ ≥ x} = ∫_{x}^{+∞} φ(y)dy.    (2.43)

Proof: The first part follows immediately from the definition. In addition, by the self-duality of credibility measure, we have

Cr{ξ ≥ x} = 1 − Cr{ξ < x} = ∫_{−∞}^{+∞} φ(y)dy − ∫_{−∞}^{x} φ(y)dy = ∫_{x}^{+∞} φ(y)dy.

The theorem is proved.
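Relations (2.40) and (2.43) can be checked by crude numerical integration. The sketch below uses a triangular variable (0, 1, 3) and a midpoint rule of our own choosing:

```python
# Hedged check of (2.40)/(2.43): integrating the credibility density of the
# triangular variable (0, 1, 3) recovers distribution and tail credibilities.
def phi_density(x, a=0.0, b=1.0, c=3.0):
    if a <= x <= b:
        return 1 / (2 * (b - a))       # 1/2 on [0, 1]
    if b <= x <= c:
        return 1 / (2 * (c - b))       # 1/4 on [1, 3]
    return 0.0

def integrate(f, lo, hi, n=20000):     # simple midpoint rule
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

print(integrate(phi_density, -1, 3))    # total mass (2.41): 1.0
print(integrate(phi_density, -1, 0.5))  # Phi(0.5) = 0.5/(2*1) = 0.25
print(integrate(phi_density, 2, 10))    # Cr{xi >= 2} = 1/4 * 1 = 0.25
```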
Example 2.22: Different from the random case, generally speaking,

Cr{a ≤ ξ ≤ b} ≠ ∫_{a}^{b} φ(y)dy.

Consider the trapezoidal fuzzy variable ξ = (1, 2, 3, 4). Then Cr{2 ≤ ξ ≤ 3} = 0.5. However, it is obvious that φ(x) = 0 when 2 ≤ x ≤ 3 and

∫_{2}^{3} φ(y)dy = 0 ≠ 0.5 = Cr{2 ≤ ξ ≤ 3}.
Joint Credibility Distribution

Definition 2.15 Let (ξ1, ξ2, · · · , ξn) be a fuzzy vector. Then the joint credibility distribution Φ : ℜ^n → [0, 1] is defined by

Φ(x1, x2, · · · , xn) = Cr{θ ∈ Θ | ξ1(θ) ≤ x1, ξ2(θ) ≤ x2, · · · , ξn(θ) ≤ xn}.

Definition 2.16 The joint credibility density function φ : ℜ^n → [0, +∞) of a fuzzy vector (ξ1, ξ2, · · · , ξn) is a function such that

Φ(x1, x2, · · · , xn) = ∫_{−∞}^{x1} ∫_{−∞}^{x2} · · · ∫_{−∞}^{xn} φ(y1, y2, · · · , yn) dy1 dy2 · · · dyn

holds for all (x1, x2, · · · , xn) ∈ ℜ^n, and

∫_{−∞}^{+∞} ∫_{−∞}^{+∞} · · · ∫_{−∞}^{+∞} φ(y1, y2, · · · , yn) dy1 dy2 · · · dyn = 1,

where Φ is the joint credibility distribution of the fuzzy vector (ξ1, ξ2, · · · , ξn).
2.5 Independence

The independence of fuzzy variables has been discussed by many authors from different angles, for example, Zadeh [248], Nahmias [163], Yager [231], Liu [129], Liu and Gao [148], and Li and Liu [99]. Many equivalent conditions of independence have been presented. Here we use the following one.

Definition 2.17 (Liu and Gao [148]) The fuzzy variables ξ1, ξ2, · · · , ξm are said to be independent if

Cr{ ∩_{i=1}^{m} {ξi ∈ Bi} } = min_{1≤i≤m} Cr{ξi ∈ Bi}    (2.44)

for any sets B1, B2, · · · , Bm of ℜ.
Theorem 2.23 The fuzzy variables ξ1, ξ2, · · · , ξm are independent if and only if

Cr{ ∪_{i=1}^{m} {ξi ∈ Bi} } = max_{1≤i≤m} Cr{ξi ∈ Bi}    (2.45)

for any sets B1, B2, · · · , Bm of ℜ.

Proof: It follows from the self-duality of credibility measure that ξ1, ξ2, · · · , ξm are independent if and only if

Cr{ ∪_{i=1}^{m} {ξi ∈ Bi} } = 1 − Cr{ ∩_{i=1}^{m} {ξi ∈ Bi^c} } = 1 − min_{1≤i≤m} Cr{ξi ∈ Bi^c} = max_{1≤i≤m} Cr{ξi ∈ Bi}.

Thus (2.45) is verified. The proof is complete.

Theorem 2.24 The fuzzy variables ξ1, ξ2, · · · , ξm are independent if and only if

Cr{ ∩_{i=1}^{m} {ξi = xi} } = min_{1≤i≤m} Cr{ξi = xi}    (2.46)

for any real numbers x1, x2, · · · , xm with Cr{ ∩_{i=1}^{m} {ξi = xi} } < 0.5.

Proof: If ξ1, ξ2, · · · , ξm are independent, then we have (2.46) immediately by taking Bi = {xi} for each i. Conversely, if Cr{ ∩_{i=1}^{m} {ξi ∈ Bi} } ≥ 0.5, it follows from Theorem 2.2 that (2.44) holds. Otherwise, we have Cr{ ∩_{i=1}^{m} {ξi = xi} } < 0.5 for any real numbers xi ∈ Bi, i = 1, 2, · · · , m, and

Cr{ ∩_{i=1}^{m} {ξi ∈ Bi} } = Cr{ ∪_{xi∈Bi, 1≤i≤m} ∩_{i=1}^{m} {ξi = xi} }
  = sup_{xi∈Bi, 1≤i≤m} Cr{ ∩_{i=1}^{m} {ξi = xi} }
  = sup_{xi∈Bi, 1≤i≤m} min_{1≤i≤m} Cr{ξi = xi}
  = min_{1≤i≤m} sup_{xi∈Bi} Cr{ξi = xi}
  = min_{1≤i≤m} Cr{ξi ∈ Bi}.

Hence (2.44) is true, and ξ1, ξ2, · · · , ξm are independent. The theorem is thus proved.

Theorem 2.25 Let µi be membership functions of fuzzy variables ξi, i = 1, 2, · · · , m, respectively, and µ the joint membership function of the fuzzy vector (ξ1, ξ2, · · · , ξm). Then the fuzzy variables ξ1, ξ2, · · · , ξm are independent if and only if

µ(x1, x2, · · · , xm) = min_{1≤i≤m} µi(xi)    (2.47)

for any real numbers x1, x2, · · · , xm.
Proof: Suppose that ξ1, ξ2, · · · , ξm are independent. It follows from Theorem 2.24 that

µ(x1, x2, · · · , xm) = ( 2Cr{ ∩_{i=1}^{m} {ξi = xi} } ) ∧ 1 = ( 2 min_{1≤i≤m} Cr{ξi = xi} ) ∧ 1 = min_{1≤i≤m} (2Cr{ξi = xi}) ∧ 1 = min_{1≤i≤m} µi(xi).

Conversely, for any real numbers x1, x2, · · · , xm with Cr{ ∩_{i=1}^{m} {ξi = xi} } < 0.5, we have

Cr{ ∩_{i=1}^{m} {ξi = xi} } = (1/2) ( 2Cr{ ∩_{i=1}^{m} {ξi = xi} } ) ∧ 1 = (1/2) µ(x1, x2, · · · , xm) = (1/2) min_{1≤i≤m} µi(xi) = (1/2) min_{1≤i≤m} (2Cr{ξi = xi}) ∧ 1 = min_{1≤i≤m} Cr{ξi = xi}.

It follows from Theorem 2.24 that ξ1, ξ2, · · · , ξm are independent. The theorem is proved.

Theorem 2.26 Let Φi be credibility distributions of fuzzy variables ξi, i = 1, 2, · · · , m, respectively, and Φ the joint credibility distribution of the fuzzy vector (ξ1, ξ2, · · · , ξm). If ξ1, ξ2, · · · , ξm are independent, then we have

Φ(x1, x2, · · · , xm) = min_{1≤i≤m} Φi(xi)    (2.48)

for any real numbers x1, x2, · · · , xm.

Proof: Since ξ1, ξ2, · · · , ξm are independent fuzzy variables, we have

Φ(x1, x2, · · · , xm) = Cr{ ∩_{i=1}^{m} {ξi ≤ xi} } = min_{1≤i≤m} Cr{ξi ≤ xi} = min_{1≤i≤m} Φi(xi)
for any real numbers x1, x2, · · · , xm. The theorem is proved.

Example 2.23: However, the equation (2.48) does not imply that the fuzzy variables are independent. For example, let ξ be a fuzzy variable with credibility distribution Φ. Then the joint credibility distribution Ψ of the fuzzy vector (ξ, ξ) is

Ψ(x1, x2) = Cr{ξ ≤ x1, ξ ≤ x2} = Cr{ξ ≤ x1} ∧ Cr{ξ ≤ x2} = Φ(x1) ∧ Φ(x2)

for any real numbers x1 and x2. But, generally speaking, a fuzzy variable is not independent of itself.

Theorem 2.27 Let ξ1, ξ2, · · · , ξm be independent fuzzy variables, and f1, f2, · · · , fm real-valued functions. Then f1(ξ1), f2(ξ2), · · · , fm(ξm) are independent fuzzy variables.

Proof: For any sets B1, B2, · · · , Bm of ℜ, we have

Cr{ ∩_{i=1}^{m} {fi(ξi) ∈ Bi} } = Cr{ ∩_{i=1}^{m} {ξi ∈ fi^{−1}(Bi)} } = min_{1≤i≤m} Cr{ξi ∈ fi^{−1}(Bi)} = min_{1≤i≤m} Cr{fi(ξi) ∈ Bi}.
Thus f1(ξ1), f2(ξ2), · · · , fm(ξm) are independent fuzzy variables.

Theorem 2.28 (Extension Principle of Zadeh) Let ξ1, ξ2, · · · , ξn be independent fuzzy variables with membership functions µ1, µ2, · · · , µn, respectively, and f : ℜ^n → ℜ a function. Then the membership function µ of ξ = f(ξ1, ξ2, · · · , ξn) is derived from the membership functions µ1, µ2, · · · , µn by

µ(x) = sup_{x=f(x1,x2,··· ,xn)} min_{1≤i≤n} µi(xi)    (2.49)

for any x ∈ ℜ. Here we set µ(x) = 0 if there are no real numbers x1, x2, · · · , xn such that x = f(x1, x2, · · · , xn).

Proof: It follows from Definition 2.10 that the membership function of ξ = f(ξ1, ξ2, · · · , ξn) is

µ(x) = ( 2Cr{f(ξ1, ξ2, · · · , ξn) = x} ) ∧ 1
  = ( 2Cr{ ∪_{x=f(x1,x2,··· ,xn)} {ξ1 = x1, ξ2 = x2, · · · , ξn = xn} } ) ∧ 1
  = ( 2 sup_{x=f(x1,x2,··· ,xn)} Cr{ξ1 = x1, ξ2 = x2, · · · , ξn = xn} ) ∧ 1
  = ( 2 sup_{x=f(x1,x2,··· ,xn)} min_{1≤i≤n} Cr{ξi = xi} ) ∧ 1    (by independence)
  = sup_{x=f(x1,x2,··· ,xn)} min_{1≤i≤n} (2Cr{ξi = xi}) ∧ 1
  = sup_{x=f(x1,x2,··· ,xn)} min_{1≤i≤n} µi(xi).

The theorem is proved.
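For discrete fuzzy variables, the sup–min in (2.49) reduces to a finite search. A brute-force sketch for the sum of two independent discrete variables (the helper and the sample memberships are ours):

```python
# Hedged sketch of Zadeh's extension principle (2.49) for f(x1, x2) = x1 + x2,
# applied to independent fuzzy variables with finite supports.
def extension_sum(mu1, mu2):
    mu = {}
    for x1, m1 in mu1.items():
        for x2, m2 in mu2.items():
            s = x1 + x2
            mu[s] = max(mu.get(s, 0.0), min(m1, m2))   # sup of min over pairs
    return mu

mu1 = {0: 0.5, 1: 1.0, 2: 0.5}      # a discrete triangular-like variable
mu2 = {0: 1.0, 1: 0.6}
print(extension_sum(mu1, mu2))      # {0: 0.5, 1: 1.0, 2: 0.6, 3: 0.5}
```

The resulting supremum is 1 (at x = 1), so by Theorem 2.15 the output is again a legitimate membership function.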
Remark 2.5: The extension principle of Zadeh is only applicable to operations on independent fuzzy variables. In the past literature, the extension principle was used as a postulate. However, it is treated as a theorem in credibility theory.

Example 2.24: The sum of independent equipossible fuzzy variables ξ = (a1, a2) and η = (b1, b2) is also an equipossible fuzzy variable, and

ξ + η = (a1 + b1, a2 + b2).

Their product is also an equipossible fuzzy variable, and

ξ · η = ( min_{a1≤x≤a2, b1≤y≤b2} xy, max_{a1≤x≤a2, b1≤y≤b2} xy ).

Example 2.25: The sum of independent triangular fuzzy variables ξ = (a1, a2, a3) and η = (b1, b2, b3) is also a triangular fuzzy variable, and

ξ + η = (a1 + b1, a2 + b2, a3 + b3).

The product of a triangular fuzzy variable ξ = (a1, a2, a3) and a scalar number λ is

λ · ξ = (λa1, λa2, λa3) if λ ≥ 0;  λ · ξ = (λa3, λa2, λa1) if λ < 0.

That is, the product of a triangular fuzzy variable and a scalar number is also a triangular fuzzy variable. However, the product of two triangular fuzzy variables is not a triangular one.

Example 2.26: The sum of independent trapezoidal fuzzy variables ξ = (a1, a2, a3, a4) and η = (b1, b2, b3, b4) is also a trapezoidal fuzzy variable, and

ξ + η = (a1 + b1, a2 + b2, a3 + b3, a4 + b4).

The product of a trapezoidal fuzzy variable ξ = (a1, a2, a3, a4) and a scalar number λ is

λ · ξ = (λa1, λa2, λa3, λa4) if λ ≥ 0;  λ · ξ = (λa4, λa3, λa2, λa1) if λ < 0.

That is, the product of a trapezoidal fuzzy variable and a scalar number is also a trapezoidal fuzzy variable. However, the product of two trapezoidal fuzzy variables is not a trapezoidal one.

Example 2.27: Let ξ1, ξ2, · · · , ξn be independent fuzzy variables with membership functions µ1, µ2, · · · , µn, respectively, and f : ℜ^n → ℜ a function.
2.6 Identical Distribution
Definition 2.18 (Liu [129]) The fuzzy variables ξ and η are said to be identically distributed if

Cr{ξ ∈ B} = Cr{η ∈ B}    (2.50)

for any set B of ℜ.

Theorem 2.29 The fuzzy variables ξ and η are identically distributed if and only if ξ and η have the same membership function.

Proof: Let µ and ν be the membership functions of ξ and η, respectively. If ξ and η are identically distributed fuzzy variables, then, for any x ∈ ℜ, we have

µ(x) = (2Cr{ξ = x}) ∧ 1 = (2Cr{η = x}) ∧ 1 = ν(x).

Thus ξ and η have the same membership function. Conversely, if ξ and η have the same membership function, i.e., µ(x) ≡ ν(x), then, by using the credibility inversion theorem, we have

Cr{ξ ∈ B} = (1/2) ( sup_{x∈B} µ(x) + 1 − sup_{x∈B^c} µ(x) ) = (1/2) ( sup_{x∈B} ν(x) + 1 − sup_{x∈B^c} ν(x) ) = Cr{η ∈ B}

for any set B of ℜ. Thus ξ and η are identically distributed fuzzy variables.

Theorem 2.30 The fuzzy variables ξ and η are identically distributed if and only if Cr{ξ = x} = Cr{η = x} for each x ∈ ℜ.

Proof: If ξ and η are identically distributed fuzzy variables, then we immediately have Cr{ξ = x} = Cr{η = x} for each x. Conversely, it follows from

µ(x) = (2Cr{ξ = x}) ∧ 1 = (2Cr{η = x}) ∧ 1 = ν(x)

that ξ and η have the same membership function. Thus ξ and η are identically distributed fuzzy variables.

Theorem 2.31 If ξ and η are identically distributed fuzzy variables, then ξ and η have the same credibility distribution.

Proof: If ξ and η are identically distributed fuzzy variables, then, for any x ∈ ℜ, we have Cr{ξ ∈ (−∞, x]} = Cr{η ∈ (−∞, x]}. Thus ξ and η have the same credibility distribution.

Example 2.28: The converse of Theorem 2.31 is not true. We consider two fuzzy variables with the following membership functions:

µ(x) = 1.0 if x = 0;  µ(x) = 0.6 if x = 1;  µ(x) = 0.8 if x = 2;
ν(x) = 1.0 if x = 0;  ν(x) = 0.7 if x = 1;  ν(x) = 0.8 if x = 2.
Chapter 2 - Credibility Theory
It is easy to verify that ξ and η have the same credibility distribution,

$$\Phi(x)=\begin{cases}0, & \text{if } x<0\\ 0.6, & \text{if } 0\le x<2\\ 1, & \text{if } x\ge 2.\end{cases}$$

However, they are not identically distributed fuzzy variables.

Theorem 2.32 Let ξ and η be two fuzzy variables whose credibility density functions exist. If ξ and η are identically distributed, then they have the same credibility density function.

Proof: It follows from Theorem 2.31 that the fuzzy variables ξ and η have the same credibility distribution. Hence they have the same credibility density function.
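Example 2.28 can be checked numerically via the credibility inversion theorem. The following is a small sketch (the helper name `cr` is ours, not the book's): the two membership functions yield the same credibility distribution at every point, yet assign different credibilities to the singleton {1}.

```python
# Credibility inversion: Cr{xi in B} = (sup_B mu + 1 - sup_{B^c} mu) / 2.
def cr(mu, B):
    in_B = max((m for x, m in mu.items() if x in B), default=0.0)
    out_B = max((m for x, m in mu.items() if x not in B), default=0.0)
    return 0.5 * (in_B + 1.0 - out_B)

mu = {0: 1.0, 1: 0.7, 2: 0.8}
nu = {0: 1.0, 1: 0.6, 2: 0.8}

# Same credibility distribution Phi(x) = Cr{xi <= x} at every point ...
for x in (0, 1, 2):
    B = {y for y in (0, 1, 2) if y <= x}
    assert cr(mu, B) == cr(nu, B)

# ... yet Cr{xi = 1} differs, so they are not identically distributed.
print(cr(mu, {1}), cr(nu, {1}))  # differs (about 0.35 vs 0.3)
```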
2.7 Expected Value
There are many ways to define an expected value operator for fuzzy variables. See, for example, Dubois and Prade [34], Heilpern [59], Campos and González [13], González [53] and Yager [225][236]. The most general definition of the expected value operator of a fuzzy variable was given by Liu and Liu [126]. This definition is applicable not only to continuous fuzzy variables but also to discrete ones.

Definition 2.19 (Liu and Liu [126]) Let ξ be a fuzzy variable. Then the expected value of ξ is defined by

$$E[\xi]=\int_0^{+\infty}\mathrm{Cr}\{\xi\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\xi\le r\}\,dr \qquad (2.51)$$
provided that at least one of the two integrals is finite.

Example 2.29: Let ξ be the equipossible fuzzy variable (a, b). If a ≥ 0, then Cr{ξ ≤ r} ≡ 0 when r < 0, and

$$\mathrm{Cr}\{\xi\ge r\}=\begin{cases}1, & \text{if } r\le a\\ 0.5, & \text{if } a<r\le b\\ 0, & \text{if } r>b,\end{cases}$$

$$E[\xi]=\left(\int_0^{a}1\,dr+\int_a^{b}0.5\,dr+\int_b^{+\infty}0\,dr\right)-\int_{-\infty}^{0}0\,dr=\frac{a+b}{2}.$$
If b ≤ 0, then Cr{ξ ≥ r} ≡ 0 when r > 0, and

$$\mathrm{Cr}\{\xi\le r\}=\begin{cases}1, & \text{if } r\ge b\\ 0.5, & \text{if } a\le r<b\\ 0, & \text{if } r<a,\end{cases}$$

$$E[\xi]=\int_0^{+\infty}0\,dr-\left(\int_{-\infty}^{a}0\,dr+\int_a^{b}0.5\,dr+\int_b^{0}1\,dr\right)=\frac{a+b}{2}.$$
If a < 0 < b, then

$$\mathrm{Cr}\{\xi\ge r\}=\begin{cases}0.5, & \text{if } 0\le r\le b\\ 0, & \text{if } r>b,\end{cases}\qquad \mathrm{Cr}\{\xi\le r\}=\begin{cases}0, & \text{if } r<a\\ 0.5, & \text{if } a\le r\le 0,\end{cases}$$

$$E[\xi]=\left(\int_0^{b}0.5\,dr+\int_b^{+\infty}0\,dr\right)-\left(\int_{-\infty}^{a}0\,dr+\int_a^{0}0.5\,dr\right)=\frac{a+b}{2}.$$

Thus we always have the expected value (a + b)/2.

Example 2.30: The triangular fuzzy variable ξ = (a, b, c) has an expected value E[ξ] = (a + 2b + c)/4.

Example 2.31: The trapezoidal fuzzy variable ξ = (a, b, c, d) has an expected value E[ξ] = (a + b + c + d)/4.

Example 2.32: Let ξ be a continuous nonnegative fuzzy variable with membership function µ. If µ is decreasing on [0, +∞), then Cr{ξ ≥ x} = µ(x)/2 for any x > 0, and

$$E[\xi]=\frac{1}{2}\int_0^{+\infty}\mu(x)\,dx.$$

Example 2.33: Let ξ be a continuous fuzzy variable with membership function µ. If its expected value exists, and there is a point x0 such that µ(x) is increasing on (−∞, x0) and decreasing on (x0, +∞), then

$$E[\xi]=x_0+\frac{1}{2}\int_{x_0}^{+\infty}\mu(x)\,dx-\frac{1}{2}\int_{-\infty}^{x_0}\mu(x)\,dx.$$

Example 2.34: Let ξ be a fuzzy variable with membership function

$$\mu(x)=\begin{cases}0, & \text{if } x<0\\ x, & \text{if } 0\le x\le 1\\ 1, & \text{if } x>1.\end{cases}$$
Then its expected value is +∞. If ξ is a fuzzy variable with membership function

$$\mu(x)=\begin{cases}1, & \text{if } x<0\\ 1-x, & \text{if } 0\le x\le 1\\ 0, & \text{if } x>1,\end{cases}$$

then its expected value is −∞.

Example 2.35: The expected value may not exist for some fuzzy variables. For example, the fuzzy variable ξ with membership function

$$\mu(x)=\frac{1}{1+|x|},\qquad \forall x\in\Re$$

does not have an expected value because both of the integrals

$$\int_0^{+\infty}\mathrm{Cr}\{\xi\ge r\}\,dr \quad\text{and}\quad \int_{-\infty}^{0}\mathrm{Cr}\{\xi\le r\}\,dr$$

are infinite.

Example 2.36: The definition of the expected value operator is also applicable to the discrete case. Assume that ξ is a simple fuzzy variable whose membership function is given by

$$\mu(x)=\begin{cases}\mu_1, & \text{if } x=x_1\\ \mu_2, & \text{if } x=x_2\\ \ \cdots\\ \mu_m, & \text{if } x=x_m\end{cases} \qquad (2.52)$$

where x1, x2, · · · , xm are distinct numbers. Note that µ1 ∨ µ2 ∨ · · · ∨ µm = 1. Definition 2.19 implies that the expected value of ξ is

$$E[\xi]=\sum_{i=1}^{m}w_i x_i \qquad (2.53)$$

where the weights are given by

$$w_i=\frac{1}{2}\left(\max_{1\le j\le m}\{\mu_j\mid x_j\le x_i\}-\max_{1\le j\le m}\{\mu_j\mid x_j<x_i\}+\max_{1\le j\le m}\{\mu_j\mid x_j\ge x_i\}-\max_{1\le j\le m}\{\mu_j\mid x_j>x_i\}\right)$$

for i = 1, 2, · · · , m. It is easy to verify that all wi ≥ 0 and the sum of all weights is just 1.

Example 2.37: Consider the fuzzy variable ξ defined by (2.52). Suppose x1 < x2 < · · · < xm. Then the expected value is determined by (2.53) and the weights are given by

$$w_i=\frac{1}{2}\left(\max_{1\le j\le i}\mu_j-\max_{1\le j<i}\mu_j+\max_{i\le j\le m}\mu_j-\max_{i<j\le m}\mu_j\right)$$
for i = 1, 2, · · · , m.

Example 2.38: Consider the fuzzy variable ξ defined by (2.52). Suppose x1 < x2 < · · · < xm and there exists an index k with 1 < k < m such that

$$\mu_1\le\mu_2\le\cdots\le\mu_k \quad\text{and}\quad \mu_k\ge\mu_{k+1}\ge\cdots\ge\mu_m.$$

Note that µk ≡ 1. Then the expected value is determined by (2.53) and the weights are given by

$$w_i=\begin{cases}\dfrac{\mu_1}{2}, & \text{if } i=1\\[1mm] \dfrac{\mu_i-\mu_{i-1}}{2}, & \text{if } i=2,3,\cdots,k-1\\[1mm] 1-\dfrac{\mu_{k-1}+\mu_{k+1}}{2}, & \text{if } i=k\\[1mm] \dfrac{\mu_i-\mu_{i+1}}{2}, & \text{if } i=k+1,k+2,\cdots,m-1\\[1mm] \dfrac{\mu_m}{2}, & \text{if } i=m.\end{cases}$$

Example 2.39: Consider the fuzzy variable ξ defined by (2.52). Suppose x1 < x2 < · · · < xm and µ1 ≤ µ2 ≤ · · · ≤ µm (µm ≡ 1). Then the expected value is determined by (2.53) and the weights are given by

$$w_i=\begin{cases}\dfrac{\mu_1}{2}, & \text{if } i=1\\[1mm] \dfrac{\mu_i-\mu_{i-1}}{2}, & \text{if } i=2,3,\cdots,m-1\\[1mm] 1-\dfrac{\mu_{m-1}}{2}, & \text{if } i=m.\end{cases}$$

Example 2.40: Consider the fuzzy variable ξ defined by (2.52). Suppose x1 < x2 < · · · < xm and µ1 ≥ µ2 ≥ · · · ≥ µm (µ1 ≡ 1). Then the expected value is determined by (2.53) and the weights are given by

$$w_i=\begin{cases}1-\dfrac{\mu_2}{2}, & \text{if } i=1\\[1mm] \dfrac{\mu_i-\mu_{i+1}}{2}, & \text{if } i=2,3,\cdots,m-1\\[1mm] \dfrac{\mu_m}{2}, & \text{if } i=m.\end{cases}$$

Theorem 2.33 (Liu [124]) Let ξ be a fuzzy variable whose credibility density function φ exists. If the Lebesgue integral

$$\int_{-\infty}^{+\infty}x\phi(x)\,dx$$
is finite, then we have

$$E[\xi]=\int_{-\infty}^{+\infty}x\phi(x)\,dx. \qquad (2.54)$$

Proof: It follows from the definition of the expected value operator and the Fubini theorem that

$$\begin{aligned}E[\xi]&=\int_0^{+\infty}\mathrm{Cr}\{\xi\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\xi\le r\}\,dr\\ &=\int_0^{+\infty}\left[\int_r^{+\infty}\phi(x)\,dx\right]dr-\int_{-\infty}^{0}\left[\int_{-\infty}^{r}\phi(x)\,dx\right]dr\\ &=\int_0^{+\infty}\left[\int_0^{x}\phi(x)\,dr\right]dx-\int_{-\infty}^{0}\left[\int_x^{0}\phi(x)\,dr\right]dx\\ &=\int_0^{+\infty}x\phi(x)\,dx+\int_{-\infty}^{0}x\phi(x)\,dx\\ &=\int_{-\infty}^{+\infty}x\phi(x)\,dx.\end{aligned}$$

The theorem is proved.

Example 2.41: Let ξ be a fuzzy variable with credibility distribution Φ. Generally speaking,

$$E[\xi]\ne\int_{-\infty}^{+\infty}x\,d\Phi(x).$$

For example, let ξ be a fuzzy variable with membership function

$$\mu(x)=\begin{cases}0, & \text{if } x<0\\ x, & \text{if } 0\le x\le 1\\ 1, & \text{if } x>1.\end{cases}$$

Then E[ξ] = +∞. However,

$$\int_{-\infty}^{+\infty}x\,d\Phi(x)=\frac{1}{4}\ne+\infty.$$
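The weight formula (2.53) of Example 2.36 can be cross-checked numerically. The following sketch (helper names such as `weights` are ours, not the book's) computes the weights for a small simple fuzzy variable and reproduces the closed-form weights of Example 2.38 in the unimodal case:

```python
def weights(xs, mus):
    """Weights w_i of Example 2.36 for a simple fuzzy variable."""
    m = len(xs)
    def mx(pred):
        vals = [mus[j] for j in range(m) if pred(xs[j])]
        return max(vals) if vals else 0.0
    w = []
    for i in range(m):
        xi = xs[i]
        w.append(0.5 * (mx(lambda x: x <= xi) - mx(lambda x: x < xi)
                        + mx(lambda x: x >= xi) - mx(lambda x: x > xi)))
    return w

xs  = [1.0, 2.0, 3.0]
mus = [0.5, 1.0, 0.5]       # unimodal, with maximum membership 1 as required
w = weights(xs, mus)
print(w, sum(w))            # nonnegative weights summing to 1
print(sum(wi * xi for wi, xi in zip(w, xs)))   # E[xi] = 2.0
# Example 2.38 closed form (k = 2) gives the same weights:
# w1 = mu1/2 = 0.25, w2 = 1 - (mu1 + mu3)/2 = 0.5, w3 = mu3/2 = 0.25
```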
Theorem 2.34 (Liu [129]) Let ξ be a fuzzy variable with credibility distribution Φ. If

$$\lim_{x\to-\infty}\Phi(x)=0,\qquad \lim_{x\to+\infty}\Phi(x)=1$$

and the Lebesgue-Stieltjes integral

$$\int_{-\infty}^{+\infty}x\,d\Phi(x)$$
is finite, then we have

$$E[\xi]=\int_{-\infty}^{+\infty}x\,d\Phi(x). \qquad (2.55)$$

Proof: Since the Lebesgue-Stieltjes integral $\int_{-\infty}^{+\infty}x\,d\Phi(x)$ is finite, we immediately have

$$\lim_{y\to+\infty}\int_0^{y}x\,d\Phi(x)=\int_0^{+\infty}x\,d\Phi(x),\qquad \lim_{y\to-\infty}\int_y^{0}x\,d\Phi(x)=\int_{-\infty}^{0}x\,d\Phi(x)$$

and

$$\lim_{y\to+\infty}\int_y^{+\infty}x\,d\Phi(x)=0,\qquad \lim_{y\to-\infty}\int_{-\infty}^{y}x\,d\Phi(x)=0.$$

It follows from

$$\int_y^{+\infty}x\,d\Phi(x)\ge y\Big(\lim_{z\to+\infty}\Phi(z)-\Phi(y)\Big)=y(1-\Phi(y))\ge 0 \quad\text{for } y>0,$$

$$\int_{-\infty}^{y}x\,d\Phi(x)\le y\Big(\Phi(y)-\lim_{z\to-\infty}\Phi(z)\Big)=y\Phi(y)\le 0 \quad\text{for } y<0$$

that

$$\lim_{y\to+\infty}y(1-\Phi(y))=0,\qquad \lim_{y\to-\infty}y\Phi(y)=0.$$

Let 0 = x0 < x1 < x2 < · · · < xn = y be a partition of [0, y]. Then we have

$$\sum_{i=0}^{n-1}x_i\big(\Phi(x_{i+1})-\Phi(x_i)\big)\to\int_0^{y}x\,d\Phi(x)$$

and

$$\sum_{i=0}^{n-1}\big(1-\Phi(x_{i+1})\big)(x_{i+1}-x_i)\to\int_0^{y}\mathrm{Cr}\{\xi\ge r\}\,dr$$

as max{|xi+1 − xi| : i = 0, 1, · · · , n − 1} → 0. Since

$$\sum_{i=0}^{n-1}x_i\big(\Phi(x_{i+1})-\Phi(x_i)\big)-\sum_{i=0}^{n-1}\big(1-\Phi(x_{i+1})\big)(x_{i+1}-x_i)=y(\Phi(y)-1)\to 0$$

as y → +∞, this fact implies that

$$\int_0^{+\infty}\mathrm{Cr}\{\xi\ge r\}\,dr=\int_0^{+\infty}x\,d\Phi(x).$$

A similar way may prove that

$$-\int_{-\infty}^{0}\mathrm{Cr}\{\xi\le r\}\,dr=\int_{-\infty}^{0}x\,d\Phi(x).$$

It follows that the equation (2.55) holds.
Linearity of Expected Value Operator

Theorem 2.35 (Liu and Liu [141]) Let ξ and η be independent fuzzy variables with finite expected values. Then for any numbers a and b, we have

$$E[a\xi+b\eta]=aE[\xi]+bE[\eta]. \qquad (2.56)$$

Proof: Step 1: We first prove that E[ξ + b] = E[ξ] + b for any real number b. If b ≥ 0, we have

$$\begin{aligned}E[\xi+b]&=\int_0^{\infty}\mathrm{Cr}\{\xi+b\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\xi+b\le r\}\,dr\\ &=\int_0^{\infty}\mathrm{Cr}\{\xi\ge r-b\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\xi\le r-b\}\,dr\\ &=E[\xi]+\int_0^{b}\big(\mathrm{Cr}\{\xi\ge r-b\}+\mathrm{Cr}\{\xi<r-b\}\big)\,dr\\ &=E[\xi]+b.\end{aligned}$$

If b < 0, then we have

$$E[\xi+b]=E[\xi]-\int_b^{0}\big(\mathrm{Cr}\{\xi\ge r-b\}+\mathrm{Cr}\{\xi<r-b\}\big)\,dr=E[\xi]+b.$$
Step 2: We prove that E[aξ] = aE[ξ] for any real number a. If a = 0, then the equation E[aξ] = aE[ξ] holds trivially. If a > 0, we have

$$\begin{aligned}E[a\xi]&=\int_0^{\infty}\mathrm{Cr}\{a\xi\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{a\xi\le r\}\,dr\\ &=\int_0^{\infty}\mathrm{Cr}\Big\{\xi\ge\frac{r}{a}\Big\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\Big\{\xi\le\frac{r}{a}\Big\}\,dr\\ &=a\int_0^{\infty}\mathrm{Cr}\Big\{\xi\ge\frac{r}{a}\Big\}\,d\Big(\frac{r}{a}\Big)-a\int_{-\infty}^{0}\mathrm{Cr}\Big\{\xi\le\frac{r}{a}\Big\}\,d\Big(\frac{r}{a}\Big)=aE[\xi].\end{aligned}$$

If a < 0, we have

$$\begin{aligned}E[a\xi]&=\int_0^{\infty}\mathrm{Cr}\{a\xi\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{a\xi\le r\}\,dr\\ &=\int_0^{\infty}\mathrm{Cr}\Big\{\xi\le\frac{r}{a}\Big\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\Big\{\xi\ge\frac{r}{a}\Big\}\,dr\\ &=a\int_0^{\infty}\mathrm{Cr}\Big\{\xi\ge\frac{r}{a}\Big\}\,d\Big(\frac{r}{a}\Big)-a\int_{-\infty}^{0}\mathrm{Cr}\Big\{\xi\le\frac{r}{a}\Big\}\,d\Big(\frac{r}{a}\Big)=aE[\xi].\end{aligned}$$
Step 3: We prove that E[ξ + η] = E[ξ] + E[η] when both ξ and η are simple fuzzy variables with the following membership functions,

$$\mu(x)=\begin{cases}\mu_1, & \text{if } x=a_1\\ \mu_2, & \text{if } x=a_2\\ \ \cdots\\ \mu_m, & \text{if } x=a_m,\end{cases}\qquad \nu(x)=\begin{cases}\nu_1, & \text{if } x=b_1\\ \nu_2, & \text{if } x=b_2\\ \ \cdots\\ \nu_n, & \text{if } x=b_n.\end{cases}$$

Then ξ + η is also a simple fuzzy variable taking values ai + bj with membership degrees µi ∧ νj, i = 1, 2, · · · , m, j = 1, 2, · · · , n, respectively. Now we define

$$w_i'=\frac{1}{2}\left(\max_{1\le k\le m}\{\mu_k\mid a_k\le a_i\}-\max_{1\le k\le m}\{\mu_k\mid a_k<a_i\}+\max_{1\le k\le m}\{\mu_k\mid a_k\ge a_i\}-\max_{1\le k\le m}\{\mu_k\mid a_k>a_i\}\right),$$

$$w_j''=\frac{1}{2}\left(\max_{1\le l\le n}\{\nu_l\mid b_l\le b_j\}-\max_{1\le l\le n}\{\nu_l\mid b_l<b_j\}+\max_{1\le l\le n}\{\nu_l\mid b_l\ge b_j\}-\max_{1\le l\le n}\{\nu_l\mid b_l>b_j\}\right),$$

$$\begin{aligned}w_{ij}=\frac{1}{2}\Big(&\max_{1\le k\le m,\,1\le l\le n}\{\mu_k\wedge\nu_l\mid a_k+b_l\le a_i+b_j\}-\max_{1\le k\le m,\,1\le l\le n}\{\mu_k\wedge\nu_l\mid a_k+b_l<a_i+b_j\}\\ +&\max_{1\le k\le m,\,1\le l\le n}\{\mu_k\wedge\nu_l\mid a_k+b_l\ge a_i+b_j\}-\max_{1\le k\le m,\,1\le l\le n}\{\mu_k\wedge\nu_l\mid a_k+b_l>a_i+b_j\}\Big)\end{aligned}$$

for i = 1, 2, · · · , m and j = 1, 2, · · · , n. It is also easy to verify that

$$w_i'=\sum_{j=1}^{n}w_{ij},\qquad w_j''=\sum_{i=1}^{m}w_{ij}$$

for i = 1, 2, · · · , m and j = 1, 2, · · · , n. If {ai}, {bj} and {ai + bj} are sequences consisting of distinct elements, then

$$E[\xi]=\sum_{i=1}^{m}a_iw_i',\qquad E[\eta]=\sum_{j=1}^{n}b_jw_j'',\qquad E[\xi+\eta]=\sum_{i=1}^{m}\sum_{j=1}^{n}(a_i+b_j)w_{ij}.$$

Thus E[ξ + η] = E[ξ] + E[η]. If not, we may give them a small perturbation such that they are distinct, and prove the linearity by letting the perturbation tend to zero.
Step 4: We prove that E[ξ + η] = E[ξ] + E[η] when ξ and η are fuzzy variables such that

$$\lim_{y\uparrow 0}\mathrm{Cr}\{\xi\le y\}\le\frac{1}{2}\le\mathrm{Cr}\{\xi\le 0\},\qquad \lim_{y\uparrow 0}\mathrm{Cr}\{\eta\le y\}\le\frac{1}{2}\le\mathrm{Cr}\{\eta\le 0\}. \qquad (2.57)$$

We define simple fuzzy variables ξi via credibility distributions as follows,

$$\Phi_i(x)=\begin{cases}\dfrac{k-1}{2^i}, & \text{if } \dfrac{k-1}{2^i}\le\mathrm{Cr}\{\xi\le x\}<\dfrac{k}{2^i},\ k=1,2,\cdots,2^{i-1}\\[1mm] \dfrac{k}{2^i}, & \text{if } \dfrac{k-1}{2^i}\le\mathrm{Cr}\{\xi\le x\}<\dfrac{k}{2^i},\ k=2^{i-1}+1,\cdots,2^i\\[1mm] 1, & \text{if } \mathrm{Cr}\{\xi\le x\}=1\end{cases}$$

for i = 1, 2, · · · Thus {ξi} is a sequence of simple fuzzy variables satisfying

$$\mathrm{Cr}\{\xi_i\le r\}\uparrow\mathrm{Cr}\{\xi\le r\},\ \text{if } r\le 0;\qquad \mathrm{Cr}\{\xi_i\ge r\}\uparrow\mathrm{Cr}\{\xi\ge r\},\ \text{if } r\ge 0$$

as i → ∞. Similarly, we define simple fuzzy variables ηi via credibility distributions as follows,

$$\Psi_i(x)=\begin{cases}\dfrac{k-1}{2^i}, & \text{if } \dfrac{k-1}{2^i}\le\mathrm{Cr}\{\eta\le x\}<\dfrac{k}{2^i},\ k=1,2,\cdots,2^{i-1}\\[1mm] \dfrac{k}{2^i}, & \text{if } \dfrac{k-1}{2^i}\le\mathrm{Cr}\{\eta\le x\}<\dfrac{k}{2^i},\ k=2^{i-1}+1,\cdots,2^i\\[1mm] 1, & \text{if } \mathrm{Cr}\{\eta\le x\}=1\end{cases}$$

for i = 1, 2, · · · Thus {ηi} is a sequence of simple fuzzy variables satisfying

$$\mathrm{Cr}\{\eta_i\le r\}\uparrow\mathrm{Cr}\{\eta\le r\},\ \text{if } r\le 0;\qquad \mathrm{Cr}\{\eta_i\ge r\}\uparrow\mathrm{Cr}\{\eta\ge r\},\ \text{if } r\ge 0$$

as i → ∞. It is also clear that {ξi + ηi} is a sequence of simple fuzzy variables. Furthermore, when r ≤ 0, it follows from (2.57) that

$$\begin{aligned}\lim_{i\to\infty}\mathrm{Cr}\{\xi_i+\eta_i\le r\}&=\lim_{i\to\infty}\sup_{x\le 0,\,y\le 0,\,x+y\le r}\mathrm{Cr}\{\xi_i\le x\}\wedge\mathrm{Cr}\{\eta_i\le y\}\\ &=\sup_{x\le 0,\,y\le 0,\,x+y\le r}\lim_{i\to\infty}\mathrm{Cr}\{\xi_i\le x\}\wedge\mathrm{Cr}\{\eta_i\le y\}\\ &=\sup_{x\le 0,\,y\le 0,\,x+y\le r}\mathrm{Cr}\{\xi\le x\}\wedge\mathrm{Cr}\{\eta\le y\}\\ &=\mathrm{Cr}\{\xi+\eta\le r\}.\end{aligned}$$
That is, Cr{ξi + ηi ≤ r} ↑ Cr{ξ + η ≤ r} if r ≤ 0. A similar way may prove that Cr{ξi + ηi ≥ r} ↑ Cr{ξ + η ≥ r} if r ≥ 0. Since the expected values E[ξ] and E[η] exist, we have

$$E[\xi_i]=\int_0^{+\infty}\mathrm{Cr}\{\xi_i\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\xi_i\le r\}\,dr\to\int_0^{+\infty}\mathrm{Cr}\{\xi\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\xi\le r\}\,dr=E[\xi],$$

$$E[\eta_i]=\int_0^{+\infty}\mathrm{Cr}\{\eta_i\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\eta_i\le r\}\,dr\to\int_0^{+\infty}\mathrm{Cr}\{\eta\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\eta\le r\}\,dr=E[\eta],$$

$$E[\xi_i+\eta_i]=\int_0^{+\infty}\mathrm{Cr}\{\xi_i+\eta_i\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\xi_i+\eta_i\le r\}\,dr\to\int_0^{+\infty}\mathrm{Cr}\{\xi+\eta\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\xi+\eta\le r\}\,dr=E[\xi+\eta]$$

as i → ∞. It follows from Step 3 that E[ξ + η] = E[ξ] + E[η].

Step 5: We prove that E[ξ + η] = E[ξ] + E[η] when ξ and η are arbitrary fuzzy variables. Since they have finite expected values, there exist two numbers c and d such that

$$\lim_{y\uparrow 0}\mathrm{Cr}\{\xi+c\le y\}\le\frac{1}{2}\le\mathrm{Cr}\{\xi+c\le 0\},\qquad \lim_{y\uparrow 0}\mathrm{Cr}\{\eta+d\le y\}\le\frac{1}{2}\le\mathrm{Cr}\{\eta+d\le 0\}.$$

It follows from Steps 1 and 4 that

$$\begin{aligned}E[\xi+\eta]&=E[(\xi+c)+(\eta+d)-c-d]\\ &=E[(\xi+c)+(\eta+d)]-c-d\\ &=E[\xi+c]+E[\eta+d]-c-d\\ &=E[\xi]+c+E[\eta]+d-c-d\\ &=E[\xi]+E[\eta].\end{aligned}$$
Step 6: We prove that E[aξ + bη] = aE[ξ] + bE[η] for any real numbers a and b. In fact, the equation follows immediately from Steps 2 and 5. The theorem is proved.

Example 2.42: Theorem 2.35 does not hold if ξ and η are not independent. For example, take (Θ, P, Cr) to be {θ1, θ2, θ3} with Cr{θ1} = 0.7, Cr{θ2} = 0.3 and Cr{θ3} = 0.2. The fuzzy variables are defined by

$$\xi_1(\theta)=\begin{cases}1, & \text{if }\theta=\theta_1\\ 0, & \text{if }\theta=\theta_2\\ 2, & \text{if }\theta=\theta_3,\end{cases}\qquad \xi_2(\theta)=\begin{cases}0, & \text{if }\theta=\theta_1\\ 2, & \text{if }\theta=\theta_2\\ 3, & \text{if }\theta=\theta_3.\end{cases}$$

Then we have

$$(\xi_1+\xi_2)(\theta)=\begin{cases}1, & \text{if }\theta=\theta_1\\ 2, & \text{if }\theta=\theta_2\\ 5, & \text{if }\theta=\theta_3.\end{cases}$$

Thus E[ξ1] = 0.9, E[ξ2] = 0.8, and E[ξ1 + ξ2] = 1.9. This fact implies that E[ξ1 + ξ2] > E[ξ1] + E[ξ2]. If the fuzzy variables are defined by

$$\eta_1(\theta)=\begin{cases}0, & \text{if }\theta=\theta_1\\ 1, & \text{if }\theta=\theta_2\\ 2, & \text{if }\theta=\theta_3,\end{cases}\qquad \eta_2(\theta)=\begin{cases}0, & \text{if }\theta=\theta_1\\ 3, & \text{if }\theta=\theta_2\\ 1, & \text{if }\theta=\theta_3,\end{cases}$$

then we have

$$(\eta_1+\eta_2)(\theta)=\begin{cases}0, & \text{if }\theta=\theta_1\\ 4, & \text{if }\theta=\theta_2\\ 3, & \text{if }\theta=\theta_3.\end{cases}$$
Thus E[η1] = 0.5, E[η2] = 0.9, and E[η1 + η2] = 1.2. This fact implies that E[η1 + η2] < E[η1] + E[η2].

Expected Value of Function of Fuzzy Variable

Let ξ be a fuzzy variable, and f : ℜ → ℜ a function. Then the expected value of f(ξ) is

$$E[f(\xi)]=\int_0^{+\infty}\mathrm{Cr}\{f(\xi)\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{f(\xi)\le r\}\,dr.$$

For the random case, it has been proved that the expected value E[f(ξ)] is the Lebesgue-Stieltjes integral of f(x) with respect to the probability distribution
Φ of ξ if the integral exists. However, generally speaking, it is not true for the fuzzy case.

Example 2.43: We consider a fuzzy variable ξ whose membership function is given by

$$\mu(x)=\begin{cases}0.6, & \text{if } -1\le x<0\\ 1, & \text{if } 0\le x\le 1\\ 0, & \text{otherwise.}\end{cases}$$

Then the expected value E[ξ²] = 0.5. However, the credibility distribution of ξ is

$$\Phi(x)=\begin{cases}0, & \text{if } x<-1\\ 0.3, & \text{if } -1\le x<0\\ 0.5, & \text{if } 0\le x<1\\ 1, & \text{if } x\ge 1\end{cases}$$

and the Lebesgue-Stieltjes integral

$$\int_{-\infty}^{+\infty}x^2\,d\Phi(x)=(-1)^2\times 0.3+0^2\times 0.2+1^2\times 0.5=0.8\ne E[\xi^2].$$
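Example 2.43 can be verified numerically: computing E[ξ²] from the credibility integral gives 0.5, while the Stieltjes sum against Φ gives 0.8. This is a sketch with our own helper names, using a grid approximation of the credibility inversion theorem.

```python
def mu(x):
    # membership function of Example 2.43
    if -1 <= x < 0:
        return 0.6
    if 0 <= x <= 1:
        return 1.0
    return 0.0

xs = [i / 1000 for i in range(-1000, 1001)]   # grid on [-1, 1]

def cr_sq_geq(r):
    # Cr{xi^2 >= r} = (sup_{x^2>=r} mu + 1 - sup_{x^2<r} mu) / 2
    above = max((mu(x) for x in xs if x * x >= r), default=0.0)
    below = max((mu(x) for x in xs if x * x < r), default=0.0)
    return 0.5 * (above + 1.0 - below)

# E[xi^2] = \int_0^1 Cr{xi^2 >= r} dr, since xi^2 <= 1 here
n = 500
E_sq = sum(cr_sq_geq((i + 0.5) / n) for i in range(n)) / n
print(E_sq)   # 0.5

# Stieltjes sum: Phi jumps by 0.3 at -1, by 0.2 at 0, by 0.5 at 1
stieltjes = (-1) ** 2 * 0.3 + 0 ** 2 * 0.2 + 1 ** 2 * 0.5
print(stieltjes)   # 0.8, which differs from E[xi^2]
```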
Remark 2.6: When f(x) is a monotone and continuous function, Zhu and Ji [263] proved that

$$E[f(\xi)]=\int_{-\infty}^{+\infty}f(x)\,d\Phi(x) \qquad (2.58)$$
where Φ is the credibility distribution of ξ.

Sum of a Fuzzy Number of Fuzzy Variables

Theorem 2.36 (Zhao and Liu [251]) Assume that {ξi} is a sequence of iid fuzzy variables, and ñ is a positive fuzzy integer (i.e., a fuzzy variable taking "positive integer" values) that is independent of the sequence {ξi}. Then we have

$$E\left[\sum_{i=1}^{\tilde n}\xi_i\right]=E[\tilde n\xi_1]. \qquad (2.59)$$

Proof: Since {ξi} is a sequence of iid fuzzy variables and ñ is independent of {ξi}, we have

$$\begin{aligned}\mathrm{Cr}\left\{\sum_{i=1}^{\tilde n}\xi_i\ge r\right\}&=\sup_{n,\,x_1+x_2+\cdots+x_n\ge r}\mathrm{Cr}\{\tilde n=n\}\wedge\min_{1\le i\le n}\mathrm{Cr}\{\xi_i=x_i\}\\ &\ge\sup_{nx\ge r}\mathrm{Cr}\{\tilde n=n\}\wedge\min_{1\le i\le n}\mathrm{Cr}\{\xi_i=x\}\\ &=\sup_{nx\ge r}\mathrm{Cr}\{\tilde n=n\}\wedge\mathrm{Cr}\{\xi_1=x\}\\ &=\mathrm{Cr}\{\tilde n\xi_1\ge r\}.\end{aligned}$$
On the other hand, for any given ε > 0, there exist an integer n and real numbers x1, x2, · · · , xn with x1 + x2 + · · · + xn ≥ r such that

$$\mathrm{Cr}\left\{\sum_{i=1}^{\tilde n}\xi_i\ge r\right\}-\varepsilon\le\mathrm{Cr}\{\tilde n=n\}\wedge\mathrm{Cr}\{\xi_i=x_i\}$$

for each i with 1 ≤ i ≤ n. Without loss of generality, we assume that nx1 ≥ r. Then we have

$$\mathrm{Cr}\left\{\sum_{i=1}^{\tilde n}\xi_i\ge r\right\}-\varepsilon\le\mathrm{Cr}\{\tilde n=n\}\wedge\mathrm{Cr}\{\xi_1=x_1\}\le\mathrm{Cr}\{\tilde n\xi_1\ge r\}.$$

Letting ε → 0, we get

$$\mathrm{Cr}\left\{\sum_{i=1}^{\tilde n}\xi_i\ge r\right\}\le\mathrm{Cr}\{\tilde n\xi_1\ge r\}.$$

It follows that

$$\mathrm{Cr}\left\{\sum_{i=1}^{\tilde n}\xi_i\ge r\right\}=\mathrm{Cr}\{\tilde n\xi_1\ge r\}.$$

Similarly, the above identity still holds if the symbol "≥" is replaced with "≤". Finally, by the definition of the expected value operator, we have

$$E\left[\sum_{i=1}^{\tilde n}\xi_i\right]=\int_0^{+\infty}\mathrm{Cr}\left\{\sum_{i=1}^{\tilde n}\xi_i\ge r\right\}dr-\int_{-\infty}^{0}\mathrm{Cr}\left\{\sum_{i=1}^{\tilde n}\xi_i\le r\right\}dr=\int_0^{+\infty}\mathrm{Cr}\{\tilde n\xi_1\ge r\}\,dr-\int_{-\infty}^{0}\mathrm{Cr}\{\tilde n\xi_1\le r\}\,dr=E[\tilde n\xi_1].$$

The theorem is proved.
2.8 Variance
Definition 2.20 (Liu and Liu [126]) Let ξ be a fuzzy variable with finite expected value e. Then the variance of ξ is defined by V[ξ] = E[(ξ − e)²].

The variance of a fuzzy variable provides a measure of the spread of the distribution around its expected value.

Example 2.44: Let ξ be an equipossible fuzzy variable (a, b). Then its expected value is e = (a + b)/2, and for any positive number r, we have

$$\mathrm{Cr}\{(\xi-e)^2\ge r\}=\begin{cases}1/2, & \text{if } r\le(b-a)^2/4\\ 0, & \text{if } r>(b-a)^2/4.\end{cases}$$
Thus the variance is

$$V[\xi]=\int_0^{+\infty}\mathrm{Cr}\{(\xi-e)^2\ge r\}\,dr=\int_0^{(b-a)^2/4}\frac{1}{2}\,dr=\frac{(b-a)^2}{8}.$$
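Example 2.44 can be checked by discretizing the defining integral. The following is a sketch with our own helper names; the step function Cr{(ξ − e)² ≥ r} is taken directly from the example above.

```python
def cr_sq_geq(r, a, b):
    # Cr{(xi - e)^2 >= r} for the equipossible variable on (a, b), per Example 2.44
    return 0.5 if r <= (b - a) ** 2 / 4 else 0.0

def variance(a, b, n=100000):
    # V[xi] = \int_0^infty Cr{(xi - e)^2 >= r} dr, truncated where it vanishes
    hi = (b - a) ** 2
    h = hi / n
    return sum(cr_sq_geq((i + 0.5) * h, a, b) for i in range(n)) * h

print(variance(1.0, 5.0), (5.0 - 1.0) ** 2 / 8)   # both about 2.0
```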
Example 2.45: Let ξ = (a, b, c) be a symmetric triangular fuzzy variable, i.e., b − a = c − b. Then its variance is V[ξ] = (c − a)²/24.

Example 2.46: Let ξ = (a, b, c, d) be a symmetric trapezoidal fuzzy variable, i.e., b − a = d − c. Then its variance is V[ξ] = ((d − a)² + (d − a)(c − b) + (c − b)²)/24.

Example 2.47: A fuzzy variable ξ is called normally distributed if it has a normal membership function

$$\mu(x)=2\left(1+\exp\left(\frac{\pi|x-e|}{\sqrt{6}\,\sigma}\right)\right)^{-1},\qquad x\in\Re,\ \sigma>0. \qquad (2.60)$$

The expected value is e and the variance is σ². Let ξ1 and ξ2 be independently and normally distributed fuzzy variables with expected values e1 and e2 and variances σ1² and σ2², respectively. Then for any real numbers a1 and a2, the fuzzy variable a1ξ1 + a2ξ2 is also normally distributed with expected value a1e1 + a2e2 and variance (|a1|σ1 + |a2|σ2)².
Figure 2.3: Normal Membership Function
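The value 0.434 marked on the axis of Figure 2.3 is µ(e ± σ) for the normal membership function (2.60). A quick numerical check (the function name is ours):

```python
import math

def mu_normal(x, e, sigma):
    # normal membership function (2.60)
    return 2.0 / (1.0 + math.exp(math.pi * abs(x - e) / (math.sqrt(6.0) * sigma)))

print(mu_normal(0.0, 0.0, 1.0))   # 1.0 at x = e
print(mu_normal(1.0, 0.0, 1.0))   # about 0.434 at x = e + sigma
```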
Theorem 2.37 If ξ is a fuzzy variable whose variance exists, and a and b are real numbers, then V[aξ + b] = a²V[ξ].

Proof: It follows from the definition of variance that

$$V[a\xi+b]=E\big[(a\xi+b-aE[\xi]-b)^2\big]=a^2E[(\xi-E[\xi])^2]=a^2V[\xi].$$

Theorem 2.38 Let ξ be a fuzzy variable with expected value e. Then V[ξ] = 0 if and only if Cr{ξ = e} = 1.
Proof: If V[ξ] = 0, then E[(ξ − e)²] = 0. Note that

$$E[(\xi-e)^2]=\int_0^{+\infty}\mathrm{Cr}\{(\xi-e)^2\ge r\}\,dr$$

which implies Cr{(ξ − e)² ≥ r} = 0 for any r > 0. Hence we have Cr{(ξ − e)² = 0} = 1, i.e., Cr{ξ = e} = 1. Conversely, if Cr{ξ = e} = 1, then we have Cr{(ξ − e)² = 0} = 1 and Cr{(ξ − e)² ≥ r} = 0 for any r > 0. Thus

$$V[\xi]=\int_0^{+\infty}\mathrm{Cr}\{(\xi-e)^2\ge r\}\,dr=0.$$
Maximum Variance Theorem

Let ξ be a fuzzy variable that takes values in [a, b], but whose membership function is otherwise arbitrary. When its expected value is given, the maximum variance theorem will provide the maximum variance of ξ, thus playing an important role in treating games against nature.

Theorem 2.39 (Li and Liu [104]) Let f be a convex function on [a, b], and ξ a fuzzy variable that takes values in [a, b] and has expected value e. Then

$$E[f(\xi)]\le\frac{b-e}{b-a}f(a)+\frac{e-a}{b-a}f(b). \qquad (2.61)$$

Proof: For each θ ∈ Θ, we have a ≤ ξ(θ) ≤ b and

$$\xi(\theta)=\frac{b-\xi(\theta)}{b-a}\,a+\frac{\xi(\theta)-a}{b-a}\,b.$$

It follows from the convexity of f that

$$f(\xi(\theta))\le\frac{b-\xi(\theta)}{b-a}f(a)+\frac{\xi(\theta)-a}{b-a}f(b).$$

Taking expected values on both sides, we obtain (2.61).

Theorem 2.40 (Li and Liu [104], Maximum Variance Theorem) Let ξ be a fuzzy variable that takes values in [a, b] and has expected value e. Then

$$V[\xi]\le(e-a)(b-e) \qquad (2.62)$$

and equality holds if the fuzzy variable ξ has membership function

$$\mu(x)=\begin{cases}\dfrac{2(b-e)}{b-a}\wedge 1, & \text{if } x=a\\[1mm] \dfrac{2(e-a)}{b-a}\wedge 1, & \text{if } x=b.\end{cases} \qquad (2.63)$$

Proof: It follows from Theorem 2.39 immediately by defining f(x) = (x − e)². It is also easy to verify that the fuzzy variable determined by (2.63) has variance (e − a)(b − e). The theorem is proved.
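A numerical sketch confirming that the extremal two-point fuzzy variable (2.63) attains the bound (2.62). The helper names are ours; the expected value is computed via the weight formula of Example 2.36, which applies since the variable is simple.

```python
def weights(mu):
    # Example 2.36 weight formula for a finite fuzzy variable {value: membership}
    xs = sorted(mu)
    def mx(pred):
        vals = [mu[x] for x in xs if pred(x)]
        return max(vals) if vals else 0.0
    return {xi: 0.5 * (mx(lambda x: x <= xi) - mx(lambda x: x < xi)
                       + mx(lambda x: x >= xi) - mx(lambda x: x > xi))
            for xi in xs}

def expected(mu):
    return sum(w * x for x, w in weights(mu).items())

def variance(mu):
    e = expected(mu)
    sq = {}
    for x, m in mu.items():          # membership of (xi - e)^2
        y = (x - e) ** 2
        sq[y] = max(sq.get(y, 0.0), m)
    return expected(sq)

a, b, e = 0.0, 10.0, 3.0
mu = {a: min(2 * (b - e) / (b - a), 1.0),   # extremal membership (2.63)
      b: min(2 * (e - a) / (b - a), 1.0)}
print(expected(mu))   # about e = 3.0
print(variance(mu))   # about (e - a)(b - e) = 21.0
```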
2.9 Moments
Definition 2.21 (Liu [128]) Let ξ be a fuzzy variable, and k a positive number. Then (a) the expected value E[ξᵏ] is called the kth moment; (b) the expected value E[|ξ|ᵏ] is called the kth absolute moment; (c) the expected value E[(ξ − E[ξ])ᵏ] is called the kth central moment; (d) the expected value E[|ξ − E[ξ]|ᵏ] is called the kth absolute central moment.

Note that the first central moment is always 0, the first moment is just the expected value, and the second central moment is just the variance.

Example 2.48: A fuzzy variable ξ is called exponentially distributed if it has an exponential membership function

$$\mu(x)=2\left(1+\exp\left(\frac{\pi x}{\sqrt{6}\,m}\right)\right)^{-1},\qquad x\ge 0,\ m>0. \qquad (2.64)$$

The expected value is (√6 m ln 2)/π and the second moment is m². Let ξ1 and ξ2 be independently and exponentially distributed fuzzy variables with second moments m1² and m2², respectively. Then for any positive real numbers a1 and a2, the fuzzy variable a1ξ1 + a2ξ2 is also exponentially distributed with second moment (a1m1 + a2m2)².
Figure 2.4: Exponential Membership Function

Theorem 2.41 Let ξ be a nonnegative fuzzy variable, and k a positive number. Then the kth moment is

$$E[\xi^k]=k\int_0^{+\infty}r^{k-1}\,\mathrm{Cr}\{\xi\ge r\}\,dr. \qquad (2.65)$$

Proof: It follows from the nonnegativity of ξ that

$$E[\xi^k]=\int_0^{\infty}\mathrm{Cr}\{\xi^k\ge x\}\,dx=\int_0^{\infty}\mathrm{Cr}\{\xi\ge r\}\,dr^k=k\int_0^{\infty}r^{k-1}\,\mathrm{Cr}\{\xi\ge r\}\,dr.$$

The theorem is proved.
Theorem 2.42 (Li and Liu [104]) Let ξ be a fuzzy variable that takes values in [a, b] and has expected value e. Then for any positive integer k, the kth absolute moment and kth absolute central moment satisfy the following inequalities,

$$E[|\xi|^k]\le\frac{b-e}{b-a}|a|^k+\frac{e-a}{b-a}|b|^k, \qquad (2.66)$$

$$E[|\xi-e|^k]\le\frac{b-e}{b-a}(e-a)^k+\frac{e-a}{b-a}(b-e)^k. \qquad (2.67)$$

Proof: It follows from Theorem 2.39 immediately by defining f(x) = |x|ᵏ and f(x) = |x − e|ᵏ.
2.10 Critical Values
In order to rank fuzzy variables, we may use two critical values: the optimistic value and the pessimistic value.

Definition 2.22 (Liu [124]) Let ξ be a fuzzy variable, and α ∈ (0, 1]. Then

$$\xi_{\sup}(\alpha)=\sup\big\{r\mid \mathrm{Cr}\{\xi\ge r\}\ge\alpha\big\} \qquad (2.68)$$

is called the α-optimistic value to ξ, and

$$\xi_{\inf}(\alpha)=\inf\big\{r\mid \mathrm{Cr}\{\xi\le r\}\ge\alpha\big\} \qquad (2.69)$$

is called the α-pessimistic value to ξ.

This means that the fuzzy variable ξ will reach upwards of the α-optimistic value ξsup(α) with credibility α, and will be below the α-pessimistic value ξinf(α) with credibility α. In other words, the α-optimistic value ξsup(α) is the supremum value that ξ achieves with credibility α, and the α-pessimistic value ξinf(α) is the infimum value that ξ achieves with credibility α.

Example 2.49: Let ξ be an equipossible fuzzy variable on (a, b). Then its α-optimistic and α-pessimistic values are

$$\xi_{\sup}(\alpha)=\begin{cases}b, & \text{if }\alpha\le 0.5\\ a, & \text{if }\alpha>0.5,\end{cases}\qquad \xi_{\inf}(\alpha)=\begin{cases}a, & \text{if }\alpha\le 0.5\\ b, & \text{if }\alpha>0.5.\end{cases}$$

Example 2.50: Let ξ = (a, b, c) be a triangular fuzzy variable. Then its α-optimistic and α-pessimistic values are

$$\xi_{\sup}(\alpha)=\begin{cases}2\alpha b+(1-2\alpha)c, & \text{if }\alpha\le 0.5\\ (2\alpha-1)a+(2-2\alpha)b, & \text{if }\alpha>0.5,\end{cases}$$
$$\xi_{\inf}(\alpha)=\begin{cases}(1-2\alpha)a+2\alpha b, & \text{if }\alpha\le 0.5\\ (2-2\alpha)b+(2\alpha-1)c, & \text{if }\alpha>0.5.\end{cases}$$
Example 2.51: Let ξ = (a, b, c, d) be a trapezoidal fuzzy variable. Then its α-optimistic and α-pessimistic values are ( 2αc + (1 − 2α)d, if α ≤ 0.5 ξsup (α) = (2α − 1)a + (2 − 2α)b, if α > 0.5, ( ξinf (α) =
(1 − 2α)a + 2αb, (2 − 2α)c + (2α − 1)d,
if α ≤ 0.5 if α > 0.5.
Theorem 2.43 Let ξ be a fuzzy variable. If α > 0.5, then we have Cr{ξ ≤ ξinf (α)} ≥ α,
Cr{ξ ≥ ξsup (α)} ≥ α.
(2.70)
Proof: It follows from the definition of α-pessimistic value that there exists a decreasing sequence {xi } such that Cr{ξ ≤ xi } ≥ α and xi ↓ ξinf (α) as i → ∞. Since {ξ ≤ xi } ↓ {ξ ≤ ξinf (α)} and limi→∞ Cr{ξ ≤ xi } ≥ α > 0.5, it follows from the credibility semicontinuity law that Cr{ξ ≤ ξinf (α)} = lim Cr{ξ ≤ xi } ≥ α. i→∞
Similarly, there exists an increasing sequence {xi } such that Cr{ξ ≥ xi } ≥ δ and xi ↑ ξsup (α) as i → ∞. Since {ξ ≥ xi } ↓ {ξ ≥ ξsup (α)} and limi→∞ Cr{ξ ≥ xi } ≥ α > 0.5, it follows from the credibility semicontinuity law that Cr{ξ ≥ ξsup (α)} = lim Cr{ξ ≥ xi } ≥ α. i→∞
The theorem is proved. Example 2.52: When α ≤ 0.5, it is possible that the inequalities Cr{ξ ≤ ξinf (α)} < α,
Cr{ξ ≥ ξsup (α)} < α
hold. Let ξ be an equipossible fuzzy variable on (−1, 1). It is clear that ξinf (0.5) = −1. However, Cr{ξ ≤ ξinf (0.5)} = 0 < 0.5. In addition, ξsup (0.5) = 1 and Cr{ξ ≥ ξsup (0.5)} = 0 < 0.5. Theorem 2.44 Let ξ be a fuzzy variable. Then we have (a) ξinf (α) is an increasing and left-continuous function of α; (b) ξsup (α) is a decreasing and left-continuous function of α. Proof: (a) It is easy to prove that ξinf (α) is an increasing function of α. Next, we prove the left-continuity of ξinf (α) with respect to α. Let {αi } be an arbitrary sequence of positive numbers such that αi ↑ α. Then {ξinf (αi )}
98
Chapter 2 - Credibility Theory
is an increasing sequence. If the limitation is equal to ξinf (α), then the leftcontinuity is proved. Otherwise, there exists a number z ∗ such that lim ξinf (αi ) < z ∗ < ξinf (α).
i→∞
Thus Cr{ξ ≤ z ∗ } ≥ αi for each i. Letting i → ∞, we get Cr{ξ ≤ z ∗ } ≥ α. Hence z ∗ ≥ ξinf (α). A contradiction proves the left-continuity of ξinf (α) with respect to α. The part (b) may be proved similarly. Theorem 2.45 Let ξ be a fuzzy variable. Then we have (a) if α > 0.5, then ξinf (α) ≥ ξsup (α); (b) if α ≤ 0.5, then ξinf (α) ≤ ξsup (α). Proof: Part (a): Write ξ(α) = (ξinf (α) + ξsup (α))/2. If ξinf (α) < ξsup (α), then we have 1 ≥ Cr{ξ < ξ(α)} + Cr{ξ > ξ(α)} ≥ α + α > 1. A contradiction proves ξinf (α) ≥ ξsup (α). Part (b): Assume that ξinf (α) > ξsup (α). It follows from the definition of ξinf (α) that Cr{ξ ≤ ξ(α)} < α. Similarly, it follows from the definition of ξsup (α) that Cr{ξ ≥ ξ(α)} < α. Thus 1 ≤ Cr{ξ ≤ ξ(α)} + Cr{ξ ≥ ξ(α)} < α + α ≤ 1. A contradiction proves ξinf (α) ≤ ξsup (α). The theorem is proved. Theorem 2.46 Let ξ be a fuzzy variable. Then we have (a) if c ≥ 0, then (cξ)sup (α) = cξsup (α) and (cξ)inf (α) = cξinf (α); (b) if c < 0, then (cξ)sup (α) = cξinf (α) and (cξ)inf (α) = cξsup (α). Proof: If c = 0, then the part (a) is obviously valid. When c > 0, we have (cξ)sup (α) = sup {r | Cr{cξ ≥ r} ≥ α} = c sup {r/c | Cr {ξ ≥ r/c} ≥ α} = cξsup (α). A similar way may prove that (cξ)inf (α) = cξinf (α). In order to prove the part (b), it suffices to verify that (−ξ)sup (α) = −ξinf (α) and (−ξ)inf (α) = −ξsup (α). In fact, for any α ∈ (0, 1], we have (−ξ)sup (α) = sup{r | Cr{−ξ ≥ r} ≥ α} = − inf{−r | Cr{ξ ≤ −r} ≥ α} = −ξinf (α). Similarly, we may prove that (−ξ)inf (α) = −ξsup (α). The theorem is proved.
99
Section 2.11 - Entropy
Theorem 2.47 Suppose that ξ and η are independent fuzzy variables. Then for any α ∈ (0, 1], we have (ξ + η)sup (α) = ξsup (α) + ηsup (α), (ξ + η)inf (α) = ξinf (α) + ηinf (α), (ξη)sup (α) = ξsup (α)ηsup (α), (ξη)inf (α) = ξinf (α)ηinf (α), if ξ ≥ 0, η ≥ 0, (ξ ∨ η)sup (α) = ξsup (α) ∨ ηsup (α), (ξ ∨ η)inf (α) = ξinf (α) ∨ ηinf (α), (ξ ∧ η)sup (α) = ξsup (α) ∧ ηsup (α), (ξ ∧ η)inf (α) = ξinf (α) ∧ ηinf (α). Proof: For any given number ε > 0, since ξ and η are independent fuzzy variables, we have Cr{ξ + η ≥ ξsup (α) + ηsup (α) − ε} ≥ Cr {{ξ ≥ ξsup (α) − ε/2} ∩ {η ≥ ηsup (α) − ε/2}} = Cr{ξ ≥ ξsup (α) − ε/2} ∧ Cr{η ≥ ηsup (α) − ε/2} ≥ α which implies (ξ + η)sup (α) ≥ ξsup (α) + ηsup (α) − ε.
(2.71)
On the other hand, by the independence, we have Cr{ξ + η ≥ ξsup (α) + ηsup (α) + ε} ≤ Cr {{ξ ≥ ξsup (α) + ε/2} ∪ {η ≥ ηsup (α) + ε/2}} = Cr{ξ ≥ ξsup (α) + ε/2} ∨ Cr{η ≥ ηsup (α) + ε/2} < α which implies (ξ + η)sup (α) ≤ ξsup (α) + ηsup (α) + ε.
(2.72)
It follows from (2.71) and (2.72) that ξsup (α) + ηsup (α) + ε ≥ (ξ + η)sup (α) ≥ ξsup (α) + ηsup (α) − ε. Letting ε → 0, we obtain (ξ + η)sup (α) = ξsup (α) + ηsup (α). The other equalities may be proved similarly. Example 2.53: The independence condition cannot be removed in Theorem 2.47. For example, let Θ = {θ1 , θ2 }, Cr{θ1 } = Cr{θ2 } = 1/2, and let fuzzy variables ξ and η be defined as ( ( 0, if θ = θ1 1, if θ = θ1 ξ(θ) = η(θ) = 1, if θ = θ2 , 0, if θ = θ2 . However, (ξ + η)sup (0.6) = 1 6= 0 = ξsup (0.6) + ηsup (0.6).
100
2.11
Chapter 2 - Credibility Theory
Entropy
Fuzzy entropy is a measure of uncertainty and has been studied by many researchers such as De Luca and Termini [26], Kaufmann [75], Yager [223], Kosko [85], Pal and Pal [177], Bhandari and Pal [7], and Pal and Bezdek [181]. Those definitions of entropy characterize the uncertainty resulting primarily from the linguistic vagueness rather than resulting from information deficiency, and vanishes when the fuzzy variable is an equipossible one. In order to measure the uncertainty of fuzzy variables, Liu [131] suggested that an entropy of fuzzy variables should meet at least the following three basic requirements: (i) minimum: the entropy of a crisp number is minimum, i.e., 0; (ii) maximum: the entropy of an equipossible fuzzy variable is maximum; (iii) universality: the entropy is applicable not only to finite and infinite cases but also to discrete and continuous cases. In order to meet those requirements, Li and Liu [95] provided a new definition of fuzzy entropy to characterize the uncertainty resulting from information deficiency which is caused by the impossibility to predict the specified value that a fuzzy variable takes. Entropy of Discrete Fuzzy Variables Definition 2.23 (Li and Liu [95]) Let ξ be a discrete fuzzy variable taking values in {x1 , x2 , · · · }. Then its entropy is defined by H[ξ] =
∞ X
S(Cr{ξ = xi })
(2.73)
i=1
where S(t) = −t ln t − (1 − t) ln(1 − t). Remark 2.7: It is easy to verify that S(t) is a symmetric function about t = 0.5, strictly increases on the interval [0, 0.5], strictly decreases on the interval [0.5, 1], and reaches its unique maximum ln 2 at t = 0.5. Remark 2.8: It is clear that the entropy depends only on the number of values and their credibilities and does not depend on the actual values that the fuzzy variable takes. Example 2.54: Suppose that ξ is a discrete fuzzy variable taking values in {x1 , x2 , · · · }. If there exists some index k such that the membership function µ(xk ) = 1, and 0 otherwise, then its entropy H[ξ] = 0. Example 2.55: Suppose that ξ is a simple fuzzy variable taking values in {x1 , x2 , · · · , xn }. If its membership function µ(x) ≡ 1, then its entropy H[ξ] = n ln 2.
101
Section 2.11 - Entropy
S(t) ... .......... ... .. ... . . . . . . . . . . . . . . . ....................... ........ . ..... ... ...... ...... . ... ..... ..... . ..... ..... ... . ..... .... . . . ... . .... ... . . . .... ... . . ... .... ... . . ... . . . . ... ... . ... ... ... . . ... .. ... . . ... .. . ... . ... . .. ... . ... . ... .... ... . . ... ... ... . ... ... ... . ... ... ... . ... . ... ... ... . ...... . ... . ...... ... . ..... .................................................................................................................................................................................... .... .. .
ln 2
0
0.5
1
t
Figure 2.5: Function S(t) = −t ln t − (1 − t) ln(1 − t)

Theorem 2.48 Suppose that ξ is a discrete fuzzy variable taking values in {x1, x2, · · · }. Then
    H[ξ] ≥ 0    (2.74)
and equality holds if and only if ξ is essentially a crisp number.

Proof: The nonnegativity is clear. In addition, H[ξ] = 0 if and only if Cr{ξ = xi} = 0 or 1 for each i. That is, there exists one and only one index k such that Cr{ξ = xk} = 1, i.e., ξ is essentially a crisp number.

This theorem states that the entropy of a fuzzy variable reaches its minimum 0 when the fuzzy variable degenerates to a crisp number. In this case, there is no uncertainty.

Theorem 2.49 Suppose that ξ is a simple fuzzy variable taking values in {x1, x2, · · · , xn}. Then
    H[ξ] ≤ n ln 2    (2.75)
and equality holds if and only if ξ is an equipossible fuzzy variable.

Proof: Since the function S(t) reaches its maximum ln 2 at t = 0.5, we have
    H[ξ] = Σ_{i=1}^{n} S(Cr{ξ = xi}) ≤ n ln 2
and equality holds if and only if Cr{ξ = xi} = 0.5, i.e., µ(xi) ≡ 1 for all i = 1, 2, · · · , n.

This theorem states that the entropy of a fuzzy variable reaches its maximum when the fuzzy variable is an equipossible one. In this case, there is no preference among all the values that the fuzzy variable will take.
Chapter 2 - Credibility Theory
Entropy of Continuous Fuzzy Variables

Definition 2.24 (Li and Liu [95]) Let ξ be a continuous fuzzy variable. Then its entropy is defined by
    H[ξ] = ∫_{−∞}^{+∞} S(Cr{ξ = x})dx    (2.76)
where S(t) = −t ln t − (1 − t) ln(1 − t).

For any continuous fuzzy variable ξ with membership function µ, we have Cr{ξ = x} = µ(x)/2 for each x ∈ ℜ. Thus
    H[ξ] = −∫_{−∞}^{+∞} [ (µ(x)/2) ln(µ(x)/2) + (1 − µ(x)/2) ln(1 − µ(x)/2) ] dx.    (2.77)

Example 2.56: Let ξ be an equipossible fuzzy variable (a, b). Then µ(x) = 1 if a ≤ x ≤ b, and 0 otherwise. Thus its entropy is
    H[ξ] = −∫_a^b [ (1/2) ln(1/2) + (1 − 1/2) ln(1 − 1/2) ] dx = (b − a) ln 2.

Example 2.57: Let ξ be a triangular fuzzy variable (a, b, c). Then its entropy is H[ξ] = (c − a)/2.

Example 2.58: Let ξ be a trapezoidal fuzzy variable (a, b, c, d). Then its entropy is H[ξ] = (d − a)/2 + (ln 2 − 0.5)(c − b).

Example 2.59: Let ξ be an exponentially distributed fuzzy variable with second moment m². Then its entropy is H[ξ] = πm/√6.

Example 2.60: Let ξ be a normally distributed fuzzy variable with expected value e and variance σ². Then its entropy is H[ξ] = √6πσ/3.

Theorem 2.50 Let ξ be a continuous fuzzy variable. Then H[ξ] > 0.

Proof: The positivity is clear. In addition, when a continuous fuzzy variable tends to a crisp number, its entropy tends to the minimum 0. However, a crisp number is not a continuous fuzzy variable.

Theorem 2.51 Let ξ be a continuous fuzzy variable taking values on the interval [a, b]. Then
    H[ξ] ≤ (b − a) ln 2    (2.78)
and equality holds if and only if ξ is an equipossible fuzzy variable (a, b).

Proof: The theorem follows from the fact that the function S(t) reaches its maximum ln 2 at t = 0.5.
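The closed forms in Examples 2.56 and 2.57 can be spot-checked by numerically integrating S(µ(x)/2). The sketch below (a midpoint-rule approximation, our own code) verifies the triangular case (0, 1, 2), whose entropy should be (c − a)/2 = 1, and the equipossible case on [0, 2], whose entropy should be 2 ln 2.

```python
import math

def S(t):
    # S(t) = -t ln t - (1 - t) ln(1 - t), with S(0) = S(1) = 0 by continuity
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log(t) - (1 - t) * math.log(1 - t)

def entropy(mu, lo, hi, n=100_000):
    """Midpoint-rule approximation of H[xi] = integral of S(mu(x)/2) dx."""
    h = (hi - lo) / n
    return h * sum(S(mu(lo + (k + 0.5) * h) / 2.0) for k in range(n))

tri = lambda x: x if x <= 1.0 else 2.0 - x   # triangular fuzzy variable (0, 1, 2)
box = lambda x: 1.0                          # equipossible fuzzy variable (0, 2)
```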
Theorem 2.52 Let ξ and η be two continuous fuzzy variables with membership functions µ(x) and ν(x), respectively. If µ(x) ≤ ν(x) for any x ∈ ℜ, then we have H[ξ] ≤ H[η].

Proof: Since µ(x) ≤ ν(x), we have S(µ(x)/2) ≤ S(ν(x)/2) for any x ∈ ℜ. It follows that H[ξ] ≤ H[η].

Theorem 2.53 Let ξ be a continuous fuzzy variable. Then for any real numbers a and b, we have H[aξ + b] = |a|H[ξ].

Proof: It follows from the definition of entropy that
    H[aξ + b] = ∫_{−∞}^{+∞} S(Cr{aξ + b = x})dx = |a| ∫_{−∞}^{+∞} S(Cr{ξ = y})dy = |a|H[ξ].
Maximum Entropy Principle

Given some constraints, for example, expected value and variance, there are usually multiple compatible membership functions. Which membership function shall we take? The maximum entropy principle attempts to select the membership function that maximizes the value of entropy and satisfies the prescribed constraints.

Theorem 2.25 (Li and Liu [102]) Let ξ be a continuous nonnegative fuzzy variable with finite second moment m². Then
    H[ξ] ≤ πm/√6    (2.79)
and the equality holds if ξ is an exponentially distributed fuzzy variable with second moment m².

Proof: Let µ be the membership function of ξ. Note that µ is a continuous function. The proof is based on the following two steps.

Step 1: Suppose that µ is a decreasing function on [0, +∞). For this case, we have Cr{ξ ≥ x} = µ(x)/2 for any x > 0. Thus the second moment is
    E[ξ²] = ∫_0^{+∞} Cr{ξ² ≥ x}dx = ∫_0^{+∞} 2xCr{ξ ≥ x}dx = ∫_0^{+∞} xµ(x)dx.
The maximum entropy membership function µ should maximize the entropy
    −∫_0^{+∞} [ (µ(x)/2) ln(µ(x)/2) + (1 − µ(x)/2) ln(1 − µ(x)/2) ] dx
subject to the moment constraint
    ∫_0^{+∞} xµ(x)dx = m².
The Lagrangian is
    L = −∫_0^{+∞} [ (µ(x)/2) ln(µ(x)/2) + (1 − µ(x)/2) ln(1 − µ(x)/2) ] dx − λ ( ∫_0^{+∞} xµ(x)dx − m² ).
The maximum entropy membership function meets the Euler-Lagrange equation
    (1/2) ln(µ(x)/2) − (1/2) ln(1 − µ(x)/2) + λx = 0
and has the form µ(x) = 2(1 + exp(2λx))⁻¹. Substituting it into the moment constraint, we get
    µ*(x) = 2(1 + exp(πx/(√6 m)))⁻¹,  x ≥ 0
which is just the exponential membership function with second moment m², and the maximum entropy is H[ξ*] = πm/√6.

Step 2: Let ξ be a general fuzzy variable with second moment m². Now we define a fuzzy variable ξ̂ via the membership function
    µ̂(x) = sup_{y≥x} µ(y),  x ≥ 0.
Then µ̂ is a decreasing function on [0, +∞), and
    Cr{ξ̂² ≥ x} = (1/2) sup_{y≥√x} µ̂(y) = (1/2) sup_{y≥√x} sup_{z≥y} µ(z) = (1/2) sup_{z≥√x} µ(z) ≤ Cr{ξ² ≥ x}
for any x > 0. Thus we have
    E[ξ̂²] = ∫_0^{+∞} Cr{ξ̂² ≥ x}dx ≤ ∫_0^{+∞} Cr{ξ² ≥ x}dx = E[ξ²] = m².
It follows from µ(x) ≤ µ̂(x) and Step 1 that
    H[ξ] ≤ H[ξ̂] ≤ π√(E[ξ̂²])/√6 ≤ πm/√6.
The theorem is thus proved.

Theorem 2.26 (Li and Liu [102]) Let ξ be a continuous fuzzy variable with finite expected value e and variance σ². Then
    H[ξ] ≤ √6πσ/3    (2.80)
and the equality holds if ξ is a normally distributed fuzzy variable with expected value e and variance σ².
Proof: Let µ be the continuous membership function of ξ. The proof is based on the following two steps.

Step 1: Let µ(x) be a unimodal and symmetric function about x = e. For this case, the variance is
    V[ξ] = ∫_0^{+∞} Cr{(ξ − e)² ≥ x}dx = ∫_0^{+∞} Cr{ξ − e ≥ √x}dx
         = ∫_e^{+∞} 2(x − e)Cr{ξ ≥ x}dx = ∫_e^{+∞} (x − e)µ(x)dx
and the entropy is
    H[ξ] = −2 ∫_e^{+∞} [ (µ(x)/2) ln(µ(x)/2) + (1 − µ(x)/2) ln(1 − µ(x)/2) ] dx.
The maximum entropy membership function µ should maximize the entropy subject to the variance constraint. The Lagrangian is
    L = −2 ∫_e^{+∞} [ (µ(x)/2) ln(µ(x)/2) + (1 − µ(x)/2) ln(1 − µ(x)/2) ] dx − λ ( ∫_e^{+∞} (x − e)µ(x)dx − σ² ).
The maximum entropy membership function meets the Euler-Lagrange equation
    ln(µ(x)/2) − ln(1 − µ(x)/2) + λ(x − e) = 0
and has the form µ(x) = 2(1 + exp(λ(x − e)))⁻¹. Substituting it into the variance constraint, we get
    µ*(x) = 2(1 + exp(π|x − e|/(√6 σ)))⁻¹,  x ∈ ℜ
which is just the normal membership function with expected value e and variance σ², and the maximum entropy is H[ξ*] = √6πσ/3.

Step 2: Let ξ be a general fuzzy variable with expected value e and variance σ². We define a fuzzy variable ξ̂ by the membership function
    µ̂(x) = sup_{y≤x} (µ(y) ∨ µ(2e − y)),  if x ≤ e;
    µ̂(x) = sup_{y≥x} (µ(y) ∨ µ(2e − y)),  if x > e.
It is easy to verify that µ̂(x) is a unimodal and symmetric function about x = e. Furthermore,
    Cr{(ξ̂ − e)² ≥ r} = (1/2) sup_{x≥e+√r} µ̂(x) = (1/2) sup_{x≥e+√r} sup_{y≥x} (µ(y) ∨ µ(2e − y))
                     = (1/2) sup_{y≥e+√r} (µ(y) ∨ µ(2e − y)) = (1/2) sup_{(y−e)²≥r} µ(y) ≤ Cr{(ξ − e)² ≥ r}
for any r > 0. Thus
    V[ξ̂] = ∫_0^{+∞} Cr{(ξ̂ − e)² ≥ r}dr ≤ ∫_0^{+∞} Cr{(ξ − e)² ≥ r}dr = σ².
It follows from µ(x) ≤ µ̂(x) and Step 1 that
    H[ξ] ≤ H[ξ̂] ≤ √6π√(V[ξ̂])/3 ≤ √6πσ/3.
The proof is complete.
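As a numeric check of the two maximum entropy results (our own sketch, not from the book), the code below integrates S(µ*(x)/2) for the exponential and normal maximizing membership functions with m = σ = 1 and compares against the bounds πm/√6 and √6πσ/3.

```python
import math

def S(t):
    # S(t) = -t ln t - (1 - t) ln(1 - t), with S(0) = 0 by continuity
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log(t) - (1 - t) * math.log(1 - t)

def entropy(mu, lo, hi, n=100_000):
    # midpoint-rule approximation of H = integral of S(mu(x)/2) dx
    h = (hi - lo) / n
    return h * sum(S(mu(lo + (k + 0.5) * h) / 2.0) for k in range(n))

# Exponential membership with second moment m^2 = 1 (maximizer in Theorem 2.25)
m = 1.0
exp_mu = lambda x: 2.0 / (1.0 + math.exp(math.pi * x / (math.sqrt(6) * m)))

# Normal membership with e = 0 and variance sigma^2 = 1 (maximizer in Theorem 2.26)
sigma = 1.0
norm_mu = lambda x: 2.0 / (1.0 + math.exp(math.pi * abs(x) / (math.sqrt(6) * sigma)))
```

The integrals are truncated at |x| = 60, where the integrand is negligible.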
2.12 Distance

Distance between fuzzy variables has been defined in many ways, for example, Hausdorff distance (Puri and Ralescu [192], Klement et al. [81]) and Hamming distance (Kacprzyk [72]). However, those definitions have no identification property. In order to overcome this shortcoming, Liu [129] proposed the following definition of distance.

Definition 2.27 (Liu [129]) The distance between fuzzy variables ξ and η is defined as
    d(ξ, η) = E[|ξ − η|].    (2.81)

Example 2.61: Let ξ and η be equipossible fuzzy variables (a1, b1) and (a2, b2), respectively, and (a1, b1) ∩ (a2, b2) = ∅. Then |ξ − η| is an equipossible fuzzy variable on the interval with endpoints |a1 − b2| and |b1 − a2|. Thus the distance between ξ and η is the expected value of |ξ − η|, i.e.,
    d(ξ, η) = (|a1 − b2| + |b1 − a2|)/2.

Example 2.62: Let ξ = (a1, b1, c1) and η = (a2, b2, c2) be triangular fuzzy variables such that (a1, c1) ∩ (a2, c2) = ∅. Then
    d(ξ, η) = (|a1 − c2| + 2|b1 − b2| + |c1 − a2|)/4.
Example 2.63: Let ξ = (a1, b1, c1, d1) and η = (a2, b2, c2, d2) be trapezoidal fuzzy variables such that (a1, d1) ∩ (a2, d2) = ∅. Then
    d(ξ, η) = (|a1 − d2| + |b1 − c2| + |c1 − b2| + |d1 − a2|)/4.
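On a finite credibility space, the distance d(ξ, η) = E[|ξ − η|] of Definition 2.27 can be computed directly from Cr{A} = (sup_{θ∈A} µ(θ) + 1 − sup_{θ∉A} µ(θ))/2. The following sketch (our own helper names) does so for a three-point space with µ ≡ 1, so that Cr{θi} = 1/2 for each point.

```python
def cr(mu, A):
    """Credibility of an index set A: (max_A mu + 1 - max_{A^c} mu) / 2."""
    inside = max((mu[i] for i in A), default=0.0)
    outside = max((mu[i] for i in range(len(mu)) if i not in A), default=0.0)
    return 0.5 * (inside + 1.0 - outside)

def expected_nonneg(mu, values):
    """E[X] = integral of Cr{X >= r} dr for a nonnegative discrete X."""
    e, prev = 0.0, 0.0
    for v in sorted(set(values)):
        if v > 0:
            A = {i for i, x in enumerate(values) if x >= v}
            e += (v - prev) * cr(mu, A)
            prev = v
    return e

def distance(mu, xi, eta):
    """d(xi, eta) = E[|xi - eta|], Definition 2.27."""
    return expected_nonneg(mu, [abs(x - y) for x, y in zip(xi, eta)])

mu = [1.0, 1.0, 1.0]            # three points, each with credibility 1/2
xi = [1.0, 1.0, 0.0]
eta = [0.0, -1.0, -1.0]
tau = [0.0, 0.0, 0.0]
```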
Theorem 2.54 (Li and Liu [105]) Let ξ, η, τ be fuzzy variables, and let d(·, ·) be the distance. Then we have
(a) (Nonnegativity) d(ξ, η) ≥ 0;
(b) (Identification) d(ξ, η) = 0 if and only if ξ = η;
(c) (Symmetry) d(ξ, η) = d(η, ξ);
(d) (Triangle Inequality) d(ξ, η) ≤ 2d(ξ, τ) + 2d(η, τ).

Proof: The parts (a), (b) and (c) follow immediately from the definition. Now we prove the part (d). It follows from the credibility subadditivity theorem that
    d(ξ, η) = ∫_0^{+∞} Cr{|ξ − η| ≥ r}dr
            ≤ ∫_0^{+∞} Cr{|ξ − τ| + |τ − η| ≥ r}dr
            ≤ ∫_0^{+∞} Cr{{|ξ − τ| ≥ r/2} ∪ {|τ − η| ≥ r/2}}dr
            ≤ ∫_0^{+∞} (Cr{|ξ − τ| ≥ r/2} + Cr{|τ − η| ≥ r/2})dr
            = ∫_0^{+∞} Cr{|ξ − τ| ≥ r/2}dr + ∫_0^{+∞} Cr{|τ − η| ≥ r/2}dr
            = 2E[|ξ − τ|] + 2E[|τ − η|] = 2d(ξ, τ) + 2d(τ, η).

Example 2.64: Let Θ = {θ1, θ2, θ3} and Cr{θi} = 1/2 for i = 1, 2, 3. We define fuzzy variables ξ, η and τ as follows,
    ξ(θ) = 1 if θ ≠ θ3, and 0 otherwise;
    η(θ) = −1 if θ ≠ θ1, and 0 otherwise;
    τ(θ) ≡ 0.
It is easy to verify that d(ξ, τ) = d(τ, η) = 1/2 and d(ξ, η) = 3/2. Thus
    d(ξ, η) = (3/2)(d(ξ, τ) + d(τ, η)).

2.13 Inequalities
There are several useful inequalities for random variables, such as the Markov inequality, Chebyshev inequality, Hölder's inequality, Minkowski inequality, and Jensen's inequality. This section introduces the analogous inequalities for fuzzy variables.

Theorem 2.55 (Liu [128]) Let ξ be a fuzzy variable, and f a nonnegative function. If f is even and increasing on [0, ∞), then for any given number t > 0, we have
    Cr{|ξ| ≥ t} ≤ E[f(ξ)]/f(t).    (2.82)

Proof: It is clear that Cr{|ξ| ≥ f⁻¹(r)} is a monotone decreasing function of r on [0, ∞). It follows from the nonnegativity of f(ξ) that
    E[f(ξ)] = ∫_0^{+∞} Cr{f(ξ) ≥ r}dr
            = ∫_0^{+∞} Cr{|ξ| ≥ f⁻¹(r)}dr
            ≥ ∫_0^{f(t)} Cr{|ξ| ≥ f⁻¹(r)}dr
            ≥ ∫_0^{f(t)} dr · Cr{|ξ| ≥ f⁻¹(f(t))}
            = f(t) · Cr{|ξ| ≥ t}
which proves the inequality.

Theorem 2.56 (Liu [128], Markov Inequality) Let ξ be a fuzzy variable. Then for any given numbers t > 0 and p > 0, we have
    Cr{|ξ| ≥ t} ≤ E[|ξ|^p]/t^p.    (2.83)

Proof: It is a special case of Theorem 2.55 when f(x) = |x|^p.

Theorem 2.57 (Liu [128], Chebyshev Inequality) Let ξ be a fuzzy variable whose variance V[ξ] exists. Then for any given number t > 0, we have
    Cr{|ξ − E[ξ]| ≥ t} ≤ V[ξ]/t².    (2.84)

Proof: It is a special case of Theorem 2.55 when the fuzzy variable ξ is replaced with ξ − E[ξ], and f(x) = x².

Theorem 2.58 (Liu [128], Hölder's Inequality) Let p and q be two positive real numbers with 1/p + 1/q = 1, and let ξ and η be independent fuzzy variables with E[|ξ|^p] < ∞ and E[|η|^q] < ∞. Then we have
    E[|ξη|] ≤ (E[|ξ|^p])^{1/p} (E[|η|^q])^{1/q}.    (2.85)
Proof: The inequality holds trivially if at least one of ξ and η is zero a.s. Now we assume E[|ξ|^p] > 0 and E[|η|^q] > 0. It is easy to prove that the function f(x, y) = x^{1/p} y^{1/q} is a concave function on D = {(x, y) : x ≥ 0, y ≥ 0}. Thus for any point (x0, y0) with x0 > 0 and y0 > 0, there exist two real numbers a and b such that
    f(x, y) − f(x0, y0) ≤ a(x − x0) + b(y − y0),  ∀(x, y) ∈ D.
Letting x0 = E[|ξ|^p], y0 = E[|η|^q], x = |ξ|^p and y = |η|^q, we have
    f(|ξ|^p, |η|^q) − f(E[|ξ|^p], E[|η|^q]) ≤ a(|ξ|^p − E[|ξ|^p]) + b(|η|^q − E[|η|^q]).
Taking the expected values on both sides, we obtain
    E[f(|ξ|^p, |η|^q)] ≤ f(E[|ξ|^p], E[|η|^q]).
Hence the inequality (2.85) holds.

Theorem 2.59 (Liu [128], Minkowski Inequality) Let p be a real number with p ≥ 1, and let ξ and η be independent fuzzy variables with E[|ξ|^p] < ∞ and E[|η|^p] < ∞. Then we have
    (E[|ξ + η|^p])^{1/p} ≤ (E[|ξ|^p])^{1/p} + (E[|η|^p])^{1/p}.    (2.86)

Proof: The inequality holds trivially if at least one of ξ and η is zero a.s. Now we assume E[|ξ|^p] > 0 and E[|η|^p] > 0. It is easy to prove that the function f(x, y) = (x^{1/p} + y^{1/p})^p is a concave function on D = {(x, y) : x ≥ 0, y ≥ 0}. Thus for any point (x0, y0) with x0 > 0 and y0 > 0, there exist two real numbers a and b such that
    f(x, y) − f(x0, y0) ≤ a(x − x0) + b(y − y0),  ∀(x, y) ∈ D.
Letting x0 = E[|ξ|^p], y0 = E[|η|^p], x = |ξ|^p and y = |η|^p, we have
    f(|ξ|^p, |η|^p) − f(E[|ξ|^p], E[|η|^p]) ≤ a(|ξ|^p − E[|ξ|^p]) + b(|η|^p − E[|η|^p]).
Taking the expected values on both sides, we obtain
    E[f(|ξ|^p, |η|^p)] ≤ f(E[|ξ|^p], E[|η|^p]).
Hence the inequality (2.86) holds.

Theorem 2.60 (Liu [129], Jensen's Inequality) Let ξ be a fuzzy variable, and f : ℜ → ℜ a convex function. If E[ξ] and E[f(ξ)] are finite, then
    f(E[ξ]) ≤ E[f(ξ)].    (2.87)
Especially, when f(x) = |x|^p and p ≥ 1, we have |E[ξ]|^p ≤ E[|ξ|^p].
Proof: Since f is a convex function, for each y, there exists a number k such that f(x) − f(y) ≥ k·(x − y). Replacing x with ξ and y with E[ξ], we obtain
    f(ξ) − f(E[ξ]) ≥ k·(ξ − E[ξ]).
Taking the expected values on both sides, we have
    E[f(ξ)] − f(E[ξ]) ≥ k·(E[ξ] − E[ξ]) = 0
which proves the inequality.
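The inequalities of this section can be sanity-checked numerically. The sketch below (our own grid-based construction, not the book's) computes Cr{f(ξ) ≥ r} for the triangular fuzzy variable (0, 1, 2) from Cr{A} = (sup_A µ + 1 − sup_{A^c} µ)/2, evaluates expected values by the integral of Cr{f(ξ) ≥ r}, and tests the Markov, Chebyshev and Jensen inequalities.

```python
def mu(x):
    """Membership of the triangular fuzzy variable (0, 1, 2)."""
    return x if x <= 1.0 else 2.0 - x

GRID = [k * 0.004 for k in range(501)]   # discretized support [0, 2]

def cr_event(pred):
    """Cr{pred(xi)} = (sup_{pred} mu + 1 - sup_{not pred} mu) / 2."""
    inside = max((mu(x) for x in GRID if pred(x)), default=0.0)
    outside = max((mu(x) for x in GRID if not pred(x)), default=0.0)
    return 0.5 * (inside + 1.0 - outside)

def expect(f, upper, steps=500):
    """E[f(xi)] = integral of Cr{f(xi) >= r} dr, for nonnegative f."""
    dr = upper / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        total += dr * cr_event(lambda x: f(x) >= r)
    return total

e = expect(lambda x: x, 2.0)                 # expected value, exactly 1
m2 = expect(lambda x: x * x, 4.0)            # second moment, exactly 4/3
v = expect(lambda x: (x - 1.0) ** 2, 1.0)    # variance, exactly 1/6
```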
2.14 Convergence Concepts

This section discusses some convergence concepts for fuzzy sequences: convergence almost surely (a.s.), convergence in credibility, convergence in mean, and convergence in distribution.

Table 2.1: Relations among Convergence Concepts

    Convergence in Mean → Convergence in Credibility → Convergence Almost Surely
    Convergence in Mean → Convergence in Credibility → Convergence in Distribution
Definition 2.28 (Liu [128]) Suppose that ξ, ξ1, ξ2, · · · are fuzzy variables defined on the credibility space (Θ, P, Cr). The sequence {ξi} is said to be convergent a.s. to ξ if and only if there exists an event A with Cr{A} = 1 such that
    lim_{i→∞} |ξi(θ) − ξ(θ)| = 0    (2.88)
for every θ ∈ A. In that case we write ξi → ξ, a.s.

Definition 2.29 (Liu [128]) Suppose that ξ, ξ1, ξ2, · · · are fuzzy variables defined on the credibility space (Θ, P, Cr). We say that the sequence {ξi} converges in credibility to ξ if
    lim_{i→∞} Cr{|ξi − ξ| ≥ ε} = 0    (2.89)
for every ε > 0.

Definition 2.30 (Liu [128]) Suppose that ξ, ξ1, ξ2, · · · are fuzzy variables with finite expected values defined on the credibility space (Θ, P, Cr). We say that the sequence {ξi} converges in mean to ξ if
    lim_{i→∞} E[|ξi − ξ|] = 0.    (2.90)
In addition, the sequence {ξi} is said to converge in mean square to ξ if
    lim_{i→∞} E[|ξi − ξ|²] = 0.    (2.91)

Definition 2.31 (Liu [128]) Suppose that Φ, Φ1, Φ2, · · · are the credibility distributions of fuzzy variables ξ, ξ1, ξ2, · · · , respectively. We say that {ξi} converges in distribution to ξ if Φi → Φ at any continuity point of Φ.

Convergence in Mean vs. Convergence in Credibility

Theorem 2.61 (Liu [128]) Suppose that ξ, ξ1, ξ2, · · · are fuzzy variables defined on the credibility space (Θ, P, Cr). If the sequence {ξi} converges in mean to ξ, then {ξi} converges in credibility to ξ.

Proof: It follows from Theorem 2.56 that, for any given number ε > 0,
    Cr{|ξi − ξ| ≥ ε} ≤ E[|ξi − ξ|]/ε → 0
as i → ∞. Thus {ξi} converges in credibility to ξ.

Example 2.65: Convergence in credibility does not imply convergence in mean. For example, take (Θ, P, Cr) to be {θ1, θ2, · · · } with Cr{θ1} = 1/2 and Cr{θj} = 1/j for j = 2, 3, · · · The fuzzy variables are defined by
    ξi(θj) = i if j = i, and 0 otherwise
for i = 1, 2, · · · and ξ = 0. For any small number ε > 0, we have
    Cr{|ξi − ξ| ≥ ε} = 1/i → 0.
That is, the sequence {ξi} converges in credibility to ξ. However, E[|ξi − ξ|] ≡ 1 ↛ 0. That is, the sequence {ξi} does not converge in mean to ξ.

Convergence Almost Surely vs. Convergence in Credibility

Example 2.66: Convergence a.s. does not imply convergence in credibility. For example, take (Θ, P, Cr) to be {θ1, θ2, · · · } with Cr{θj} = j/(2j + 1) for j = 1, 2, · · · The fuzzy variables are defined by
    ξi(θj) = i if j = i, and 0 otherwise
for i = 1, 2, · · · and ξ = 0. Then the sequence {ξi} converges a.s. to ξ. However, for any small number ε > 0, we have
    Cr{|ξi − ξ| ≥ ε} = i/(2i + 1) → 1/2.
That is, the sequence {ξi} does not converge in credibility to ξ.

Theorem 2.62 (Wang and Liu [221]) Suppose that ξ, ξ1, ξ2, · · · are fuzzy variables defined on the credibility space (Θ, P, Cr). If the sequence {ξi} converges in credibility to ξ, then {ξi} converges a.s. to ξ.

Proof: If {ξi} does not converge a.s. to ξ, then there exists an element θ* ∈ Θ with Cr{θ*} > 0 such that ξi(θ*) ↛ ξ(θ*) as i → ∞. In other words, there exists a small number ε > 0 and a subsequence {ξik(θ*)} such that |ξik(θ*) − ξ(θ*)| ≥ ε for any k. Since the credibility measure is an increasing set function, we have
    Cr{|ξik − ξ| ≥ ε} ≥ Cr{θ*} > 0
for any k. It follows that {ξi} does not converge in credibility to ξ. A contradiction proves the theorem.

Convergence in Credibility vs. Convergence in Distribution

Theorem 2.63 (Wang and Liu [221]) Suppose that ξ, ξ1, ξ2, · · · are fuzzy variables. If the sequence {ξi} converges in credibility to ξ, then {ξi} converges in distribution to ξ.

Proof: Let x be any given continuity point of the distribution Φ. On the one hand, for any y > x, we have
    {ξi ≤ x} = {ξi ≤ x, ξ ≤ y} ∪ {ξi ≤ x, ξ > y} ⊂ {ξ ≤ y} ∪ {|ξi − ξ| ≥ y − x}.
It follows from the credibility subadditivity theorem that
    Φi(x) ≤ Φ(y) + Cr{|ξi − ξ| ≥ y − x}.
Since {ξi} converges in credibility to ξ, we have Cr{|ξi − ξ| ≥ y − x} → 0. Thus we obtain lim sup_{i→∞} Φi(x) ≤ Φ(y) for any y > x. Letting y → x, we get
    lim sup_{i→∞} Φi(x) ≤ Φ(x).    (2.92)
On the other hand, for any z < x, we have
    {ξ ≤ z} = {ξ ≤ z, ξi ≤ x} ∪ {ξ ≤ z, ξi > x} ⊂ {ξi ≤ x} ∪ {|ξi − ξ| ≥ x − z}
which implies that
    Φ(z) ≤ Φi(x) + Cr{|ξi − ξ| ≥ x − z}.
Since Cr{|ξi − ξ| ≥ x − z} → 0, we obtain Φ(z) ≤ lim inf_{i→∞} Φi(x) for any z < x. Letting z → x, we get
    Φ(x) ≤ lim inf_{i→∞} Φi(x).    (2.93)
It follows from (2.92) and (2.93) that Φi(x) → Φ(x). The theorem is proved.

Example 2.67: Convergence in distribution does not imply convergence in credibility. For example, take (Θ, P, Cr) to be {θ1, θ2} with Cr{θ1} = Cr{θ2} = 1/2, and define
    ξ(θ) = −1 if θ = θ1, and 1 if θ = θ2.
We also define ξi = −ξ for i = 1, 2, · · · Then ξi and ξ are identically distributed. Thus {ξi} converges in distribution to ξ. But, for any small number ε > 0, we have Cr{|ξi − ξ| > ε} = Cr{Θ} = 1. That is, the sequence {ξi} does not converge in credibility to ξ.

Convergence Almost Surely vs. Convergence in Distribution

Example 2.68: Convergence in distribution does not imply convergence a.s. For example, take (Θ, P, Cr) to be {θ1, θ2} with Cr{θ1} = Cr{θ2} = 1/2, and define
    ξ(θ) = −1 if θ = θ1, and 1 if θ = θ2.
We also define ξi = −ξ for i = 1, 2, · · · Then {ξi} converges in distribution to ξ. However, {ξi} does not converge a.s. to ξ.

Example 2.69: Convergence a.s. does not imply convergence in distribution. For example, take (Θ, P, Cr) to be {θ1, θ2, · · · } with Cr{θj} = j/(2j + 1) for j = 1, 2, · · · The fuzzy variables are defined by
    ξi(θj) = i if j = i, and 0 otherwise
for i = 1, 2, · · · and ξ = 0. Then the sequence {ξi} converges a.s. to ξ. However, the credibility distributions of ξi are
    Φi(x) = 0 if x < 0;  (i + 1)/(2i + 1) if 0 ≤ x < i;  1 if x ≥ i,
for i = 1, 2, · · · , respectively. The credibility distribution of ξ is
    Φ(x) = 0 if x < 0, and 1 if x ≥ 0.
It is clear that Φi(x) ↛ Φ(x) at any point x > 0. That is, the sequence {ξi} does not converge in distribution to ξ.
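The credibility space of Example 2.65 can be simulated on a finite truncation. One membership function consistent with that example (an assumption on our part) is µ(θ1) = µ(θ2) = 1 and µ(θj) = 2/j for j ≥ 3, which induces Cr{θ1} = Cr{θ2} = 1/2 and Cr{θj} = 1/j; the sketch below confirms Cr{|ξi − 0| ≥ ε} = 1/i while E[|ξi|] stays at 1.

```python
N = 1000                                   # truncation of {theta_1, theta_2, ...}
mu = [1.0, 1.0] + [2.0 / j for j in range(3, N + 1)]

def cr(A):
    """Cr{A} = (max_A mu + 1 - max_{A^c} mu) / 2 on the truncated space."""
    inside = max((mu[i] for i in A), default=0.0)
    outside = max((mu[i] for i in range(N) if i not in A), default=0.0)
    return 0.5 * (inside + 1.0 - outside)

def xi(i, j):
    """xi_i(theta_j) = i if j = i, else 0 (1-based indices)."""
    return float(i) if j == i else 0.0

def cr_exceeds(i, eps):
    """Cr{|xi_i - 0| >= eps}."""
    return cr({j - 1 for j in range(1, N + 1) if abs(xi(i, j)) >= eps})

def mean_abs(i):
    """E[|xi_i|]; since |xi_i| only takes values 0 and i, the integral collapses."""
    return i * cr_exceeds(i, 1e-12)
```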
2.15 Conditional Credibility

We now consider the credibility of an event A after it has been learned that some other event B has occurred. This new credibility of A is called the conditional credibility of A given B.

The first problem is whether the conditional credibility is determined uniquely and completely. The answer is negative. It is doomed to failure to define an unalterable and widely accepted conditional credibility. For this reason, it is appropriate to speak of a certain person's subjective conditional credibility, rather than to speak of the true conditional credibility.

In order to define a conditional credibility measure Cr{A|B}, at first we have to enlarge Cr{A ∩ B} because Cr{A ∩ B} < 1 for all events whenever Cr{B} < 1. It seems that we have no alternative but to divide Cr{A ∩ B} by Cr{B}. Unfortunately, Cr{A ∩ B}/Cr{B} is not always a credibility measure. However, the value Cr{A|B} should not be greater than Cr{A ∩ B}/Cr{B} (otherwise the normality will be lost), i.e.,
    Cr{A|B} ≤ Cr{A ∩ B}/Cr{B}.    (2.94)
On the other hand, in order to preserve the self-duality, we should have
    Cr{A|B} = 1 − Cr{Ac|B} ≥ 1 − Cr{Ac ∩ B}/Cr{B}.    (2.95)
Furthermore, since (A ∩ B) ∪ (Ac ∩ B) = B, we have Cr{B} ≤ Cr{A ∩ B} + Cr{Ac ∩ B} by using the credibility subadditivity theorem. Thus
    0 ≤ 1 − Cr{Ac ∩ B}/Cr{B} ≤ Cr{A ∩ B}/Cr{B} ≤ 1.    (2.96)
Hence any number between 1 − Cr{Ac ∩ B}/Cr{B} and Cr{A ∩ B}/Cr{B} is a reasonable value that the conditional credibility may take. Based on the maximum uncertainty principle, we have the following conditional credibility measure.

Definition 2.32 (Liu [132]) Let (Θ, P, Cr) be a credibility space, and A, B ∈ P. Then the conditional credibility measure of A given B is defined by
    Cr{A|B} = Cr{A ∩ B}/Cr{B}       if Cr{A ∩ B}/Cr{B} < 0.5,
    Cr{A|B} = 1 − Cr{Ac ∩ B}/Cr{B}  if Cr{Ac ∩ B}/Cr{B} < 0.5,
    Cr{A|B} = 0.5                    otherwise,    (2.97)
provided that Cr{B} > 0.

It follows immediately from the definition of conditional credibility that
    1 − Cr{Ac ∩ B}/Cr{B} ≤ Cr{A|B} ≤ Cr{A ∩ B}/Cr{B}.    (2.98)
Furthermore, Cr{A|B} takes a value as close to 0.5 as possible in this interval. In other words, it accords with the maximum uncertainty principle.

Theorem 2.64 (Liu [132]) Let (Θ, P, Cr) be a credibility space, and B an event with Cr{B} > 0. Then Cr{·|B} defined by (2.97) is a credibility measure, and (Θ, P, Cr{·|B}) is a credibility space.

Proof: It is sufficient to prove that Cr{·|B} satisfies the normality, monotonicity, self-duality and maximality axioms. At first, it satisfies the normality axiom, i.e.,
    Cr{Θ|B} = 1 − Cr{Θc ∩ B}/Cr{B} = 1 − Cr{∅}/Cr{B} = 1.
For any events A1 and A2 with A1 ⊂ A2, if
    Cr{A1 ∩ B}/Cr{B} ≤ Cr{A2 ∩ B}/Cr{B} < 0.5,
then
    Cr{A1|B} = Cr{A1 ∩ B}/Cr{B} ≤ Cr{A2 ∩ B}/Cr{B} = Cr{A2|B}.
If
    Cr{A1 ∩ B}/Cr{B} ≤ 0.5 ≤ Cr{A2 ∩ B}/Cr{B},
then Cr{A1|B} ≤ 0.5 ≤ Cr{A2|B}. If
    0.5 < Cr{A1 ∩ B}/Cr{B} ≤ Cr{A2 ∩ B}/Cr{B},
then we have
    Cr{A1|B} = (1 − Cr{A1c ∩ B}/Cr{B}) ∨ 0.5 ≤ (1 − Cr{A2c ∩ B}/Cr{B}) ∨ 0.5 = Cr{A2|B}.
This means that Cr{·|B} satisfies the monotonicity axiom. For any event A, if
    Cr{A ∩ B}/Cr{B} ≥ 0.5 and Cr{Ac ∩ B}/Cr{B} ≥ 0.5,
then we have Cr{A|B} + Cr{Ac|B} = 0.5 + 0.5 = 1 immediately. Otherwise, without loss of generality, suppose
    Cr{A ∩ B}/Cr{B} < 0.5 < Cr{Ac ∩ B}/Cr{B};
then we have
    Cr{A|B} + Cr{Ac|B} = Cr{A ∩ B}/Cr{B} + (1 − Cr{A ∩ B}/Cr{B}) = 1.
That is, Cr{·|B} satisfies the self-duality axiom. Finally, for any events {Ai} with supi Cr{Ai|B} < 0.5, we have supi Cr{Ai ∩ B} < 0.5 and
    supi Cr{Ai|B} = supi Cr{Ai ∩ B}/Cr{B} = Cr{∪iAi ∩ B}/Cr{B} = Cr{∪iAi|B}.
Thus Cr{·|B} satisfies the maximality axiom. Hence Cr{·|B} is a credibility measure. Furthermore, (Θ, P, Cr{·|B}) is a credibility space.

Example 2.70: Let ξ be a fuzzy variable, and X a set of real numbers such that Cr{ξ ∈ X} > 0. Then for any x ∈ X, the conditional credibility of ξ = x given ξ ∈ X is
    Cr{ξ = x|ξ ∈ X} = Cr{ξ = x}/Cr{ξ ∈ X}               if Cr{ξ = x}/Cr{ξ ∈ X} < 0.5,
    Cr{ξ = x|ξ ∈ X} = 1 − Cr{ξ ≠ x, ξ ∈ X}/Cr{ξ ∈ X}    if Cr{ξ ≠ x, ξ ∈ X}/Cr{ξ ∈ X} < 0.5,
    Cr{ξ = x|ξ ∈ X} = 0.5                                otherwise.

Example 2.71: Let ξ and η be two fuzzy variables, and Y a set of real numbers such that Cr{η ∈ Y} > 0. Then we have
    Cr{ξ = x|η ∈ Y} = Cr{ξ = x, η ∈ Y}/Cr{η ∈ Y}         if Cr{ξ = x, η ∈ Y}/Cr{η ∈ Y} < 0.5,
    Cr{ξ = x|η ∈ Y} = 1 − Cr{ξ ≠ x, η ∈ Y}/Cr{η ∈ Y}     if Cr{ξ ≠ x, η ∈ Y}/Cr{η ∈ Y} < 0.5,
    Cr{ξ = x|η ∈ Y} = 0.5                                 otherwise.

Definition 2.33 (Liu [132]) The conditional membership function of a fuzzy variable ξ given B is defined by
    µ(x|B) = (2Cr{ξ = x|B}) ∧ 1,  x ∈ ℜ    (2.99)
provided that Cr{B} > 0.
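Definition 2.32 can be exercised on a small finite space. The following sketch (our own construction, with an arbitrary three-point membership function) computes Cr{A|B} by (2.97) from Cr{A} = (max_A µ + 1 − max_{A^c} µ)/2 and checks the normality and self-duality properties established in Theorem 2.64.

```python
MU = {"t1": 1.0, "t2": 0.6, "t3": 0.4}   # membership grades on a 3-point space

def cr(A):
    """Cr{A} = (max_A mu + 1 - max_{A^c} mu) / 2 on the finite space."""
    inside = max((MU[t] for t in A), default=0.0)
    outside = max((MU[t] for t in MU if t not in A), default=0.0)
    return 0.5 * (inside + 1.0 - outside)

def cr_given(A, B):
    """Conditional credibility Cr{A|B} per (2.97); requires Cr{B} > 0."""
    A, B = set(A), set(B)
    crB = cr(B)
    ratio_a = cr(A & B) / crB        # Cr{A n B} / Cr{B}
    ratio_ac = cr(B - A) / crB       # Cr{A^c n B} / Cr{B}
    if ratio_a < 0.5:
        return ratio_a
    if ratio_ac < 0.5:
        return 1.0 - ratio_ac
    return 0.5

THETA = set(MU)
B = {"t1", "t2"}   # conditioning event, Cr{B} = 0.8
A = {"t2"}
```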
Example 2.72: Let ξ be a fuzzy variable with membership function µ(x), and X a set of real numbers such that µ(x) > 0 for some x ∈ X. Then the conditional membership function of ξ given ξ ∈ X is
    µ(x|X) = ( 2µ(x) / sup_{x∈X} µ(x) ) ∧ 1,           if sup_{x∈X} µ(x) < 1;
    µ(x|X) = ( 2µ(x) / (2 − sup_{x∈Xᶜ} µ(x)) ) ∧ 1,    if sup_{x∈X} µ(x) = 1    (2.100)
for x ∈ X. Note that µ(x|X) ≡ 0 if x ∉ X.
Figure 2.6: Conditional Membership Function µ(x|X)

Example 2.73: Let ξ and η be two fuzzy variables with joint membership function µ(x, y), and Y a set of real numbers. Then the conditional membership function of ξ given η ∈ Y is
    µ(x|Y) = ( 2 sup_{y∈Y} µ(x, y) / sup_{x∈ℜ, y∈Y} µ(x, y) ) ∧ 1,           if sup_{x∈ℜ, y∈Y} µ(x, y) < 1;
    µ(x|Y) = ( 2 sup_{y∈Y} µ(x, y) / (2 − sup_{x∈ℜ, y∈Yᶜ} µ(x, y)) ) ∧ 1,    if sup_{x∈ℜ, y∈Y} µ(x, y) = 1    (2.101)
provided that µ(x, y) > 0 for some x ∈ ℜ and y ∈ Y. Especially, the conditional membership function of ξ given η = y is
    µ(x|y) = ( 2µ(x, y) / sup_{x∈ℜ} µ(x, y) ) ∧ 1,                 if sup_{x∈ℜ} µ(x, y) < 1;
    µ(x|y) = ( 2µ(x, y) / (2 − sup_{x∈ℜ, z≠y} µ(x, z)) ) ∧ 1,      if sup_{x∈ℜ} µ(x, y) = 1
provided that µ(x, y) > 0 for some x ∈ ℜ.

Definition 2.34 (Liu [132]) The conditional credibility distribution Φ: ℜ → [0, 1] of a fuzzy variable ξ given B is defined by
    Φ(x|B) = Cr{ξ ≤ x|B}    (2.102)
provided that Cr{B} > 0.

Example 2.74: Let ξ and η be fuzzy variables. Then the conditional credibility distribution of ξ given η = y is
    Φ(x|η = y) = Cr{ξ ≤ x, η = y}/Cr{η = y}        if Cr{ξ ≤ x, η = y}/Cr{η = y} < 0.5,
    Φ(x|η = y) = 1 − Cr{ξ > x, η = y}/Cr{η = y}    if Cr{ξ > x, η = y}/Cr{η = y} < 0.5,
    Φ(x|η = y) = 0.5                                otherwise,
provided that Cr{η = y} > 0.

Definition 2.35 (Liu [132]) The conditional credibility density function φ of a fuzzy variable ξ given B is a nonnegative function such that
    Φ(x|B) = ∫_{−∞}^{x} φ(y|B)dy,  ∀x ∈ ℜ,    (2.103)
    ∫_{−∞}^{+∞} φ(y|B)dy = 1    (2.104)
where Φ(x|B) is the conditional credibility distribution of ξ given B.

Definition 2.36 (Liu [132]) Let ξ be a fuzzy variable. Then the conditional expected value of ξ given B is defined by
    E[ξ|B] = ∫_0^{+∞} Cr{ξ ≥ r|B}dr − ∫_{−∞}^{0} Cr{ξ ≤ r|B}dr    (2.105)
provided that at least one of the two integrals is finite.

Following conditional credibility and conditional expected value, we also have conditional variance, conditional moments, conditional critical values, conditional entropy as well as conditional convergence.

Definition 2.37 Let ξ be a nonnegative fuzzy variable representing lifetime. Then the hazard rate (or failure rate) is
    h(x) = lim_{Δ↓0} Cr{ξ ≤ x + Δ | ξ > x}.    (2.106)
The hazard rate tells us the credibility of a failure just after time x when it is functioning at time x.

Example 2.75: Let ξ be an exponentially distributed fuzzy variable. Then its hazard rate h(x) ≡ 0.5. In fact, the hazard rate is always 0.5 if the membership function is positive and decreasing.

Example 2.76: Let ξ be a triangular fuzzy variable (a, b, c) with a ≥ 0. Then its hazard rate is
    h(x) = 0                  if x ≤ a;
    h(x) = (x − a)/(b − a)    if a ≤ x ≤ (a + b)/2;
    h(x) = 0.5                if (a + b)/2 ≤ x < c;
    h(x) = 0                  if x ≥ c.
2.16 Fuzzy Process

Definition 2.38 Let T be an index set and let (Θ, P, Cr) be a credibility space. A fuzzy process is a function from T × (Θ, P, Cr) to the set of real numbers. That is, a fuzzy process Xt(θ) is a function of two variables such that the function Xt*(θ) is a fuzzy variable for each t*. For each fixed θ*, the function Xt(θ*) is called a sample path of the fuzzy process. A fuzzy process Xt(θ) is said to be sample-continuous if the sample path is continuous for almost all θ.

Definition 2.39 A fuzzy process Xt is said to have independent increments if
    Xt1 − Xt0, Xt2 − Xt1, · · · , Xtk − Xtk−1    (2.107)
are independent fuzzy variables for any times t0 < t1 < · · · < tk. A fuzzy process Xt is said to have stationary increments if, for any given t > 0, the increments Xs+t − Xs are identically distributed fuzzy variables for all s > 0.

Fuzzy Renewal Process

Definition 2.40 (Zhao and Liu [251]) Let ξ1, ξ2, · · · be iid positive fuzzy variables. Define S0 = 0 and Sn = ξ1 + ξ2 + · · · + ξn for n ≥ 1. Then the fuzzy process
    Nt = max_{n≥0} { n | Sn ≤ t }    (2.108)
is called a fuzzy renewal process.
Figure 2.7: A Sample Path of Fuzzy Renewal Process

If ξ1, ξ2, · · · denote the interarrival times of successive events, then Sn can be regarded as the waiting time until the occurrence of the nth event, and Nt is the number of renewals in (0, t]. Each sample path of Nt is a right-continuous and increasing step function taking only nonnegative integer values. Furthermore, the size of each jump of Nt is always 1. In other words, Nt has at most one renewal at each time. In particular, Nt does not jump at time 0. Since Nt ≥ n if and only if Sn ≤ t, we have
    Cr{Nt ≥ n} = Cr{Sn ≤ t} = Cr{ξ1 ≤ t/n}.    (2.109)

Theorem 2.65 Let Nt be a fuzzy renewal process with interarrival times ξ1, ξ2, · · · Then we have
    E[Nt] = Σ_{n=1}^{∞} Cr{Sn ≤ t} = Σ_{n=1}^{∞} Cr{ξ1 ≤ t/n}.    (2.110)

Proof: Since Nt takes only nonnegative integer values, we have
    E[Nt] = ∫_0^{∞} Cr{Nt ≥ r}dr = Σ_{n=1}^{∞} ∫_{n−1}^{n} Cr{Nt ≥ r}dr
          = Σ_{n=1}^{∞} Cr{Nt ≥ n} = Σ_{n=1}^{∞} Cr{Sn ≤ t} = Σ_{n=1}^{∞} Cr{ξ1 ≤ t/n}.
The theorem is proved.

Example 2.77: A renewal process Nt is called an equipossible renewal process if ξ1, ξ2, · · · are iid equipossible fuzzy variables (a, b) with a > 0. Then for each nonnegative integer n, we have
    Cr{Nt ≥ n} = 0 if t < na;  0.5 if na ≤ t < nb;  1 if t ≥ nb,    (2.111)
    E[Nt] = ( ⌊t/a⌋ + ⌊t/b⌋ ) / 2    (2.112)
where ⌊x⌋ represents the maximal integer less than or equal to x.

Example 2.78: A renewal process Nt is called a triangular renewal process if ξ1, ξ2, · · · are iid triangular fuzzy variables (a, b, c) with a > 0. Then for each nonnegative integer n, we have
    Cr{Nt ≥ n} = 0                            if t ≤ na;
    Cr{Nt ≥ n} = (t − na)/(2n(b − a))         if na ≤ t ≤ nb;
    Cr{Nt ≥ n} = (nc − 2nb + t)/(2n(c − b))   if nb ≤ t ≤ nc;
    Cr{Nt ≥ n} = 1                            if t ≥ nc.    (2.113)
Theorem 2.66 (Zhao and Liu [251], Renewal Theorem) Let Nt be a fuzzy renewal process with interarrival times ξ1, ξ2, · · · Then

lim_{t→∞} E[Nt]/t = E[1/ξ1].    (2.114)
Proof: Since ξ1 is a positive fuzzy variable and Nt is a nonnegative fuzzy variable, we have

E[Nt]/t = ∫_0^∞ Cr{Nt/t ≥ r} dr,    E[1/ξ1] = ∫_0^∞ Cr{1/ξ1 ≥ r} dr.

It is easy to verify that

Cr{Nt/t ≥ r} ≤ Cr{1/ξ1 ≥ r}

and

lim_{t→∞} Cr{Nt/t ≥ r} = Cr{1/ξ1 ≥ r}

for any real numbers t > 0 and r > 0. It follows from the Lebesgue dominated convergence theorem that

lim_{t→∞} ∫_0^∞ Cr{Nt/t ≥ r} dr = ∫_0^∞ Cr{1/ξ1 ≥ r} dr.

Hence (2.114) holds.
Chapter 2 - Credibility Theory
C Process

This subsection introduces a fuzzy counterpart of Brownian motion and provides some basic mathematical properties.

Definition 2.41 (Liu [133]) A fuzzy process Ct is said to be a C process if (i) C0 = 0, (ii) Ct has stationary and independent increments, and (iii) every increment C_{s+t} − C_s is a normally distributed fuzzy variable with expected value et and variance σ²t².

The parameters e and σ are called the drift and diffusion coefficients, respectively. The C process is said to be standard if e = 0 and σ = 1. Any C process may be represented by et + σCt, where Ct is a standard C process.
Figure 2.8: A Sample Path of Standard C Process

Perhaps the reader would like to know why the increment is a normally distributed fuzzy variable. The reason is that a normally distributed fuzzy variable has maximum entropy when its expected value and variance are given, just like a normally distributed random variable.

Theorem 2.67 (Liu [133], Existence Theorem) There is a C process. Furthermore, each version of C process is sample-continuous.

Proof: Without loss of generality, we only prove that there is a standard C process on the range t ∈ [0, 1]. Let {ξ(r) : r is a rational number in [0, 1]} be a countable sequence of independently and normally distributed fuzzy variables with expected value zero and variance one. For each integer n, we define a fuzzy process

Xn(t) = (1/n) Σ_{i=1}^{k} ξ(i/n),  if t = k/n (k = 0, 1, · · · , n);  linear,  otherwise.
Since the limit

lim_{n→∞} Xn(t)

exists almost surely, we may verify that the limit meets the conditions of a C process. Hence there is a standard C process.

Remark 2.9: Suppose that Ct is a standard C process. It has been proved that

X1(t) = −Ct,    (2.115)
X2(t) = aC_{t/a},    (2.116)
X3(t) = C_{t+s} − C_s    (2.117)
are each a version of standard C process.

Dai [23] proved that almost all C paths are Lipschitz continuous and have a finite variation. Thus almost all C paths are differentiable almost everywhere and have zero squared variation. However, for any number ℓ ∈ (0, 0.5), there is a sample θ with Cr{θ} = ℓ such that Ct(θ) is not differentiable in a dense set of t. As a byproduct, the C path is an example of a Lipschitz continuous function whose nondifferentiable points are dense, while the Brownian path is an example of a continuous function that is nowhere differentiable. In other words, the C path is the worst Lipschitz continuous function and the Brownian path is the worst continuous function in the sense of differentiability.

Example 2.79: Let Ct be a C process with drift 0. Then for any level x > 0 and any time t > 0, Dai [23] proved the following reflection principle:

Cr{ max_{0≤s≤t} Cs ≥ x } = Cr{Ct ≥ x}.    (2.118)

In addition, for any level x < 0 and any time t > 0, we have

Cr{ min_{0≤s≤t} Cs ≤ x } = Cr{Ct ≤ x}.    (2.119)
Example 2.80: Let Ct be a C process with drift e > 0 and diffusion coefficient σ. Then the first passage time τ at which the C process reaches the barrier x > 0 has the membership function

μ(t) = 2(1 + exp(π|x − et| / (√6 σt)))^{−1},  t > 0    (2.120)

whose expected value is E[τ] = +∞ (Dai [23]).
Definition 2.42 (Liu [133]) Let Ct be a standard C process. Then et + σCt is a C process, and the fuzzy process

Gt = exp(et + σCt)    (2.121)

is called a geometric C process.

Geometric C process Gt is employed to model stock prices. For each t > 0, Li [108] proved that Gt has a lognormal membership function

μ(z) = 2(1 + exp(π|ln z − et| / (√6 σt)))^{−1},  z ≥ 0    (2.122)

whose expected value is

E[Gt] = √6 σt exp(et) csc(√6 σt),  ∀t < π/(√6 σ).    (2.123)

In addition, the first passage time at which a geometric C process Gt reaches the barrier x > 1 is just the time at which the C process with drift e and diffusion σ reaches ln x.

A Basic Stock Model

It was traditionally assumed that stock price follows geometric Brownian motion, and stochastic financial mathematics was founded on this assumption. Liu [133] presented an alternative assumption: that stock price follows a geometric C process. Based on this assumption, we obtain a basic stock model for a fuzzy financial market in which the bond price Xt and the stock price Yt follow

Xt = X0 exp(rt),    Yt = Y0 exp(et + σCt)    (2.124)

where r is the riskless interest rate, e is the stock drift, σ is the stock diffusion, and Ct is a standard C process. It is just a fuzzy counterpart of the Black-Scholes stock model [8]. For further developments of fuzzy stock models, the interested reader may consult Gao [50], Peng [191], Qin and Li [194], and Zhu [267].
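Formulas (2.122) and (2.123) are easy to evaluate numerically. A minimal sketch (function names are ours; the parameter values are illustrative assumptions):

```python
import math

def geometric_c_membership(z, t, e, sigma):
    """Lognormal membership function (2.122) of Gt = exp(e*t + sigma*Ct)."""
    return 2.0 / (1.0 + math.exp(math.pi * abs(math.log(z) - e * t)
                                 / (math.sqrt(6) * sigma * t)))

def geometric_c_expected_value(t, e, sigma):
    """Expected value (2.123); finite only while t < pi / (sqrt(6)*sigma)."""
    u = math.sqrt(6) * sigma * t
    if u >= math.pi:
        return float("inf")
    return math.exp(e * t) * u / math.sin(u)   # sqrt(6)*sigma*t * exp(et) * csc(...)

# the membership peaks at z = exp(e*t), where mu = 1
print(geometric_c_membership(math.exp(0.1 * 2.0), 2.0, 0.1, 0.2))  # → 1.0
print(geometric_c_expected_value(10.0, 0.1, 0.2))                  # → inf
```

Note how the expected value blows up as t approaches π/(√6 σ), mirroring the restriction stated in (2.123).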
2.17 Fuzzy Calculus
Let Ct be a standard C process, and dt an infinitesimal time interval. Then dCt = C_{t+dt} − Ct is a fuzzy process such that, for each t, dCt is a normally distributed fuzzy variable with

E[dCt] = 0,  V[dCt] = dt²,  E[dCt²] = dt²,  V[dCt²] ≈ 7dt⁴.
Definition 2.43 (Liu [133]) Let Xt be a fuzzy process and let Ct be a standard C process. For any partition of the closed interval [a, b] with a = t1 < t2 < · · · < t_{k+1} = b, the mesh is written as

Δ = max_{1≤i≤k} |t_{i+1} − t_i|.

Then the fuzzy integral of Xt with respect to Ct is

∫_a^b Xt dCt = lim_{Δ→0} Σ_{i=1}^{k} X_{t_i} · (C_{t_{i+1}} − C_{t_i})    (2.125)

provided that the limit exists almost surely and is a fuzzy variable.

Example 2.81: Let Ct be a standard C process. Then for any partition 0 = t1 < t2 < · · · < t_{k+1} = s, we have

∫_0^s dCt = lim_{Δ→0} Σ_{i=1}^{k} (C_{t_{i+1}} − C_{t_i}) ≡ Cs − C0 = Cs.
Example 2.82: Let Ct be a standard C process. Then for any partition 0 = t1 < t2 < · · · < t_{k+1} = s, we have

sCs = Σ_{i=1}^{k} (t_{i+1} C_{t_{i+1}} − t_i C_{t_i})
    = Σ_{i=1}^{k} t_i (C_{t_{i+1}} − C_{t_i}) + Σ_{i=1}^{k} C_{t_{i+1}} (t_{i+1} − t_i)
    → ∫_0^s t dCt + ∫_0^s Ct dt

as Δ → 0. It follows that

∫_0^s t dCt = sCs − ∫_0^s Ct dt.
Example 2.83: Let Ct be a standard C process. Then for any partition 0 = t1 < t2 < · · · < t_{k+1} = s, we have

Cs² = Σ_{i=1}^{k} (C²_{t_{i+1}} − C²_{t_i})
    = Σ_{i=1}^{k} (C_{t_{i+1}} − C_{t_i})² + 2 Σ_{i=1}^{k} C_{t_i} (C_{t_{i+1}} − C_{t_i})
    → 0 + 2 ∫_0^s Ct dCt

as Δ → 0. That is,

∫_0^s Ct dCt = (1/2) Cs².
This equation shows that the fuzzy integral does not behave like the Ito integral. In fact, the fuzzy integral behaves like an ordinary integral.

Theorem 2.68 (Liu [133]) Let Ct be a standard C process, and let h(t, c) be a continuously differentiable function. Define Xt = h(t, Ct). Then we have the following chain rule

dXt = (∂h/∂t)(t, Ct) dt + (∂h/∂c)(t, Ct) dCt.    (2.126)

Proof: Since the function h is continuously differentiable, by using a Taylor series expansion, the infinitesimal increment of Xt has the first-order approximation

ΔXt = (∂h/∂t)(t, Ct) Δt + (∂h/∂c)(t, Ct) ΔCt.

Hence we obtain the chain rule because it makes

Xs = X0 + ∫_0^s (∂h/∂t)(t, Ct) dt + ∫_0^s (∂h/∂c)(t, Ct) dCt

for any s ≥ 0.

Remark 2.10: The infinitesimal increment dCt in (2.126) may be replaced with the derived C process

dYt = ut dt + vt dCt    (2.127)

where ut and vt are absolutely integrable fuzzy processes, thus producing

dh(t, Yt) = (∂h/∂t)(t, Yt) dt + (∂h/∂c)(t, Yt) dYt.    (2.128)
Remark 2.11: Assume that C1t, C2t, · · · , Cmt are standard C processes, and h(t, c1, c2, · · · , cm) is a continuously differentiable function. Define Xt = h(t, C1t, C2t, · · · , Cmt). You [243] proved the following chain rule

dXt = (∂h/∂t) dt + Σ_{i=1}^{m} (∂h/∂ci) dCit.    (2.129)
Example 2.84: Applying the chain rule, we obtain the following formula d(tCt ) = Ct dt + tdCt .
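Because the fuzzy integral behaves pathwise like an ordinary Riemann-Stieltjes integral, the formula d(tCt) = Ct dt + t dCt can be checked numerically along any fixed Lipschitz sample path. A minimal sketch (the sample path and step count are our own assumptions):

```python
import math

def c(t):
    """Hypothetical smooth sample path of C_t with c(0) = 0."""
    return 0.4 * t + 0.2 * math.sin(3 * t)

n, s = 100000, 1.5
lhs = s * c(s)                 # left-hand side s*C_s (note C_0 = 0)
rhs = 0.0
for i in range(n):
    t0, t1 = s * i / n, s * (i + 1) / n
    rhs += c(t0) * (t1 - t0) + t0 * (c(t1) - c(t0))   # C_t dt + t dC_t
print(abs(lhs - rhs) < 1e-3)   # → True
```

The Riemann-Stieltjes sums converge to s·Cs as the mesh shrinks, in agreement with Example 2.82.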
Hence we have

sCs = ∫_0^s d(tCt) = ∫_0^s Ct dt + ∫_0^s t dCt.

That is,

∫_0^s t dCt = sCs − ∫_0^s Ct dt.
Example 2.85: Let Ct be a standard C process. By using the chain rule d(Ct²) = 2Ct dCt, we get

Cs² = ∫_0^s d(Ct²) = 2 ∫_0^s Ct dCt.

It follows that

∫_0^s Ct dCt = (1/2) Cs².
Example 2.86: Let Ct be a standard C process. Then we have the chain rule d(Ct³) = 3Ct² dCt. Thus we obtain

Cs³ = ∫_0^s d(Ct³) = 3 ∫_0^s Ct² dCt.

That is,

∫_0^s Ct² dCt = (1/3) Cs³.
Theorem 2.69 (Liu [133], Integration by Parts) Suppose that Ct is a standard C process and F(t) is an absolutely continuous function. Then

∫_0^s F(t) dCt = F(s)Cs − ∫_0^s Ct dF(t).    (2.130)

Proof: By defining h(t, Ct) = F(t)Ct and using the chain rule, we get

d(F(t)Ct) = Ct dF(t) + F(t) dCt.

Thus

F(s)Cs = ∫_0^s d(F(t)Ct) = ∫_0^s Ct dF(t) + ∫_0^s F(t) dCt,

which is just (2.130).
2.18 Fuzzy Differential Equation
Fuzzy differential equation is a type of differential equation driven by a C process.

Definition 2.44 (Liu [133]) Suppose Ct is a standard C process, and f and g are some given functions. Then

dXt = f(t, Xt) dt + g(t, Xt) dCt    (2.131)

is called a fuzzy differential equation. A solution is a fuzzy process Xt that satisfies (2.131) identically in t.

Remark 2.12: Note that there is no precise definition for the terms dXt, dt and dCt in the fuzzy differential equation (2.131). The mathematically meaningful form is the fuzzy integral equation

Xs = X0 + ∫_0^s f(t, Xt) dt + ∫_0^s g(t, Xt) dCt.    (2.132)
However, the differential form is more convenient for us. This is the main reason why we accept the differential form.

Example 2.87: Let Ct be a standard C process. Then the fuzzy differential equation

dXt = a dt + b dCt

has a solution Xt = at + bCt, which is just a C process with drift coefficient a and diffusion coefficient b.

Example 2.88: Let Ct be a standard C process. Then the fuzzy differential equation

dXt = aXt dt + bXt dCt

has a solution Xt = exp(at + bCt), which is just a geometric C process.

Example 2.89: Let Ct be a standard C process. Then the fuzzy differential equations

dXt = −Yt dCt,  dYt = Xt dCt

have a solution (Xt, Yt) = (cos Ct, sin Ct), which is called a C process on the unit circle since Xt² + Yt² ≡ 1.
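Since fuzzy calculus acts pathwise like ordinary calculus, the circle system of Example 2.89 can be spot-checked by Euler integration along a fixed sample path. A minimal sketch under our own choice of path and step count:

```python
import math

# Along a Lipschitz sample path c(s), (X, Y) = (cos c, sin c) should satisfy
#   dX = -Y dC  and  dY = X dC  (pathwise).
def sample_path(s):                      # hypothetical smooth sample path
    return 0.5 * s + 0.3 * math.sin(s)

n, s_end = 200000, 2.0
X, Y = 1.0, 0.0                          # X0 = cos 0, Y0 = sin 0
for i in range(n):
    s0, s1 = s_end * i / n, s_end * (i + 1) / n
    dc = sample_path(s1) - sample_path(s0)
    X, Y = X - Y * dc, Y + X * dc        # Euler step of dX = -Y dC, dY = X dC

cs = sample_path(s_end)
print(abs(X - math.cos(cs)) < 1e-4, abs(Y - math.sin(cs)) < 1e-4)  # → True True
```

The Euler iterate stays (to first order) on the unit circle, matching Xt² + Yt² ≡ 1.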
Chapter 3

Chance Theory

Fuzziness and randomness are two basic types of uncertainty. In many cases, fuzziness and randomness appear simultaneously in a system. In order to describe this phenomenon, a fuzzy random variable was introduced by Kwakernaak [87][88] as a random element taking "fuzzy variable" values. In addition, a random fuzzy variable was proposed by Liu [124] as a fuzzy element taking "random variable" values. For example, it might be known that the lifetime of a modern engine is an exponentially distributed random variable with an unknown parameter. If the parameter is provided as a fuzzy variable, then the lifetime is a random fuzzy variable. More generally, a hybrid variable was introduced by Liu [130] as a tool to describe quantities with both fuzziness and randomness. Fuzzy random variables and random fuzzy variables are instances of hybrid variables. In order to measure hybrid events, a concept of chance measure was introduced by Li and Liu [103]. Chance theory is a hybrid of probability theory and credibility theory. Perhaps the reader would like to know what axioms we should assume for chance theory. In fact, chance theory will be based on the three axioms of probability and four axioms of credibility. The emphasis in this chapter is mainly on chance space, hybrid variable, chance measure, chance distribution, independence, identical distribution, expected value, variance, moments, critical values, entropy, distance, convergence almost surely, convergence in chance, convergence in mean, convergence in distribution, conditional chance, hybrid process, hybrid calculus, and hybrid differential equation.
3.1 Chance Space
Chance theory begins with the concept of chance space that inherits the mathematical foundations of both probability theory and credibility theory.
Definition 3.1 (Liu [130]) Suppose that (Θ, P, Cr) is a credibility space and (Ω, A, Pr) is a probability space. The product (Θ, P, Cr) × (Ω, A, Pr) is called a chance space.

The universal set Θ × Ω is clearly the set of all ordered pairs of the form (θ, ω), where θ ∈ Θ and ω ∈ Ω. What is the product σ-algebra P × A? What is the product measure Cr × Pr? Let us discuss these two basic problems.

What is the product σ-algebra P × A?

Generally speaking, it is not true that all subsets of Θ × Ω are measurable. Let Λ be a subset of Θ × Ω. Write

Λ(θ) = {ω ∈ Ω | (θ, ω) ∈ Λ}.    (3.1)

It is clear that Λ(θ) is a subset of Ω. If Λ(θ) ∈ A holds for each θ ∈ Θ, then Λ may be regarded as a measurable set.

Definition 3.2 (Liu [132]) Let (Θ, P, Cr) × (Ω, A, Pr) be a chance space. A subset Λ ⊂ Θ × Ω is called an event if Λ(θ) ∈ A for each θ ∈ Θ.

Example 3.1: The empty set ∅ and the universal set Θ × Ω are clearly events.

Example 3.2: Let X ∈ P and Y ∈ A. Then X × Y is a subset of Θ × Ω. Since the set

(X × Y)(θ) = Y, if θ ∈ X;  ∅, if θ ∈ Xc

is in the σ-algebra A for each θ ∈ Θ, the rectangle X × Y is an event.

Theorem 3.1 (Liu [132]) Let (Θ, P, Cr) × (Ω, A, Pr) be a chance space. The class of all events is a σ-algebra over Θ × Ω, denoted by P × A.

Proof: At first, it is obvious that Θ × Ω ∈ P × A. For any event Λ, we always have

Λ(θ) ∈ A,  ∀θ ∈ Θ.

Thus for each θ ∈ Θ, the set

Λc(θ) = {ω ∈ Ω | (θ, ω) ∈ Λc} = (Λ(θ))c ∈ A,

which implies that Λc ∈ P × A. Finally, let Λ1, Λ2, · · · be events. Then for each θ ∈ Θ, we have

(∪_{i=1}^∞ Λi)(θ) = {ω ∈ Ω | (θ, ω) ∈ ∪_{i=1}^∞ Λi} = ∪_{i=1}^∞ {ω ∈ Ω | (θ, ω) ∈ Λi} ∈ A.

That is, the countable union ∪i Λi ∈ P × A. Hence P × A is a σ-algebra.
Example 3.3: When Θ × Ω is countable, P × A is usually the power set.

Example 3.4: When Θ × Ω is uncountable, for example Θ × Ω = ℜ², P × A is usually a σ-algebra between the Borel algebra and the power set of ℜ². Let X be a nonempty Borel set and let Y be a non-Borel set of real numbers. It follows from X × Y ∉ P × A that P × A is smaller than the power set. It is also clear that Y × X ∈ P × A but Y × X is not a Borel set. Hence P × A is larger than the Borel algebra.

What is the product measure Cr × Pr?

Product probability is a probability measure, and product credibility is a credibility measure. What is the product measure Cr × Pr? We will call it chance measure and define it as follows.

Definition 3.3 (Li and Liu [103]) Let (Θ, P, Cr) × (Ω, A, Pr) be a chance space. Then the chance measure of an event Λ is defined as

Ch{Λ} =
  sup_{θ∈Θ} (Cr{θ} ∧ Pr{Λ(θ)}),       if sup_{θ∈Θ} (Cr{θ} ∧ Pr{Λ(θ)}) < 0.5    (3.2)
  1 − sup_{θ∈Θ} (Cr{θ} ∧ Pr{Λc(θ)}),  if sup_{θ∈Θ} (Cr{θ} ∧ Pr{Λ(θ)}) ≥ 0.5.
Example 3.5: Take a credibility space (Θ, P, Cr) to be {θ1 , θ2 } with Cr{θ1 } = 0.6 and Cr{θ2 } = 0.4, and take a probability space (Ω, A, Pr) to be {ω1 , ω2 } with Pr{ω1 } = 0.7 and Pr{ω2 } = 0.3. Then Ch{(θ1 , ω1 )} = 0.6,
Ch{(θ2 , ω2 )} = 0.3.
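For finite chance spaces like the one in Example 3.5, definition (3.2) can be evaluated directly. The following is a minimal sketch (the function name `chance` and the data layout are ours, not from the book):

```python
def chance(cr, pr, event):
    """Chance measure (3.2) on a finite chance space.
    cr: theta -> Cr{theta}; pr: omega -> Pr{omega}; event: set of (theta, omega)."""
    def sup(ev):
        # sup over theta of  Cr{theta} /\ Pr{section of ev at theta}
        return max(min(cr[t], sum(p for w, p in pr.items() if (t, w) in ev))
                   for t in cr)
    s = sup(event)
    if s < 0.5:
        return s
    universe = {(t, w) for t in cr for w in pr}
    return 1.0 - sup(universe - set(event))

cr = {"t1": 0.6, "t2": 0.4}
pr = {"w1": 0.7, "w2": 0.3}
print(chance(cr, pr, {("t1", "w1")}))   # → 0.6, as in Example 3.5
print(chance(cr, pr, {("t2", "w2")}))   # → 0.3
# self-duality: Ch{event} + Ch{complement} = 1
comp = {("t1", "w2"), ("t2", "w1"), ("t2", "w2")}
print(chance(cr, pr, {("t1", "w1")}) + chance(cr, pr, comp))  # → 1.0
```

The self-duality check previews the property proved later in this section for general events.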
Example 3.6: Take a credibility space (Θ, P, Cr) to be [0, 1] with Cr{θ} ≡ 1/2, and take a probability space (Ω, A, Pr) to be [0, 1] with Borel algebra and Lebesgue measure. Then for any real numbers a, b ∈ [0, 1], we have

Ch{[0, a] × [0, b]} =
  b,   if b < 0.5
  0.5, if 0.5 ≤ b < 1
  1,   if a = b = 1.

Theorem 3.2 Let (Θ, P, Cr) × (Ω, A, Pr) be a chance space and Ch a chance measure. Then we have

Ch{∅} = 0,    (3.3)
Ch{Θ × Ω} = 1,    (3.4)
0 ≤ Ch{Λ} ≤ 1    (3.5)

for any event Λ.
Proof: It follows from the definition immediately.

Theorem 3.3 Let (Θ, P, Cr) × (Ω, A, Pr) be a chance space and Ch a chance measure. Then for any event Λ, we have

sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) ∨ sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) ≥ 0.5,    (3.6)

sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) + sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) ≤ 1,    (3.7)

sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) ≤ Ch{Λ} ≤ 1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}).    (3.8)

Proof: It follows from the basic properties of probability and credibility that

sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) ∨ sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)})
  ≥ sup_{θ∈Θ}(Cr{θ} ∧ (Pr{Λ(θ)} ∨ Pr{Λc(θ)}))
  ≥ sup_{θ∈Θ} Cr{θ} ∧ 0.5 = 0.5

and

sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) + sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)})
  = sup_{θ1,θ2∈Θ}(Cr{θ1} ∧ Pr{Λ(θ1)} + Cr{θ2} ∧ Pr{Λc(θ2)})
  ≤ sup_{θ1≠θ2}(Cr{θ1} + Cr{θ2}) ∨ sup_{θ∈Θ}(Pr{Λ(θ)} + Pr{Λc(θ)})
  ≤ 1 ∨ 1 = 1.

The inequalities (3.8) follow immediately from the above inequalities and the definition of chance measure.

Theorem 3.4 (Li and Liu [103]) The chance measure is increasing. That is,

Ch{Λ1} ≤ Ch{Λ2}    (3.9)

for any events Λ1 and Λ2 with Λ1 ⊂ Λ2.

Proof: Since Λ1(θ) ⊂ Λ2(θ) and Λ2c(θ) ⊂ Λ1c(θ) for each θ ∈ Θ, we have

sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1(θ)}) ≤ sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2(θ)}),
sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2c(θ)}) ≤ sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1c(θ)}).

The argument breaks down into three cases.

Case 1: sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2(θ)}) < 0.5. For this case, we have sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1(θ)}) < 0.5, so

Ch{Λ2} = sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2(θ)}) ≥ sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1(θ)}) = Ch{Λ1}.

Case 2: sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2(θ)}) ≥ 0.5 and sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1(θ)}) < 0.5. It follows from Theorem 3.3 that

Ch{Λ2} ≥ sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2(θ)}) ≥ 0.5 > Ch{Λ1}.

Case 3: sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2(θ)}) ≥ 0.5 and sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1(θ)}) ≥ 0.5. For this case, we have

Ch{Λ2} = 1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2c(θ)}) ≥ 1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1c(θ)}) = Ch{Λ1}.
Thus Ch is an increasing measure.

Theorem 3.5 (Li and Liu [103]) The chance measure is self-dual. That is,

Ch{Λ} + Ch{Λc} = 1    (3.10)

for any event Λ.

Proof: For any event Λ, note that

Ch{Λc} =
  sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}),     if sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) < 0.5
  1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}),  if sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) ≥ 0.5.

The argument breaks down into three cases.

Case 1: sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) < 0.5. For this case, we have sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) ≥ 0.5, so

Ch{Λ} + Ch{Λc} = sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) + 1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) = 1.

Case 2: sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) ≥ 0.5 and sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) < 0.5. For this case, we have

Ch{Λ} + Ch{Λc} = 1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) + sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) = 1.

Case 3: sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) ≥ 0.5 and sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) ≥ 0.5. For this case, it follows from Theorem 3.3 that

sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ(θ)}) = sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λc(θ)}) = 0.5.

Hence Ch{Λ} + Ch{Λc} = 0.5 + 0.5 = 1. The theorem is proved.
Theorem 3.6 (Li and Liu [103]) For any event X × Y, we have

Ch{X × Y} = Cr{X} ∧ Pr{Y}.    (3.11)

Proof: The argument breaks down into three cases.

Case 1: Cr{X} < 0.5. For this case, we have

sup_{θ∈X} Cr{θ} ∧ Pr{Y} = Cr{X} ∧ Pr{Y} < 0.5,

Ch{X × Y} = sup_{θ∈X} Cr{θ} ∧ Pr{Y} = Cr{X} ∧ Pr{Y}.

Case 2: Cr{X} ≥ 0.5 and Pr{Y} < 0.5. Then we have

sup_{θ∈X} Cr{θ} ≥ 0.5,  sup_{θ∈X} Cr{θ} ∧ Pr{Y} = Pr{Y} < 0.5,

Ch{X × Y} = sup_{θ∈X} Cr{θ} ∧ Pr{Y} = Pr{Y} = Cr{X} ∧ Pr{Y}.

Case 3: Cr{X} ≥ 0.5 and Pr{Y} ≥ 0.5. Then we have

sup_{θ∈Θ}(Cr{θ} ∧ Pr{(X × Y)(θ)}) ≥ sup_{θ∈X} Cr{θ} ∧ Pr{Y} ≥ 0.5,

Ch{X × Y} = 1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{(X × Y)c(θ)}) = Cr{X} ∧ Pr{Y}.

The theorem is proved.

Example 3.7: It follows from Theorem 3.6 that for any events X × Ω and Θ × Y, we have

Ch{X × Ω} = Cr{X},  Ch{Θ × Y} = Pr{Y}.    (3.12)
Theorem 3.7 (Li and Liu [103], Chance Subadditivity Theorem) The chance measure is subadditive. That is,

Ch{Λ1 ∪ Λ2} ≤ Ch{Λ1} + Ch{Λ2}    (3.13)

for any events Λ1 and Λ2. In fact, the chance measure is not only finitely subadditive but also countably subadditive.

Proof: The proof breaks down into three cases.

Case 1: Ch{Λ1 ∪ Λ2} < 0.5. Then Ch{Λ1} < 0.5, Ch{Λ2} < 0.5 and

Ch{Λ1 ∪ Λ2} = sup_{θ∈Θ}(Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)(θ)})
  ≤ sup_{θ∈Θ}(Cr{θ} ∧ (Pr{Λ1(θ)} + Pr{Λ2(θ)}))
  ≤ sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1(θ)} + Cr{θ} ∧ Pr{Λ2(θ)})
  ≤ sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1(θ)}) + sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2(θ)})
  = Ch{Λ1} + Ch{Λ2}.

Case 2: Ch{Λ1 ∪ Λ2} ≥ 0.5 and Ch{Λ1} ∨ Ch{Λ2} < 0.5. We first have

sup_{θ∈Θ}(Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)(θ)}) ≥ 0.5.

For any sufficiently small number ε > 0, there exists a point θ such that

Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)(θ)} > 0.5 − ε > Ch{Λ1} ∨ Ch{Λ2},
Cr{θ} > 0.5 − ε > Pr{Λ1(θ)},
Cr{θ} > 0.5 − ε > Pr{Λ2(θ)}.

Thus we have

Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)c(θ)} + Cr{θ} ∧ Pr{Λ1(θ)} + Cr{θ} ∧ Pr{Λ2(θ)}
  = Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)c(θ)} + Pr{Λ1(θ)} + Pr{Λ2(θ)}
  ≥ Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)c(θ)} + Pr{(Λ1 ∪ Λ2)(θ)}
  ≥ 1 − 2ε,

because if Cr{θ} ≥ Pr{(Λ1 ∪ Λ2)c(θ)}, then

Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)c(θ)} + Pr{(Λ1 ∪ Λ2)(θ)} = Pr{(Λ1 ∪ Λ2)c(θ)} + Pr{(Λ1 ∪ Λ2)(θ)} = 1 ≥ 1 − 2ε,

and if Cr{θ} < Pr{(Λ1 ∪ Λ2)c(θ)}, then

Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)c(θ)} + Pr{(Λ1 ∪ Λ2)(θ)} = Cr{θ} + Pr{(Λ1 ∪ Λ2)(θ)} ≥ (0.5 − ε) + (0.5 − ε) = 1 − 2ε.

Taking the supremum on both sides and letting ε → 0, we obtain

Ch{Λ1 ∪ Λ2} = 1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)c(θ)})
  ≤ sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1(θ)}) + sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2(θ)})
  = Ch{Λ1} + Ch{Λ2}.

Case 3: Ch{Λ1 ∪ Λ2} ≥ 0.5 and Ch{Λ1} ∨ Ch{Λ2} ≥ 0.5. Without loss of generality, suppose Ch{Λ1} ≥ 0.5. For each θ, we first have

Cr{θ} ∧ Pr{Λ1c(θ)} = Cr{θ} ∧ Pr{(Λ1c(θ) ∩ Λ2c(θ)) ∪ (Λ1c(θ) ∩ Λ2(θ))}
  ≤ Cr{θ} ∧ (Pr{(Λ1 ∪ Λ2)c(θ)} + Pr{Λ2(θ)})
  ≤ Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)c(θ)} + Cr{θ} ∧ Pr{Λ2(θ)},

i.e.,

Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)c(θ)} ≥ Cr{θ} ∧ Pr{Λ1c(θ)} − Cr{θ} ∧ Pr{Λ2(θ)}.

It follows from Theorem 3.3 that

Ch{Λ1 ∪ Λ2} = 1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{(Λ1 ∪ Λ2)c(θ)})
  ≤ 1 − sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ1c(θ)}) + sup_{θ∈Θ}(Cr{θ} ∧ Pr{Λ2(θ)})
  ≤ Ch{Λ1} + Ch{Λ2}.

The theorem is proved.

Remark 3.1: For any events Λ1 and Λ2, it follows from the chance subadditivity theorem that the chance measure is null-additive, i.e., Ch{Λ1 ∪ Λ2} = Ch{Λ1} + Ch{Λ2} if either Ch{Λ1} = 0 or Ch{Λ2} = 0.

Theorem 3.8 Let {Λi} be a decreasing sequence of events with Ch{Λi} → 0 as i → ∞. Then for any event Λ, we have

lim_{i→∞} Ch{Λ ∪ Λi} = lim_{i→∞} Ch{Λ\Λi} = Ch{Λ}.    (3.14)
Proof: Since chance measure is increasing and subadditive, we immediately have Ch{Λ} ≤ Ch{Λ ∪ Λi} ≤ Ch{Λ} + Ch{Λi} for each i. Thus we get Ch{Λ ∪ Λi} → Ch{Λ} by using Ch{Λi} → 0. Since (Λ\Λi) ⊂ Λ ⊂ ((Λ\Λi) ∪ Λi), we have Ch{Λ\Λi} ≤ Ch{Λ} ≤ Ch{Λ\Λi} + Ch{Λi}. Hence Ch{Λ\Λi} → Ch{Λ} by using Ch{Λi} → 0.

Theorem 3.9 (Li and Liu [103], Chance Semicontinuity Law) For events Λ1, Λ2, · · · , we have

lim_{i→∞} Ch{Λi} = Ch{ lim_{i→∞} Λi }    (3.15)
if one of the following conditions is satisfied: (a) Ch{Λ} ≤ 0.5 and Λi ↑ Λ; (b) lim Ch{Λi } < 0.5 and Λi ↑ Λ; i→∞
(c) Ch{Λ} ≥ 0.5 and Λi ↓ Λ;
(d) lim Ch{Λi } > 0.5 and Λi ↓ Λ. i→∞
Proof: (a) Assume Ch{Λ} ≤ 0.5 and Λi ↑ Λ. We first have Ch{Λ} = sup(Cr{θ} ∧ Pr{Λ(θ)}), θ∈Θ
Ch{Λi } = sup(Cr{θ} ∧ Pr{Λi (θ)}) θ∈Θ
for i = 1, 2, · · · For each θ ∈ Θ, since Λi (θ) ↑ Λ(θ), it follows from the probability continuity theorem that lim Cr{θ} ∧ Pr{Λi (θ)} = Cr{θ} ∧ Pr{Λ(θ)}.
i→∞
137
Section 3.2 - Hybrid Variables
Taking supremum on both sides, we obtain lim sup(Cr{θ} ∧ Pr{Λi (θ)}) = sup(Cr{θ} ∧ Pr{Λ(θ}).
i→∞ θ∈Θ
θ∈Θ
The part (a) is verified. (b) Assume limi→∞ Ch{Λi } < 0.5 and Λi ↑ Λ. For each θ ∈ Θ, since Cr{θ} ∧ Pr{Λ(θ)} = lim Cr{θ} ∧ Pr{Λi (θ)}, i→∞
we have sup(Cr{θ} ∧ Pr{Λ(θ)}) ≤ lim sup(Cr{θ} ∧ Pr{Λi (θ)}) < 0.5. i→∞ θ∈Θ
θ∈Θ
It follows that Ch{Λ} < 0.5 and the part (b) holds by using (a). (c) Assume Ch{Λ} ≥ 0.5 and Λi ↓ Λ. We have Ch{Λc } ≤ 0.5 and Λci ↑ Λc . It follows from (a) that lim Ch{Λi } = 1 − lim Ch{Λci } = 1 − Ch{Λc } = Ch{Λ}.
i→∞
i→∞
(d) Assume limi→∞ Ch{Λi } > 0.5 and Λi ↓ Λ. We have lim Ch{Λci } < i→∞
0.5 and Λci ↑ Λc . It follows from (b) that
lim Ch{Λi } = 1 − lim Ch{Λci } = 1 − Ch{Λc } = Ch{Λ}.
i→∞
i→∞
The theorem is proved. Theorem 3.10 (Chance Asymptotic Theorem) For any events Λ1 , Λ2 , · · · , we have lim Ch{Λi } ≥ 0.5, if Λi ↑ Θ × Ω, (3.16) i→∞
lim Ch{Λi } ≤ 0.5,
if Λi ↓ ∅.
(3.17)
Proof: Assume Λi ↑ Θ × Ω. If limi→∞ Ch{Λi } < 0.5, it follows from the chance semicontinuity law that Ch{Θ × Ω} = lim Ch{Λi } < 0.5 i→∞
which is in contradiction with Cr{Θ × Ω} = 1. The first inequality is proved. The second one may be verified similarly.
3.2
Hybrid Variables
Recall that a random variable is a measurable function from a probability space to the set of real numbers, and a fuzzy variable is a function from a credibility space to the set of real numbers. In order to describe a quantity with both fuzziness and randomness, we introduce a concept of hybrid variable as follows.
138
Chapter 3 - Chance Theory ............................................................................... ........... ............... .......... ........ ........ ....... ....... ...... . . . . .... ..... ... . ... .... ... .. ... .. .. .... . . . . . . ....... ..... .. . . ........ . . . ...... ........... . . . . . ... ...... ....... .. .... .... .......... .......................... .. .......... ..... ..................................................................................... ... ... .. .. ... . . ... ... .... ... . .. ......................................................... ......................................................... . . ... ... ... . .. ... . ... ... . . ... ... ........ ... . ... .. . ... . ... ... . . ..... ... ... . . . . .. . . . ... ... . . ..... ... .... ... .. ... . . . . . ... ... . . . . ..... ... .... ... .. ... . . . . ........ . . .... . ..... ..................................................................... ... . ............................................................. . . . . . . ..... ... ... . . . .. . . . . . . . ..... ... ..... ..... ... ... ..... ..... ... ... ..... ..... ... . ... ..... ... .................................................................................................................... .. . ... .. .. . . ... .. .. . . ... .. .. ... . . .. ... .. . . ..... ... . .. . . . . ... ... ........ . . .. . . . . ...... ... ..... . .. . . . . . . . ...... ... ............................................ .............................. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ......... ...... ............. ..... ... . ......... . . . . . . . . . . . . . . . . . . . . . . . . ...... ... ........ ...... .... .... .... ... . . . . . . . . ..... . . . . . . ... . . . . . . . ........... ......... .. ..... .. ... ... ..... ... .............. .. ... ...... ...... . . ... . ... . . . . . . .... . . ... ... .. .... . .. . . ... ... . . . .... . . .. ... .. ... ... ... . . ... . ... . . ... 
. . . . ... ... . . . . . . . ..... . . . ..... . ..... ..... ...... ..... ...... ..... ..... ....... ........ ...... ....... ......... ....... ........... ......... ................ ............................................ ...............................
Set of Real Numbers • • •
Fuzzy Variable
Hybrid Variable
• • Credibility Space
Random Variable
• • Probability Space
Figure 3.1: Graphical Representation of Hybrid Variable Definition 3.4 (Liu [130]) A hybrid variable is a measurable function from a chance space (Θ, P, Cr) × (Ω, A, Pr) to the set of real numbers, i.e., for any Borel set B of real numbers, the set {ξ ∈ B} = {(θ, ω) ∈ Θ × Ω ξ(θ, ω) ∈ B} (3.18) is an event. Remark 3.2: A hybrid variable degenerates to a fuzzy variable if the value of ξ(θ, ω) does not vary with ω. For example, ξ(θ, ω) = θ,
ξ(θ, ω) = θ2 + 1,
ξ(θ, ω) = sin θ.
Remark 3.3: A hybrid variable degenerates to a random variable if the value of ξ(θ, ω) does not vary with θ. For example, ξ(θ, ω) = ω,
ξ(θ, ω) = ω 2 + 1,
ξ(θ, ω) = sin ω.
Remark 3.4: For each fixed θ ∈ Θ, it is clear that the hybrid variable ξ(θ, ω) is a measurable function from the probability space (Ω, A, Pr) to the set of real numbers. Thus it is a random variable and we will denote it by ξ(θ, ·). Then a hybrid variable ξ(θ, ω) may also be regarded as a function from a credibility space (Θ, P, Cr) to the set {ξ(θ, ·)|θ ∈ Θ} of random variables. Thus ξ is a random fuzzy variable defined by Liu [124]. Remark 3.5: For each fixed ω ∈ Ω, it is clear that the hybrid variable ξ(θ, ω) is a function from the credibility space (Θ, P, Cr) to the set of real
139
Section 3.2 - Hybrid Variables
numbers. Thus it is a fuzzy variable and we will denote it by ξ(·, ω). Then a hybrid variable ξ(θ, ω) may be regarded as a function from a probability space (Ω, A, Pr) to the set {ξ(·, ω)|ω ∈ Ω} of fuzzy variables. If Cr{ξ(·, ω) ∈ B} is a measurable function of ω for any Borel set B of real numbers, then ξ is a fuzzy random variable in the sense of Liu and Liu [140]. Model I If a ˜ is a fuzzy variable and η is a random variable, then the sum ξ = a ˜ + η is a hybrid variable. The product ξ = a ˜ · η is also a hybrid variable. Generally speaking, if f : <2 → < is a measurable function, then ξ = f (˜ a, η)
(3.19)
is a hybrid variable. Suppose that a ˜ has a membership function µ, and η has a probability density function φ. Then for any Borel set B of real numbers, Qin and Liu [195] proved the following formula, ! Z µ(x) sup ∧ φ(y)dy , 2 x f (x,y)∈B ! Z µ(x) if sup ∧ φ(y)dy < 0.5 2 x f (x,y)∈B Ch{f (˜ a, η) ∈ B} = ! Z µ(x) ∧ φ(y)dy , 1 − sup 2 x f (x,y)∈B c ! Z µ(x) ∧ φ(y)dy ≥ 0.5. if sup 2 x f (x,y)∈B More generally, let a ˜1 , a ˜2 , · · · , a ˜m be fuzzy variables, and let η1 , η2 , · · · , ηn be random variables. If f : <m+n → < is a measurable function, then ξ = f (˜ a1 , a ˜2 , · · · , a ˜m ; η1 , η2 , · · · , ηn )
(3.20)
is a hybrid variable. The chance Ch{f (˜ a1 , a ˜2 , · · · , a ˜m ; η1 , η2 , · · · , ηn ) ∈ B} may be calculated in a similar way provided that µ is the joint membership function and φ is the joint probability density function. Model II Let a ˜1 , a ˜2 , · · · , a ˜m be fuzzy variables, and let p1 , p2 , · · · , pm be nonnegative numbers with p1 + p2 + · · · + pm = 1. Then a ˜1 with probability p1 a ˜2 with probability p2 ξ= (3.21) · ·· a ˜m with probability pm
140
Chapter 3 - Chance Theory
is clearly a hybrid variable. If a ˜1 , a ˜2 , · · · , a ˜m have membership functions µ1 , µ2 , · · · , µm , respectively, then for any Borel set B of real numbers, we have ([195]) ! X m µ (x ) i i ∧ {pi | xi ∈ B} , sup min 1≤i≤m 2 x1 ,x2 ··· ,xm i=1 ! X m µ (x ) i i if sup min ∧ {pi | xi ∈ B} < 0.5 1≤i≤m 2 x1 ,x2 ··· ,xm i=1 Ch{ξ ∈ B} = ! X m µi (xi ) c ∧ {pi | xi ∈ B } , 1− sup min 1≤i≤m 2 x1 ,x2 ··· ,xm i=1 ! X m µ (x ) i i ∧ {pi | xi ∈ B} ≥ 0.5. min if x ,xsup 1≤i≤m 2 1 2 ··· ,xm i=1 Model III Let η1 , η2 , · · · , ηm be random variables, and let u1 , u2 , · · · , um be nonnegative numbers with u1 ∨ u2 ∨ · · · ∨ um = 1. Then η1 with membership degree u1 η2 with membership degree u2 (3.22) ξ= ··· ηm with membership degree um is clearly a hybrid variable. If η1 , η2 , · · · , ηm have probability density functions φ1 , φ2 , · · · , φm , respectively, then for any Borel set B of real numbers, we have ([195]) Z ui max ∧ φ (x)dx , i 1≤i≤m 2 B Z ui if max ∧ φ (x)dx < 0.5 i 1≤i≤m 2 B Ch{ξ ∈ B} = Z ui 1 − max ∧ φ (x)dx , i 1≤i≤m 2 Bc Z ui ∧ φi (x)dx ≥ 0.5. if max 1≤i≤m 2 B Model IV In many statistics problems, the probability density function is completely known except for the values of one or more parameters. For example, it
Section 3.2 - Hybrid Variables
might be known that the lifetime ξ of a modern engine is an exponentially distributed random variable with an unknown expected value β. Usually there is some relevant information in practice, so it is possible to specify an interval in which the value of β is likely to lie, or to give an approximate estimate of β; it is typically not possible to determine the value of β exactly. If the value of β is provided as a fuzzy variable, then ξ is a hybrid variable. More generally, suppose that ξ has a probability density function

φ(x; ã₁, ã₂, ···, ã_m),  x ∈ ℜ   (3.23)

in which the parameters ã₁, ã₂, ···, ã_m are fuzzy variables rather than crisp numbers. Then ξ is a hybrid variable provided that φ(x; y₁, y₂, ···, y_m) is a probability density function for any (y₁, y₂, ···, y_m) that (ã₁, ã₂, ···, ã_m) may take. If ã₁, ã₂, ···, ã_m have membership functions µ₁, µ₂, ···, µ_m, respectively, then for any Borel set B of real numbers, the chance Ch{ξ ∈ B} is ([195])

Ch{ξ ∈ B} =
    sup_{y₁,···,y_m} ( min_{1≤i≤m} µᵢ(yᵢ)/2 ∧ ∫_B φ(x; y₁, ···, y_m)dx ),
        if sup_{y₁,···,y_m} ( min_{1≤i≤m} µᵢ(yᵢ)/2 ∧ ∫_B φ(x; y₁, ···, y_m)dx ) < 0.5;
    1 − sup_{y₁,···,y_m} ( min_{1≤i≤m} µᵢ(yᵢ)/2 ∧ ∫_{B^c} φ(x; y₁, ···, y_m)dx ),
        if sup_{y₁,···,y_m} ( min_{1≤i≤m} µᵢ(yᵢ)/2 ∧ ∫_B φ(x; y₁, ···, y_m)dx ) ≥ 0.5.

Model V

Suppose a fuzzy variable ξ has a normal membership function with unknown expected value e and variance σ. If e and σ are provided as random variables, then ξ is a hybrid variable. More generally, suppose that ξ has a membership function

µ(x; η₁, η₂, ···, η_m),  x ∈ ℜ   (3.24)

in which the parameters η₁, η₂, ···, η_m are random variables rather than deterministic numbers. Then ξ is a hybrid variable if µ(x; y₁, y₂, ···, y_m) is a membership function for any (y₁, y₂, ···, y_m) that (η₁, η₂, ···, η_m) may take.

When are two hybrid variables equal to each other?
Definition 3.5 Let ξ1 and ξ2 be hybrid variables defined on the chance space (Θ, P, Cr) × (Ω, A, Pr). We say ξ1 = ξ2 if ξ1 (θ, ω) = ξ2 (θ, ω) for almost all (θ, ω) ∈ Θ × Ω.
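As a numerical illustration of the Model III formula above, the sketch below evaluates Ch{ξ ∈ [a, b]} for a hypothetical mixture of two normal random variables with membership degrees u₁ and u₂. The example data are assumed, not taken from the book; the probabilities ∫_B φᵢ(x)dx are computed with the standard normal CDF via `math.erf`.

```python
import math

def normal_cdf(x, mu, sigma):
    # P{N(mu, sigma^2) <= x} via the error function
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def chance_model3(params, degrees, a, b):
    """Ch{xi in [a, b]} for Model III: xi = eta_i with membership degree u_i,
    where eta_i ~ N(mu_i, sigma_i^2).  Implements the two-case formula:
    max_i (u_i/2 ^ P{eta_i in B}), dualized via B^c when it reaches 0.5."""
    def prob(mu, s):                      # P{eta_i in [a, b]}
        return normal_cdf(b, mu, s) - normal_cdf(a, mu, s)
    inside = max(min(u / 2.0, prob(mu, s)) for (mu, s), u in zip(params, degrees))
    if inside < 0.5:
        return inside
    outside = max(min(u / 2.0, 1.0 - prob(mu, s))
                  for (mu, s), u in zip(params, degrees))
    return 1.0 - outside

# eta_1 ~ N(0, 1) with degree 1.0, eta_2 ~ N(3, 1) with degree 0.4
ch = chance_model3([(0.0, 1.0), (3.0, 1.0)], [1.0, 0.4], -1.0, 1.0)
```

In this instance the first branch caps at 0.5, so the dual branch applies and the result agrees with P{N(0,1) ∈ [−1, 1]} ≈ 0.68 for the fully credible component.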
Hybrid Vectors

Definition 3.6 An n-dimensional hybrid vector is a measurable function from a chance space (Θ, P, Cr) × (Ω, A, Pr) to the set of n-dimensional real vectors, i.e., for any Borel set B of ℜⁿ, the set

{(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ∈ B}

is an event. Next, consider the class

B = {B ⊂ ℜⁿ | {(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ∈ B} is an event},

which is a σ-algebra of ℜⁿ: (i) ℜⁿ ∈ B; (ii) if B ∈ B, then {(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ∈ B} is an event, and so is its complement

{(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ∈ B^c}.

This means that B^c ∈ B; (iii) if Bᵢ ∈ B for i = 1, 2, ···, then {(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ∈ Bᵢ} are events and

{(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ∈ ∪_{i=1}^{∞} Bᵢ} = ∪_{i=1}^{∞} {(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ∈ Bᵢ}

is an event. This means that ∪ᵢ Bᵢ ∈ B. Since the smallest σ-algebra containing all open intervals of ℜⁿ is just the Borel algebra, the class B contains all Borel sets of ℜⁿ.
143
Section 3.3 - Chance Distribution
Hybrid Arithmetic

Definition 3.7 Let f: ℜⁿ → ℜ be a measurable function, and ξ₁, ξ₂, ···, ξₙ hybrid variables on the chance space (Θ, P, Cr) × (Ω, A, Pr). Then ξ = f(ξ₁, ξ₂, ···, ξₙ) is a hybrid variable defined by

ξ(θ, ω) = f(ξ₁(θ, ω), ξ₂(θ, ω), ···, ξₙ(θ, ω)),  ∀(θ, ω) ∈ Θ × Ω.   (3.26)
Example 3.8: Let ξ₁ and ξ₂ be two hybrid variables. Then the sum ξ = ξ₁ + ξ₂ is a hybrid variable defined by

ξ(θ, ω) = ξ₁(θ, ω) + ξ₂(θ, ω),  ∀(θ, ω) ∈ Θ × Ω.

The product ξ = ξ₁ξ₂ is also a hybrid variable defined by

ξ(θ, ω) = ξ₁(θ, ω) · ξ₂(θ, ω),  ∀(θ, ω) ∈ Θ × Ω.
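In code, Definition 3.7 and Example 3.8 amount to pointwise operations on functions of (θ, ω). A minimal sketch with illustrative (assumed) variables follows:

```python
# Hybrid arithmetic sketch: hybrid variables are functions of (theta, omega),
# and sums/products are defined pointwise (Definition 3.7 / Example 3.8).
def hv_add(x1, x2):
    return lambda theta, omega: x1(theta, omega) + x2(theta, omega)

def hv_mul(x1, x2):
    return lambda theta, omega: x1(theta, omega) * x2(theta, omega)

a_tilde = lambda theta, omega: theta        # depends only on the fuzzy part
eta     = lambda theta, omega: 2.0 * omega  # depends only on the random part

s = hv_add(a_tilde, eta)(0.5, 1.0)   # 0.5 + 2.0 = 2.5
p = hv_mul(a_tilde, eta)(0.5, 1.0)   # 0.5 * 2.0 = 1.0
```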
Theorem 3.12 Let ξ be an n-dimensional hybrid vector, and f: ℜⁿ → ℜ a measurable function. Then f(ξ) is a hybrid variable.
3.3 Chance Distribution
Chance distribution has been defined in several ways. Yang and Liu [240] presented the concept of chance distribution of fuzzy random variables, and Zhu and Liu [261] proposed the chance distribution of random fuzzy variables. Li and Liu [103] gave the following definition of chance distribution of hybrid variables.

Definition 3.8 The chance distribution Φ: ℜ → [0, 1] of a hybrid variable ξ is defined by

Φ(x) = Ch{(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ≤ x}.   (3.27)

Example 3.9: Let η be a random variable on a probability space (Ω, A, Pr). It is clear that η may be regarded as a hybrid variable on the chance space (Θ, P, Cr) × (Ω, A, Pr) as follows,

ξ(θ, ω) = η(ω),  ∀(θ, ω) ∈ Θ × Ω.
Thus its chance distribution is

Φ(x) = Ch{(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ≤ x} = Ch{Θ × {ω ∈ Ω | η(ω) ≤ x}} = Cr{Θ} ∧ Pr{ω ∈ Ω | η(ω) ≤ x} = Pr{η ≤ x},

which is just the probability distribution of the random variable η.

Example 3.10: Let ã be a fuzzy variable on a credibility space (Θ, P, Cr). It is clear that ã may be regarded as a hybrid variable on the chance space (Θ, P, Cr) × (Ω, A, Pr) as follows,

ξ(θ, ω) = ã(θ),  ∀(θ, ω) ∈ Θ × Ω.
Thus its chance distribution is

Φ(x) = Ch{(θ, ω) ∈ Θ × Ω | ξ(θ, ω) ≤ x} = Ch{{θ ∈ Θ | ã(θ) ≤ x} × Ω} = Cr{θ ∈ Θ | ã(θ) ≤ x} ∧ Pr{Ω} = Cr{ã ≤ x},

which is just the credibility distribution of the fuzzy variable ã.

Theorem 3.13 (Sufficient and Necessary Condition for Chance Distribution) A function Φ: ℜ → [0, 1] is a chance distribution if and only if it is an increasing function with

lim_{x→−∞} Φ(x) ≤ 0.5 ≤ lim_{x→+∞} Φ(x),   (3.28)

lim_{y↓x} Φ(y) = Φ(x) if lim_{y↓x} Φ(y) > 0.5 or Φ(x) ≥ 0.5.   (3.29)
Proof: It is obvious that a chance distribution Φ is an increasing function. The inequalities (3.28) follow immediately from the chance asymptotic theorem. Assume that x is a point at which lim_{y↓x} Φ(y) > 0.5, that is,

lim_{y↓x} Ch{ξ ≤ y} > 0.5.

Since {ξ ≤ y} ↓ {ξ ≤ x} as y ↓ x, it follows from the chance semicontinuity law that

Φ(y) = Ch{ξ ≤ y} ↓ Ch{ξ ≤ x} = Φ(x) as y ↓ x.

When x is a point at which Φ(x) ≥ 0.5, if lim_{y↓x} Φ(y) ≠ Φ(x), then we have

lim_{y↓x} Φ(y) > Φ(x) ≥ 0.5.
For this case, we have just proved that lim_{y↓x} Φ(y) = Φ(x). Thus (3.29) holds. Conversely, suppose Φ: ℜ → [0, 1] is an increasing function satisfying (3.28) and (3.29). Theorem 2.19 states that there is a fuzzy variable whose credibility distribution is just Φ(x). Since a fuzzy variable is a special hybrid variable, the theorem is proved.

Definition 3.9 The chance density function φ: ℜ → [0, +∞) of a hybrid variable ξ is a function such that

Φ(x) = ∫_{−∞}^{x} φ(y)dy,  ∀x ∈ ℜ,   (3.30)

∫_{−∞}^{+∞} φ(y)dy = 1   (3.31)
where Φ is the chance distribution of ξ.

Theorem 3.14 Let ξ be a hybrid variable whose chance density function φ exists. Then we have

Ch{ξ ≤ x} = ∫_{−∞}^{x} φ(y)dy,  Ch{ξ ≥ x} = ∫_{x}^{+∞} φ(y)dy.   (3.32)

Proof: The first part follows immediately from the definition. In addition, by the self-duality of chance measure, we have

Ch{ξ ≥ x} = 1 − Ch{ξ < x} = ∫_{−∞}^{+∞} φ(y)dy − ∫_{−∞}^{x} φ(y)dy = ∫_{x}^{+∞} φ(y)dy.

The theorem is proved.

Joint Chance Distribution

Definition 3.10 Let (ξ₁, ξ₂, ···, ξₙ) be a hybrid vector. Then the joint chance distribution Φ: ℜⁿ → [0, 1] is defined by

Φ(x₁, x₂, ···, xₙ) = Ch{ξ₁ ≤ x₁, ξ₂ ≤ x₂, ···, ξₙ ≤ xₙ}.

Definition 3.11 The joint chance density function φ: ℜⁿ → [0, +∞) of a hybrid vector (ξ₁, ξ₂, ···, ξₙ) is a function such that

Φ(x₁, x₂, ···, xₙ) = ∫_{−∞}^{x₁} ∫_{−∞}^{x₂} ··· ∫_{−∞}^{xₙ} φ(y₁, y₂, ···, yₙ) dy₁dy₂···dyₙ

holds for all (x₁, x₂, ···, xₙ) ∈ ℜⁿ, and

∫_{−∞}^{+∞} ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} φ(y₁, y₂, ···, yₙ) dy₁dy₂···dyₙ = 1,

where Φ is the joint chance distribution of the hybrid vector (ξ₁, ξ₂, ···, ξₙ).
3.4 Expected Value
Expected value has been defined in several ways. For example, Kwakernaak [87], Puri and Ralescu [193], Kruse and Meyer [86], and Liu and Liu [140] gave different expected value operators of fuzzy random variables. Liu and Liu [141] presented an expected value operator of random fuzzy variables. Li and Liu [103] suggested the following definition of expected value operator of hybrid variables.

Definition 3.12 Let ξ be a hybrid variable. Then the expected value of ξ is defined by

E[ξ] = ∫_{0}^{+∞} Ch{ξ ≥ r}dr − ∫_{−∞}^{0} Ch{ξ ≤ r}dr   (3.33)

provided that at least one of the two integrals is finite.

Example 3.11: If a hybrid variable ξ degenerates to a random variable η, then

Ch{ξ ≤ x} = Pr{η ≤ x},  Ch{ξ ≥ x} = Pr{η ≥ x},  ∀x ∈ ℜ.

It follows from (3.33) that E[ξ] = E[η]. In other words, the expected value operator of hybrid variables coincides with that of random variables.

Example 3.12: If a hybrid variable ξ degenerates to a fuzzy variable ã, then

Ch{ξ ≤ x} = Cr{ã ≤ x},  Ch{ξ ≥ x} = Cr{ã ≥ x},  ∀x ∈ ℜ.

It follows from (3.33) that E[ξ] = E[ã]. In other words, the expected value operator of hybrid variables coincides with that of fuzzy variables.

Example 3.13: Let ã be a fuzzy variable and η a random variable with finite expected values. Then the hybrid variable ξ = ã + η has expected value E[ξ] = E[ã] + E[η].

Theorem 3.15 Let ξ be a hybrid variable whose chance density function φ exists. If the Lebesgue integral

∫_{−∞}^{+∞} xφ(x)dx

is finite, then we have

E[ξ] = ∫_{−∞}^{+∞} xφ(x)dx.   (3.34)
Proof: It follows from the definition of expected value operator and the Fubini theorem that

E[ξ] = ∫_{0}^{+∞} Ch{ξ ≥ r}dr − ∫_{−∞}^{0} Ch{ξ ≤ r}dr
     = ∫_{0}^{+∞} ( ∫_{r}^{+∞} φ(x)dx ) dr − ∫_{−∞}^{0} ( ∫_{−∞}^{r} φ(x)dx ) dr
     = ∫_{0}^{+∞} ( ∫_{0}^{x} φ(x)dr ) dx − ∫_{−∞}^{0} ( ∫_{x}^{0} φ(x)dr ) dx
     = ∫_{0}^{+∞} xφ(x)dx + ∫_{−∞}^{0} xφ(x)dx
     = ∫_{−∞}^{+∞} xφ(x)dx.
The theorem is proved.

Theorem 3.16 Let ξ be a hybrid variable with chance distribution Φ. If

lim_{x→−∞} Φ(x) = 0,  lim_{x→+∞} Φ(x) = 1

and the Lebesgue-Stieltjes integral

∫_{−∞}^{+∞} x dΦ(x)

is finite, then we have

E[ξ] = ∫_{−∞}^{+∞} x dΦ(x).   (3.35)

Proof: Since the Lebesgue-Stieltjes integral ∫_{−∞}^{+∞} x dΦ(x) is finite, we immediately have

lim_{y→+∞} ∫_{0}^{y} x dΦ(x) = ∫_{0}^{+∞} x dΦ(x),  lim_{y→−∞} ∫_{y}^{0} x dΦ(x) = ∫_{−∞}^{0} x dΦ(x)

and

lim_{y→+∞} ∫_{y}^{+∞} x dΦ(x) = 0,  lim_{y→−∞} ∫_{−∞}^{y} x dΦ(x) = 0.

It follows from

∫_{y}^{+∞} x dΦ(x) ≥ y ( lim_{z→+∞} Φ(z) − Φ(y) ) = y(1 − Φ(y)) ≥ 0  for y > 0,

∫_{−∞}^{y} x dΦ(x) ≤ y ( Φ(y) − lim_{z→−∞} Φ(z) ) = yΦ(y) ≤ 0  for y < 0

that

lim_{y→+∞} y(1 − Φ(y)) = 0,  lim_{y→−∞} yΦ(y) = 0.

Let 0 = x₀ < x₁ < x₂ < ··· < xₙ = y be a partition of [0, y]. Then we have

Σ_{i=0}^{n−1} xᵢ (Φ(xᵢ₊₁) − Φ(xᵢ)) → ∫_{0}^{y} x dΦ(x)

and

Σ_{i=0}^{n−1} (1 − Φ(xᵢ₊₁))(xᵢ₊₁ − xᵢ) → ∫_{0}^{y} Ch{ξ ≥ r}dr

as max{|xᵢ₊₁ − xᵢ| : i = 0, 1, ···, n − 1} → 0. Since

Σ_{i=0}^{n−1} xᵢ (Φ(xᵢ₊₁) − Φ(xᵢ)) − Σ_{i=0}^{n−1} (1 − Φ(xᵢ₊₁))(xᵢ₊₁ − xᵢ) = y(Φ(y) − 1) → 0

as y → +∞, this fact implies that

∫_{0}^{+∞} Ch{ξ ≥ r}dr = ∫_{0}^{+∞} x dΦ(x).

A similar argument proves that

−∫_{−∞}^{0} Ch{ξ ≤ r}dr = ∫_{−∞}^{0} x dΦ(x).

It follows that equation (3.35) holds.

Theorem 3.17 Let ξ be a hybrid variable with finite expected value. Then for any real numbers a and b, we have

E[aξ + b] = aE[ξ] + b.   (3.36)
Proof: Step 1: We first prove that E[ξ + b] = E[ξ] + b for any real number b. If b ≥ 0, we have

E[ξ + b] = ∫_{0}^{+∞} Ch{ξ + b ≥ r}dr − ∫_{−∞}^{0} Ch{ξ + b ≤ r}dr
         = ∫_{0}^{+∞} Ch{ξ ≥ r − b}dr − ∫_{−∞}^{0} Ch{ξ ≤ r − b}dr
         = E[ξ] + ∫_{0}^{b} (Ch{ξ ≥ r − b} + Ch{ξ < r − b})dr
         = E[ξ] + b.

If b < 0, then we have

E[ξ + b] = E[ξ] − ∫_{b}^{0} (Ch{ξ ≥ r − b} + Ch{ξ < r − b})dr = E[ξ] + b.

Step 2: We prove E[aξ] = aE[ξ]. If a = 0, the equation holds trivially. If a > 0, we have

E[aξ] = ∫_{0}^{+∞} Ch{aξ ≥ r}dr − ∫_{−∞}^{0} Ch{aξ ≤ r}dr
      = ∫_{0}^{+∞} Ch{ξ ≥ r/a}dr − ∫_{−∞}^{0} Ch{ξ ≤ r/a}dr
      = a ∫_{0}^{+∞} Ch{ξ ≥ t}dt − a ∫_{−∞}^{0} Ch{ξ ≤ t}dt
      = aE[ξ].

If a < 0, we have

E[aξ] = ∫_{0}^{+∞} Ch{aξ ≥ r}dr − ∫_{−∞}^{0} Ch{aξ ≤ r}dr
      = ∫_{0}^{+∞} Ch{ξ ≤ r/a}dr − ∫_{−∞}^{0} Ch{ξ ≥ r/a}dr
      = a ∫_{0}^{+∞} Ch{ξ ≥ t}dt − a ∫_{−∞}^{0} Ch{ξ ≤ t}dt
      = aE[ξ].

Step 3: For any real numbers a and b, it follows from Steps 1 and 2 that

E[aξ + b] = E[aξ] + b = aE[ξ] + b.

The theorem is proved.
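Definition (3.33) and Theorem 3.17 can be checked numerically for a hybrid variable that degenerates to a random variable, so that Ch = Pr. The sketch below takes η exponential with mean 2 and verifies E[3ξ + 1] = 3E[ξ] + 1; the integration limits and grid sizes are arbitrary choices, not part of the theory.

```python
import math

def expected_value(ch_ge, ch_le, hi=60.0, n=200000):
    """E[xi] via definition (3.33): int_0^inf Ch{xi >= r} dr
    minus int_{-inf}^0 Ch{xi <= r} dr, midpoint rule on [-hi, hi]."""
    h = hi / n
    pos = sum(ch_ge((k + 0.5) * h) for k in range(n)) * h
    neg = sum(ch_le(-(k + 0.5) * h) for k in range(n)) * h
    return pos - neg

# degenerate hybrid variable xi = eta ~ Exp(mean 2): Ch{xi >= r} = e^{-r/2}
e_xi = expected_value(lambda r: math.exp(-r / 2.0), lambda r: 0.0)

# Theorem 3.17: E[3*xi + 1] = 3*E[xi] + 1.  For this xi and r > 0,
# Ch{3*xi + 1 >= r} = 1 if r <= 1, else exp(-(r - 1)/6); the negative part is 0.
e_lin = expected_value(
    lambda r: 1.0 if r <= 1.0 else math.exp(-(r - 1.0) / 6.0),
    lambda r: 0.0, hi=200.0, n=400000)
```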
3.5 Variance

Variance has been defined in different ways. Liu and Liu [140][141] proposed the variance definitions of fuzzy random variables and random fuzzy variables. Li and Liu [103] suggested the following variance definition of hybrid variables.

Definition 3.13 Let ξ be a hybrid variable with finite expected value e. Then the variance of ξ is defined by V[ξ] = E[(ξ − e)²].

Theorem 3.18 If ξ is a hybrid variable with finite expected value, and a and b are real numbers, then V[aξ + b] = a²V[ξ].
Proof: It follows from the definition of variance that

V[aξ + b] = E[(aξ + b − aE[ξ] − b)²] = a²E[(ξ − E[ξ])²] = a²V[ξ].

Theorem 3.19 Let ξ be a hybrid variable with expected value e. Then V[ξ] = 0 if and only if Ch{ξ = e} = 1.

Proof: If V[ξ] = 0, then E[(ξ − e)²] = 0. Note that

E[(ξ − e)²] = ∫_{0}^{+∞} Ch{(ξ − e)² ≥ r}dr,

which implies Ch{(ξ − e)² ≥ r} = 0 for any r > 0. Hence we have Ch{(ξ − e)² = 0} = 1, that is, Ch{ξ = e} = 1. Conversely, if Ch{ξ = e} = 1, then we have Ch{(ξ − e)² = 0} = 1 and Ch{(ξ − e)² ≥ r} = 0 for any r > 0. Thus

V[ξ] = ∫_{0}^{+∞} Ch{(ξ − e)² ≥ r}dr = 0.
The theorem is proved.

Maximum Variance Theorem

Theorem 3.20 Let f be a convex function on [a, b], and ξ a hybrid variable that takes values in [a, b] and has expected value e. Then

E[f(ξ)] ≤ ((b − e)/(b − a)) f(a) + ((e − a)/(b − a)) f(b).   (3.37)

Proof: For each (θ, ω) ∈ Θ × Ω, we have a ≤ ξ(θ, ω) ≤ b and

ξ(θ, ω) = ((b − ξ(θ, ω))/(b − a)) a + ((ξ(θ, ω) − a)/(b − a)) b.

It follows from the convexity of f that

f(ξ(θ, ω)) ≤ ((b − ξ(θ, ω))/(b − a)) f(a) + ((ξ(θ, ω) − a)/(b − a)) f(b).

Taking expected values on both sides, we obtain (3.37).

Theorem 3.21 (Maximum Variance Theorem) Let ξ be a hybrid variable that takes values in [a, b] and has expected value e. Then

V[ξ] ≤ (e − a)(b − e).   (3.38)

Proof: It follows from Theorem 3.20 immediately by defining f(x) = (x − e)².
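The maximum variance bound is attained by a two-point distribution. A small numeric check under the degenerate (purely random) case, with assumed example data:

```python
# Numeric check of V[xi] <= (e - a)(b - e) for hybrid variables that
# degenerate to random variables on [a, b] = [0, 1].
a, b = 0.0, 1.0

# two-point variable: P{xi = 0} = 0.7, P{xi = 1} = 0.3  ->  e = 0.3
e_two = 0.3
v_two = 0.7 * (0.0 - e_two) ** 2 + 0.3 * (1.0 - e_two) ** 2   # = 0.21
bound_two = (e_two - a) * (b - e_two)                          # = 0.21, attained

# uniform on [0, 1]: e = 0.5, V = 1/12, strictly below the bound 1/4
v_uni, bound_uni = 1.0 / 12.0, (0.5 - a) * (b - 0.5)
```

The two-point case shows the inequality is tight, while the uniform distribution sits strictly inside it.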
3.6 Moments

Liu [129] defined the concepts of moments of both fuzzy random variables and random fuzzy variables. Li and Liu [103] discussed the moments of hybrid variables.

Definition 3.14 Let ξ be a hybrid variable. Then for any positive integer k, (a) the expected value E[ξᵏ] is called the kth moment; (b) the expected value E[|ξ|ᵏ] is called the kth absolute moment; (c) the expected value E[(ξ − E[ξ])ᵏ] is called the kth central moment; (d) the expected value E[|ξ − E[ξ]|ᵏ] is called the kth absolute central moment.

Note that the first central moment is always 0, the first moment is just the expected value, and the second central moment is just the variance.

Theorem 3.22 Let ξ be a nonnegative hybrid variable, and k a positive number. Then the kth moment is

E[ξᵏ] = k ∫_{0}^{+∞} r^{k−1} Ch{ξ ≥ r}dr.   (3.39)

Proof: It follows from the nonnegativity of ξ that

E[ξᵏ] = ∫_{0}^{∞} Ch{ξᵏ ≥ x}dx = ∫_{0}^{∞} Ch{ξ ≥ r}d(rᵏ) = k ∫_{0}^{∞} r^{k−1} Ch{ξ ≥ r}dr.

The theorem is proved.

Theorem 3.23 (Li and Liu [104]) Let ξ be a hybrid variable that takes values in [a, b] and has expected value e. Then for any positive integer k, the kth absolute moment and kth absolute central moment satisfy the following inequalities:

E[|ξ|ᵏ] ≤ ((b − e)/(b − a)) |a|ᵏ + ((e − a)/(b − a)) |b|ᵏ,   (3.40)

E[|ξ − e|ᵏ] ≤ ((b − e)/(b − a)) (e − a)ᵏ + ((e − a)/(b − a)) (b − e)ᵏ.   (3.41)

Proof: It follows from Theorem 3.20 immediately by defining f(x) = |x|ᵏ and f(x) = |x − e|ᵏ.
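Formula (3.39) is easy to test numerically in the degenerate random case. The sketch below checks E[ξ²] = 2 for an exponential variable with mean 1, where Ch{ξ ≥ r} = e^{−r}; the integration limits and grid size are arbitrary choices.

```python
import math

def kth_moment(ch_ge, k, hi=60.0, n=600000):
    # E[xi^k] = k * int_0^inf r^(k-1) * Ch{xi >= r} dr   (3.39), midpoint rule
    h = hi / n
    return k * sum(((j + 0.5) * h) ** (k - 1) * ch_ge((j + 0.5) * h)
                   for j in range(n)) * h

# xi = eta ~ Exp(mean 1): Ch{xi >= r} = e^{-r}, so E[xi^2] should be 2
m2 = kth_moment(lambda r: math.exp(-r), 2)
```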
3.7 Independence

This book uses the following definition of independence of hybrid variables.

Definition 3.15 (Liu [132]) The hybrid variables ξ₁, ξ₂, ···, ξₙ are said to be independent if

E[ Σ_{i=1}^{n} fᵢ(ξᵢ) ] = Σ_{i=1}^{n} E[fᵢ(ξᵢ)]   (3.42)
for any measurable functions f₁, f₂, ···, fₙ provided that the expected values exist and are finite.

Theorem 3.24 Hybrid variables are independent if they are (independent or not) random variables.

Proof: Suppose the hybrid variables are random variables ξ₁, ξ₂, ···, ξₙ. Then f₁(ξ₁), f₂(ξ₂), ···, fₙ(ξₙ) are also random variables for any measurable functions f₁, f₂, ···, fₙ. It follows from the linearity of the expected value operator of random variables that

E[f₁(ξ₁) + f₂(ξ₂) + ··· + fₙ(ξₙ)] = E[f₁(ξ₁)] + E[f₂(ξ₂)] + ··· + E[fₙ(ξₙ)].

Hence the hybrid variables are independent.

Theorem 3.25 Hybrid variables are independent if they are independent fuzzy variables.

Proof: If the hybrid variables are independent fuzzy variables ξ₁, ξ₂, ···, ξₙ, then f₁(ξ₁), f₂(ξ₂), ···, fₙ(ξₙ) are also independent fuzzy variables for any measurable functions f₁, f₂, ···, fₙ. It follows from the linearity of the expected value operator of fuzzy variables that

E[f₁(ξ₁) + f₂(ξ₂) + ··· + fₙ(ξₙ)] = E[f₁(ξ₁)] + E[f₂(ξ₂)] + ··· + E[fₙ(ξₙ)].

Hence the hybrid variables are independent.

Theorem 3.26 Two hybrid variables are independent if one is a random variable and the other is a fuzzy variable.

Proof: Suppose that ξ is a random variable and η is a fuzzy variable. Then f(ξ) is a random variable and g(η) is a fuzzy variable for any measurable functions f and g. Thus E[f(ξ) + g(η)] = E[f(ξ)] + E[g(η)]. Hence ξ and η are independent hybrid variables.

Theorem 3.27 If ξ and η are independent hybrid variables with finite expected values, then for any real numbers a and b we have

E[aξ + bη] = aE[ξ] + bE[η].   (3.43)

Proof: The theorem follows from the definition by taking f₁(x) = ax and f₂(x) = bx.

Theorem 3.28 Suppose that ξ₁, ξ₂, ···, ξₙ are independent hybrid variables, and f₁, f₂, ···, fₙ are measurable functions. Then f₁(ξ₁), f₂(ξ₂), ···, fₙ(ξₙ) are independent hybrid variables.

Proof: The theorem follows from the definition because the composition of measurable functions is also measurable.
3.8 Identical Distribution
This section introduces the concept of identical distribution of hybrid variables.

Definition 3.16 The hybrid variables ξ and η are identically distributed if

Ch{ξ ∈ B} = Ch{η ∈ B}   (3.44)

for any Borel set B of real numbers.

Theorem 3.29 Let ξ and η be identically distributed hybrid variables, and f: ℜ → ℜ a measurable function. Then f(ξ) and f(η) are identically distributed hybrid variables.

Proof: For any Borel set B of real numbers, we have

Ch{f(ξ) ∈ B} = Ch{ξ ∈ f⁻¹(B)} = Ch{η ∈ f⁻¹(B)} = Ch{f(η) ∈ B}.

Hence f(ξ) and f(η) are identically distributed hybrid variables.

Theorem 3.30 If ξ and η are identically distributed hybrid variables, then they have the same chance distribution.

Proof: Since ξ and η are identically distributed, we have Ch{ξ ∈ (−∞, x]} = Ch{η ∈ (−∞, x]} for any x. Thus ξ and η have the same chance distribution.

Theorem 3.31 If ξ and η are identically distributed hybrid variables whose chance density functions exist, then they have the same chance density function.

Proof: It follows from Theorem 3.30 immediately.
3.9 Critical Values
In order to rank fuzzy random variables, Liu [122] defined two critical values: optimistic value and pessimistic value. Analogously, Liu [124] gave the concepts of critical values of random fuzzy variables. Li and Liu [103] presented the following definition of critical values of hybrid variables.

Definition 3.17 Let ξ be a hybrid variable, and α ∈ (0, 1]. Then

ξ_sup(α) = sup{ r | Ch{ξ ≥ r} ≥ α }   (3.45)

is called the α-optimistic value to ξ, and

ξ_inf(α) = inf{ r | Ch{ξ ≤ r} ≥ α }   (3.46)

is called the α-pessimistic value to ξ.
The hybrid variable ξ reaches upwards of the α-optimistic value ξ_sup(α) with chance α, and stays below the α-pessimistic value ξ_inf(α) with chance α.

Example 3.14: If a hybrid variable ξ degenerates to a random variable η, then

Ch{ξ ≤ x} = Pr{η ≤ x},  Ch{ξ ≥ x} = Pr{η ≥ x},  ∀x ∈ ℜ.

It follows from the definition of critical values that

ξ_sup(α) = η_sup(α),  ξ_inf(α) = η_inf(α),  ∀α ∈ (0, 1].

In other words, the critical values of hybrid variables coincide with those of random variables.

Example 3.15: If a hybrid variable ξ degenerates to a fuzzy variable ã, then

Ch{ξ ≤ x} = Cr{ã ≤ x},  Ch{ξ ≥ x} = Cr{ã ≥ x},  ∀x ∈ ℜ.

It follows from the definition of critical values that

ξ_sup(α) = ã_sup(α),  ξ_inf(α) = ã_inf(α),  ∀α ∈ (0, 1].

In other words, the critical values of hybrid variables coincide with those of fuzzy variables.

Theorem 3.32 Let ξ be a hybrid variable. If α > 0.5, then we have

Ch{ξ ≤ ξ_inf(α)} ≥ α,  Ch{ξ ≥ ξ_sup(α)} ≥ α.   (3.47)
Proof: It follows from the definition of the α-pessimistic value that there exists a decreasing sequence {xᵢ} such that Ch{ξ ≤ xᵢ} ≥ α and xᵢ ↓ ξ_inf(α) as i → ∞. Since {ξ ≤ xᵢ} ↓ {ξ ≤ ξ_inf(α)} and lim_{i→∞} Ch{ξ ≤ xᵢ} ≥ α > 0.5, it follows from the chance semicontinuity theorem that

Ch{ξ ≤ ξ_inf(α)} = lim_{i→∞} Ch{ξ ≤ xᵢ} ≥ α.

Similarly, there exists an increasing sequence {xᵢ} such that Ch{ξ ≥ xᵢ} ≥ α and xᵢ ↑ ξ_sup(α) as i → ∞. Since {ξ ≥ xᵢ} ↓ {ξ ≥ ξ_sup(α)} and lim_{i→∞} Ch{ξ ≥ xᵢ} ≥ α > 0.5, it follows from the chance semicontinuity theorem that

Ch{ξ ≥ ξ_sup(α)} = lim_{i→∞} Ch{ξ ≥ xᵢ} ≥ α.

The theorem is proved.

Theorem 3.33 Let ξ be a hybrid variable and α a number between 0 and 1. Then:
(a) if c ≥ 0, then (cξ)_sup(α) = cξ_sup(α) and (cξ)_inf(α) = cξ_inf(α);
(b) if c < 0, then (cξ)_sup(α) = cξ_inf(α) and (cξ)_inf(α) = cξ_sup(α).
Proof: (a) If c = 0, part (a) is obvious. In the case of c > 0, we have

(cξ)_sup(α) = sup{ r | Ch{cξ ≥ r} ≥ α } = c sup{ r/c | Ch{ξ ≥ r/c} ≥ α } = cξ_sup(α).

A similar argument proves (cξ)_inf(α) = cξ_inf(α).

(b) It suffices to prove that (−ξ)_sup(α) = −ξ_inf(α) and (−ξ)_inf(α) = −ξ_sup(α). In fact, we have

(−ξ)_sup(α) = sup{ r | Ch{−ξ ≥ r} ≥ α } = −inf{ −r | Ch{ξ ≤ −r} ≥ α } = −ξ_inf(α).

Similarly, we may prove that (−ξ)_inf(α) = −ξ_sup(α). The theorem is proved.

Theorem 3.34 Let ξ be a hybrid variable. Then we have
(a) if α > 0.5, then ξ_inf(α) ≥ ξ_sup(α);
(b) if α ≤ 0.5, then ξ_inf(α) ≤ ξ_sup(α).

Proof: Part (a): Write ξ̄(α) = (ξ_inf(α) + ξ_sup(α))/2. If ξ_inf(α) < ξ_sup(α), then we have

1 ≥ Ch{ξ < ξ̄(α)} + Ch{ξ > ξ̄(α)} ≥ α + α > 1.

This contradiction proves ξ_inf(α) ≥ ξ_sup(α).

Part (b): Assume that ξ_inf(α) > ξ_sup(α). It follows from the definition of ξ_inf(α) that Ch{ξ ≤ ξ̄(α)} < α, and from the definition of ξ_sup(α) that Ch{ξ ≥ ξ̄(α)} < α. Thus

1 ≤ Ch{ξ ≤ ξ̄(α)} + Ch{ξ ≥ ξ̄(α)} < α + α ≤ 1.

This contradiction proves ξ_inf(α) ≤ ξ_sup(α). The theorem is verified.

Theorem 3.35 Let ξ be a hybrid variable. Then we have
(a) ξ_sup(α) is a decreasing and left-continuous function of α;
(b) ξ_inf(α) is an increasing and left-continuous function of α.

Proof: (b) It is easy to prove that ξ_inf(α) is an increasing function of α. Next, we prove the left-continuity of ξ_inf(α) with respect to α. Let {αᵢ} be an arbitrary sequence of positive numbers such that αᵢ ↑ α. Then {ξ_inf(αᵢ)} is an increasing sequence. If its limit equals ξ_inf(α), then the left-continuity is proved. Otherwise, there exists a number z* such that

lim_{i→∞} ξ_inf(αᵢ) < z* < ξ_inf(α).

Thus Ch{ξ ≤ z*} ≥ αᵢ for each i. Letting i → ∞, we get Ch{ξ ≤ z*} ≥ α. Hence z* ≥ ξ_inf(α). This contradiction proves the left-continuity of ξ_inf(α) with respect to α. Part (a) may be proved similarly.
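For a hybrid variable that degenerates to a discrete fuzzy variable, the critical values can be computed directly from the membership function, using the standard credibility formula Cr{ξ ≥ r} = (sup_{x≥r} µ(x) + 1 − sup_{x<r} µ(x))/2. The sketch below uses an assumed three-point example and restricts the supremum in (3.45) to the support points, which suffices for a discrete variable.

```python
def cr_ge(points, mu, r):
    # Cr{xi >= r} = (sup_{x >= r} mu(x) + 1 - sup_{x < r} mu(x)) / 2
    hi = max((m for x, m in zip(points, mu) if x >= r), default=0.0)
    lo = max((m for x, m in zip(points, mu) if x < r), default=0.0)
    return 0.5 * (hi + 1.0 - lo)

def xi_sup(points, mu, alpha):
    # alpha-optimistic value: sup{ r | Ch{xi >= r} >= alpha }   (3.45)
    return max((r for r in points if cr_ge(points, mu, r) >= alpha),
               default=float("-inf"))

points = [1.0, 2.0, 3.0]          # a discrete fuzzy variable (assumed example)
mu     = [0.5, 1.0, 0.5]          # its membership function
opt09 = xi_sup(points, mu, 0.9)   # Cr{xi >= 1} = 1, Cr{xi >= 2} = 0.75
opt05 = xi_sup(points, mu, 0.5)
```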
3.10 Entropy
This section provides a definition of entropy to characterize the uncertainty of hybrid variables resulting from information deficiency.

Definition 3.18 (Li and Liu [103]) Suppose that ξ is a discrete hybrid variable taking values in {x₁, x₂, ···}. Then its entropy is defined by

H[ξ] = Σ_{i=1}^{∞} S(Ch{ξ = xᵢ})   (3.48)

where S(t) = −t ln t − (1 − t) ln(1 − t).

Example 3.16: Suppose that ξ is a discrete hybrid variable taking values in {x₁, x₂, ···}. If there exists some index k such that Ch{ξ = xₖ} = 1, and Ch{ξ = xᵢ} = 0 for all i ≠ k, then its entropy is H[ξ] = 0.

Example 3.17: Suppose that ξ is a simple hybrid variable taking values in {x₁, x₂, ···, xₙ}. If Ch{ξ = xᵢ} = 0.5 for all i = 1, 2, ···, n, then its entropy is H[ξ] = n ln 2.

Theorem 3.36 Suppose that ξ is a discrete hybrid variable taking values in {x₁, x₂, ···}. Then

H[ξ] ≥ 0   (3.49)

and equality holds if and only if ξ is essentially a deterministic/crisp number.

Proof: The nonnegativity is clear. In addition, H[ξ] = 0 if and only if Ch{ξ = xᵢ} = 0 or 1 for each i, i.e., there exists one and only one index k such that Ch{ξ = xₖ} = 1, so ξ is essentially a deterministic/crisp number.

This theorem states that the entropy of a hybrid variable reaches its minimum 0 when the hybrid variable degenerates to a deterministic/crisp number. In this case there is no uncertainty.

Theorem 3.37 Suppose that ξ is a simple hybrid variable taking values in {x₁, x₂, ···, xₙ}. Then

H[ξ] ≤ n ln 2   (3.50)

and equality holds if and only if Ch{ξ = xᵢ} = 0.5 for all i = 1, 2, ···, n.

Proof: Since the function S(t) reaches its maximum ln 2 at t = 0.5, we have

H[ξ] = Σ_{i=1}^{n} S(Ch{ξ = xᵢ}) ≤ n ln 2

and equality holds if and only if Ch{ξ = xᵢ} = 0.5 for all i = 1, 2, ···, n.

This theorem states that the entropy of a hybrid variable reaches its maximum when the hybrid variable is an equipossible one. In this case there is no preference among the values that the hybrid variable may take.
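Definition 3.18 and the two extreme cases in Theorems 3.36-3.37 translate directly into code (the chance values below are assumed example data):

```python
import math

def S(t):
    # S(t) = -t ln t - (1 - t) ln(1 - t), with S(0) = S(1) = 0
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log(t) - (1.0 - t) * math.log(1.0 - t)

def entropy(ch_values):
    # H[xi] = sum_i S(Ch{xi = x_i})   (3.48)
    return sum(S(c) for c in ch_values)

h_crisp = entropy([1.0, 0.0, 0.0])     # deterministic: H = 0
h_equi  = entropy([0.5, 0.5, 0.5])     # equipossible:  H = 3 ln 2
```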
3.11 Distance

Definition 3.19 (Li and Liu [103]) The distance between hybrid variables ξ and η is defined as

d(ξ, η) = E[|ξ − η|].   (3.51)

Theorem 3.38 (Li and Liu [103]) Let ξ, η, τ be hybrid variables, and let d(·, ·) be the distance. Then we have
(a) (Nonnegativity) d(ξ, η) ≥ 0;
(b) (Identification) d(ξ, η) = 0 if and only if ξ = η;
(c) (Symmetry) d(ξ, η) = d(η, ξ);
(d) (Triangle Inequality) d(ξ, η) ≤ 2d(ξ, τ) + 2d(η, τ).

Proof: Parts (a), (b) and (c) follow immediately from the definition. Now we prove part (d). It follows from the chance subadditivity theorem that

d(ξ, η) = ∫_{0}^{+∞} Ch{|ξ − η| ≥ r}dr
        ≤ ∫_{0}^{+∞} Ch{|ξ − τ| + |τ − η| ≥ r}dr
        ≤ ∫_{0}^{+∞} Ch{ {|ξ − τ| ≥ r/2} ∪ {|τ − η| ≥ r/2} }dr
        ≤ ∫_{0}^{+∞} (Ch{|ξ − τ| ≥ r/2} + Ch{|τ − η| ≥ r/2})dr
        = ∫_{0}^{+∞} Ch{|ξ − τ| ≥ r/2}dr + ∫_{0}^{+∞} Ch{|τ − η| ≥ r/2}dr
        = 2E[|ξ − τ|] + 2E[|τ − η|] = 2d(ξ, τ) + 2d(τ, η).
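For hybrid variables that degenerate to random variables on a common finite probability space with equally likely outcomes, the distance and its properties reduce to elementary averages. A quick check with assumed example data:

```python
# d(xi, eta) = E[|xi - eta|] for random variables given by their values
# on n equally likely outcomes of a common probability space.
def dist(xs, ys):
    n = len(xs)
    return sum(abs(x - y) for x, y in zip(xs, ys)) / n

xi  = [0.0, 1.0, 2.0, 3.0]
eta = [1.0, 1.0, 0.0, 5.0]
tau = [0.0, 2.0, 2.0, 4.0]

d_xe = dist(xi, eta)                                  # = 5/4
lhs, rhs = d_xe, 2 * dist(xi, tau) + 2 * dist(eta, tau)  # triangle inequality (d)
```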
3.12 Inequalities

Yang and Liu [237] proved some important inequalities for fuzzy random variables, and Zhu and Liu [262] presented several inequalities for random fuzzy variables. Li and Liu [103] also verified the following inequalities for hybrid variables.

Theorem 3.39 Let ξ be a hybrid variable, and f a nonnegative function. If f is even and increasing on [0, ∞), then for any given number t > 0, we have

Ch{|ξ| ≥ t} ≤ E[f(ξ)] / f(t).   (3.52)
Proof: It is clear that Ch{|ξ| ≥ f⁻¹(r)} is a monotone decreasing function of r on [0, ∞). It follows from the nonnegativity of f(ξ) that

E[f(ξ)] = ∫_{0}^{+∞} Ch{f(ξ) ≥ r}dr
        = ∫_{0}^{+∞} Ch{|ξ| ≥ f⁻¹(r)}dr
        ≥ ∫_{0}^{f(t)} Ch{|ξ| ≥ f⁻¹(r)}dr
        ≥ ∫_{0}^{f(t)} dr · Ch{|ξ| ≥ f⁻¹(f(t))}
        = f(t) · Ch{|ξ| ≥ t},

which proves the inequality.

Theorem 3.40 (Markov Inequality) Let ξ be a hybrid variable. Then for any given numbers t > 0 and p > 0, we have

Ch{|ξ| ≥ t} ≤ E[|ξ|ᵖ] / tᵖ.   (3.53)

Proof: It is a special case of Theorem 3.39 when f(x) = |x|ᵖ.

Theorem 3.41 (Chebyshev Inequality) Let ξ be a hybrid variable whose variance V[ξ] exists. Then for any given number t > 0, we have

Ch{|ξ − E[ξ]| ≥ t} ≤ V[ξ] / t².   (3.54)
Proof: It is a special case of Theorem 3.39 when the hybrid variable ξ is replaced with ξ − E[ξ], and f(x) = x².

Theorem 3.42 (Hölder's Inequality) Let p and q be positive real numbers with 1/p + 1/q = 1, and let ξ and η be independent hybrid variables with E[|ξ|ᵖ] < ∞ and E[|η|^q] < ∞. Then we have

E[|ξη|] ≤ (E[|ξ|ᵖ])^{1/p} (E[|η|^q])^{1/q}.   (3.55)

Proof: The inequality holds trivially if at least one of ξ and η is zero a.s. Now we assume E[|ξ|ᵖ] > 0 and E[|η|^q] > 0. It is easy to prove that the function f(x, y) = x^{1/p} y^{1/q} is a concave function on D = {(x, y) : x ≥ 0, y ≥ 0}. Thus for any point (x₀, y₀) with x₀ > 0 and y₀ > 0, there exist two real numbers a and b such that

f(x, y) − f(x₀, y₀) ≤ a(x − x₀) + b(y − y₀),  ∀(x, y) ∈ D.

Letting x₀ = E[|ξ|ᵖ], y₀ = E[|η|^q], x = |ξ|ᵖ and y = |η|^q, we have

f(|ξ|ᵖ, |η|^q) − f(E[|ξ|ᵖ], E[|η|^q]) ≤ a(|ξ|ᵖ − E[|ξ|ᵖ]) + b(|η|^q − E[|η|^q]).

Taking expected values on both sides, we obtain

E[f(|ξ|ᵖ, |η|^q)] ≤ f(E[|ξ|ᵖ], E[|η|^q]).

Hence the inequality (3.55) holds.

Theorem 3.43 (Minkowski Inequality) Let p be a real number with p ≥ 1, and let ξ and η be independent hybrid variables with E[|ξ|ᵖ] < ∞ and E[|η|ᵖ] < ∞. Then we have

(E[|ξ + η|ᵖ])^{1/p} ≤ (E[|ξ|ᵖ])^{1/p} + (E[|η|ᵖ])^{1/p}.   (3.56)

Proof: The inequality holds trivially if at least one of ξ and η is zero a.s. Now we assume E[|ξ|ᵖ] > 0 and E[|η|ᵖ] > 0. It is easy to prove that the function f(x, y) = (x^{1/p} + y^{1/p})ᵖ is a concave function on D = {(x, y) : x ≥ 0, y ≥ 0}. Thus for any point (x₀, y₀) with x₀ > 0 and y₀ > 0, there exist two real numbers a and b such that

f(x, y) − f(x₀, y₀) ≤ a(x − x₀) + b(y − y₀),  ∀(x, y) ∈ D.

Letting x₀ = E[|ξ|ᵖ], y₀ = E[|η|ᵖ], x = |ξ|ᵖ and y = |η|ᵖ, we have

f(|ξ|ᵖ, |η|ᵖ) − f(E[|ξ|ᵖ], E[|η|ᵖ]) ≤ a(|ξ|ᵖ − E[|ξ|ᵖ]) + b(|η|ᵖ − E[|η|ᵖ]).

Taking expected values on both sides, we obtain

E[f(|ξ|ᵖ, |η|ᵖ)] ≤ f(E[|ξ|ᵖ], E[|η|ᵖ]).

Hence the inequality (3.56) holds.

Theorem 3.44 (Jensen's Inequality) Let ξ be a hybrid variable, and f: ℜ → ℜ a convex function. If E[ξ] and E[f(ξ)] are finite, then

f(E[ξ]) ≤ E[f(ξ)].   (3.57)

In particular, when f(x) = |x|ᵖ and p ≥ 1, we have |E[ξ]|ᵖ ≤ E[|ξ|ᵖ].

Proof: Since f is a convex function, for each y there exists a number k such that f(x) − f(y) ≥ k · (x − y). Replacing x with ξ and y with E[ξ], we obtain

f(ξ) − f(E[ξ]) ≥ k · (ξ − E[ξ]).

Taking expected values on both sides, we have

E[f(ξ)] − f(E[ξ]) ≥ k · (E[ξ] − E[ξ]) = 0,

which proves the inequality.
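The Chebyshev and Jensen inequalities can both be verified exactly on a discrete random variable (a degenerate hybrid variable, so Ch = Pr). The example data below are assumed:

```python
import math

# Discrete random variable: values and probabilities (assumed example data)
values = [-2.0, 0.0, 1.0, 3.0]
probs  = [0.1, 0.4, 0.3, 0.2]

e = sum(p * x for x, p in zip(values, probs))                 # E[xi]
v = sum(p * (x - e) ** 2 for x, p in zip(values, probs))      # V[xi]

# Chebyshev (3.54): Ch{|xi - E[xi]| >= t} <= V[xi] / t^2
def tail(t):
    return sum(p for x, p in zip(values, probs) if abs(x - e) >= t)

chebyshev_ok = all(tail(t) <= v / t ** 2 + 1e-12 for t in (0.5, 1.0, 2.0, 3.0))

# Jensen (3.57) with the convex function f(x) = e^x: f(E[xi]) <= E[f(xi)]
e_f = sum(p * math.exp(x) for x, p in zip(values, probs))
jensen_gap = e_f - math.exp(e)        # nonnegative by Theorem 3.44
```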
3.13 Convergence Concepts
Liu [129] gave the convergence concepts of fuzzy random sequences, and Zhu and Liu [264] introduced the convergence concepts of random fuzzy sequences. Li and Liu [103] discussed the convergence concepts of hybrid sequences: convergence almost surely (a.s.), convergence in chance, convergence in mean, and convergence in distribution.

Table 3.1: Relationship among Convergence Concepts

Convergence in Mean ⇒ Convergence in Chance ⇒ Convergence in Distribution
Definition 3.20 Suppose that ξ, ξ₁, ξ₂, ··· are hybrid variables defined on the chance space (Θ, P, Cr) × (Ω, A, Pr). The sequence {ξᵢ} is said to converge a.s. to ξ if there exists an event Λ with Ch{Λ} = 1 such that

lim_{i→∞} |ξᵢ(θ, ω) − ξ(θ, ω)| = 0   (3.58)

for every (θ, ω) ∈ Λ. In that case we write ξᵢ → ξ, a.s.

Definition 3.21 Suppose that ξ, ξ₁, ξ₂, ··· are hybrid variables. We say that the sequence {ξᵢ} converges in chance to ξ if

lim_{i→∞} Ch{|ξᵢ − ξ| ≥ ε} = 0   (3.59)

for every ε > 0.

Definition 3.22 Suppose that ξ, ξ₁, ξ₂, ··· are hybrid variables with finite expected values. We say that the sequence {ξᵢ} converges in mean to ξ if

lim_{i→∞} E[|ξᵢ − ξ|] = 0.   (3.60)

In addition, the sequence {ξᵢ} is said to converge in mean square to ξ if

lim_{i→∞} E[|ξᵢ − ξ|²] = 0.   (3.61)

Definition 3.23 Suppose that Φ, Φ₁, Φ₂, ··· are the chance distributions of the hybrid variables ξ, ξ₁, ξ₂, ···, respectively. We say that {ξᵢ} converges in distribution to ξ if Φᵢ → Φ at any continuity point of Φ.
Convergence Almost Surely vs. Convergence in Chance

Example 3.18: Convergence a.s. does not imply convergence in chance. Take a credibility space (Θ, P, Cr) to be {θ₁, θ₂, ···} with Cr{θⱼ} = j/(2j + 1) for j = 1, 2, ···, and take an arbitrary probability space (Ω, A, Pr). Then we define hybrid variables as

ξᵢ(θⱼ, ω) = i if j = i, and 0 otherwise

for i = 1, 2, ··· and ξ ≡ 0. The sequence {ξᵢ} converges a.s. to ξ. However, for some small number ε > 0, we have

Ch{|ξᵢ − ξ| ≥ ε} = Cr{|ξᵢ − ξ| ≥ ε} = i/(2i + 1) → 1/2.

That is, the sequence {ξᵢ} does not converge in chance to ξ.

Example 3.19: Convergence in chance does not imply convergence a.s. Take an arbitrary credibility space (Θ, P, Cr) and take a probability space (Ω, A, Pr) to be [0, 1] with the Borel algebra and Lebesgue measure. For any positive integer i, there is an integer j such that i = 2ʲ + k, where k is an integer between 0 and 2ʲ − 1. Then we define hybrid variables as

ξᵢ(θ, ω) = i if k/2ʲ ≤ ω ≤ (k + 1)/2ʲ, and 0 otherwise

for i = 1, 2, ··· and ξ ≡ 0. For some small number ε > 0, we have

Ch{|ξᵢ − ξ| ≥ ε} = Pr{|ξᵢ − ξ| ≥ ε} = 1/2ʲ → 0

as i → ∞. That is, the sequence {ξᵢ} converges in chance to ξ. However, for any ω ∈ [0, 1], there are infinitely many intervals of the form [k/2ʲ, (k + 1)/2ʲ] containing ω. Thus ξᵢ(θ, ω) does not converge to 0. In other words, the sequence {ξᵢ} does not converge a.s. to ξ.

Convergence in Mean vs. Convergence in Chance

Theorem 3.45 Suppose that ξ, ξ₁, ξ₂, ··· are hybrid variables. If {ξᵢ} converges in mean to ξ, then {ξᵢ} converges in chance to ξ.

Proof: It follows from the Markov inequality that for any given number ε > 0, we have

Ch{|ξᵢ − ξ| ≥ ε} ≤ E[|ξᵢ − ξ|]/ε → 0

as i → ∞. Thus {ξᵢ} converges in chance to ξ. The theorem is proved.
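The failure mode in Example 3.18 is easy to see numerically: the chance of the deviation event is Cr{θᵢ} = i/(2i + 1), which increases toward 1/2 instead of vanishing as i grows.

```python
# Example 3.18: Ch{|xi_i - xi| >= eps} = Cr{theta_i} = i/(2i + 1).
# The sequence of chances approaches 1/2 from below, so it never tends to 0.
chances = [i / (2.0 * i + 1.0) for i in (1, 10, 100, 10000)]
gap = abs(chances[-1] - 0.5)   # small, but the limit is 1/2, not 0
```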
Example 3.20: Convergence in chance does not imply convergence in mean. Take a credibility space (Θ, P, Cr) to be {θ1, θ2, · · · } with Cr{θ1} = 1/2 and Cr{θj} = 1/j for j = 2, 3, · · · and take an arbitrary probability space (Ω, A, Pr). The hybrid variables are defined by

ξi(θj, ω) = i if j = i;  0 otherwise

for i = 1, 2, · · · and ξ ≡ 0. For any small number ε > 0, we have

Ch{|ξi − ξ| ≥ ε} = Cr{|ξi − ξ| ≥ ε} = 1/i → 0.

That is, the sequence {ξi} converges in chance to ξ. However,

E[|ξi − ξ|] = 1, ∀i.

That is, the sequence {ξi} does not converge in mean to ξ.

Convergence Almost Surely vs. Convergence in Mean

Example 3.21: Convergence a.s. does not imply convergence in mean. Take an arbitrary credibility space (Θ, P, Cr) and take a probability space (Ω, A, Pr) to be {ω1, ω2, · · · } with Pr{ωj} = 1/2^j for j = 1, 2, · · · The hybrid variables are defined by

ξi(θ, ωj) = 2^i if j = i;  0 otherwise

for i = 1, 2, · · · and ξ ≡ 0. Then ξi converges a.s. to ξ. However, the sequence {ξi} does not converge in mean to ξ because E[|ξi − ξ|] ≡ 1.

Example 3.22: Convergence in mean does not imply convergence a.s. Take an arbitrary credibility space (Θ, P, Cr) and take a probability space (Ω, A, Pr) to be [0, 1] with Borel algebra and Lebesgue measure. For any positive integer i, there is an integer j such that i = 2^j + k, where k is an integer between 0 and 2^j − 1. The hybrid variables are defined by

ξi(θ, ω) = 1 if k/2^j ≤ ω ≤ (k + 1)/2^j;  0 otherwise

for i = 1, 2, · · · and ξ ≡ 0. Then

E[|ξi − ξ|] = 1/2^j → 0.

That is, the sequence {ξi} converges in mean to ξ. However, for any ω ∈ [0, 1], there are infinitely many intervals of the form [k/2^j, (k + 1)/2^j] containing ω. Thus ξi(θ, ω) does not converge to 0. In other words, the sequence {ξi} does not converge a.s. to ξ.
Convergence in Chance vs. Convergence in Distribution

Theorem 3.46 Suppose ξ, ξ1, ξ2, · · · are hybrid variables. If {ξi} converges in chance to ξ, then {ξi} converges in distribution to ξ.

Proof: Let x be a given continuity point of the distribution Φ. On the one hand, for any y > x, we have

{ξi ≤ x} = {ξi ≤ x, ξ ≤ y} ∪ {ξi ≤ x, ξ > y} ⊂ {ξ ≤ y} ∪ {|ξi − ξ| ≥ y − x}.

It follows from the chance subadditivity theorem that

Φi(x) ≤ Φ(y) + Ch{|ξi − ξ| ≥ y − x}.

Since {ξi} converges in chance to ξ, we have Ch{|ξi − ξ| ≥ y − x} → 0 as i → ∞. Thus we obtain lim sup_{i→∞} Φi(x) ≤ Φ(y) for any y > x. Letting y → x, we get

lim sup_{i→∞} Φi(x) ≤ Φ(x). (3.62)

On the other hand, for any z < x, we have

{ξ ≤ z} = {ξi ≤ x, ξ ≤ z} ∪ {ξi > x, ξ ≤ z} ⊂ {ξi ≤ x} ∪ {|ξi − ξ| ≥ x − z}

which implies that

Φ(z) ≤ Φi(x) + Ch{|ξi − ξ| ≥ x − z}.

Since Ch{|ξi − ξ| ≥ x − z} → 0, we obtain Φ(z) ≤ lim inf_{i→∞} Φi(x) for any z < x. Letting z → x, we get

Φ(x) ≤ lim inf_{i→∞} Φi(x). (3.63)

It follows from (3.62) and (3.63) that Φi(x) → Φ(x). The theorem is proved.

Example 3.23: Convergence in distribution does not imply convergence in chance. Take a credibility space (Θ, P, Cr) to be {θ1, θ2} with Cr{θ1} = Cr{θ2} = 1/2 and take an arbitrary probability space (Ω, A, Pr). We define a hybrid variable as

ξ(θ, ω) = −1 if θ = θ1;  1 if θ = θ2.

We also define ξi = −ξ for i = 1, 2, · · · . Then ξi and ξ have the same chance distribution. Thus {ξi} converges in distribution to ξ. However, for any small number ε > 0 with ε ≤ 2, we have

Ch{|ξi − ξ| ≥ ε} = Cr{|ξi − ξ| ≥ ε} = 1.

That is, the sequence {ξi} does not converge in chance to ξ.
Convergence Almost Surely vs. Convergence in Distribution

Example 3.24: Convergence in distribution does not imply convergence a.s. Take a credibility space (Θ, P, Cr) to be {θ1, θ2} with Cr{θ1} = Cr{θ2} = 1/2 and take an arbitrary probability space (Ω, A, Pr). We define a hybrid variable ξ as

ξ(θ, ω) = −1 if θ = θ1;  1 if θ = θ2.

We also define ξi = −ξ for i = 1, 2, · · · . Then ξi and ξ have the same chance distribution. Thus {ξi} converges in distribution to ξ. However, the sequence {ξi} does not converge a.s. to ξ.

Example 3.25: Convergence a.s. does not imply convergence in distribution. Take a credibility space (Θ, P, Cr) to be {θ1, θ2, · · · } with Cr{θj} = j/(2j + 1) for j = 1, 2, · · · and take an arbitrary probability space (Ω, A, Pr). The hybrid variables are defined by

ξi(θj, ω) = i if j = i;  0 otherwise

for i = 1, 2, · · · and ξ ≡ 0. Then the sequence {ξi} converges a.s. to ξ. However, the chance distributions of ξi are

Φi(x) = 0 if x < 0;  (i + 1)/(2i + 1) if 0 ≤ x < i;  1 if x ≥ i

for i = 1, 2, · · · , respectively. The chance distribution of ξ is

Φ(x) = 0 if x < 0;  1 if x ≥ 0.

It is clear that Φi(x) does not converge to Φ(x) at any x > 0. That is, the sequence {ξi} does not converge in distribution to ξ.
3.14 Conditional Chance

We consider the chance measure of an event A after it has been learned that some other event B has occurred. This new chance measure of A is called the conditional chance measure of A given B.

In order to define a conditional chance measure Ch{A|B}, at first we have to enlarge Ch{A ∩ B} because Ch{A ∩ B} < 1 for all events whenever Ch{B} < 1. It seems that we have no alternative but to divide Ch{A ∩ B} by
Ch{B}. Unfortunately, Ch{A ∩ B}/Ch{B} is not always a chance measure. However, the value Ch{A|B} should not be greater than Ch{A ∩ B}/Ch{B} (otherwise the normality will be lost), i.e.,

Ch{A|B} ≤ Ch{A ∩ B}/Ch{B}. (3.64)

On the other hand, in order to preserve the self-duality, we should have

Ch{A|B} = 1 − Ch{A^c|B} ≥ 1 − Ch{A^c ∩ B}/Ch{B}. (3.65)

Furthermore, since (A ∩ B) ∪ (A^c ∩ B) = B, we have Ch{B} ≤ Ch{A ∩ B} + Ch{A^c ∩ B} by using the chance subadditivity theorem. Thus

0 ≤ 1 − Ch{A^c ∩ B}/Ch{B} ≤ Ch{A ∩ B}/Ch{B} ≤ 1. (3.66)

Hence any number between 1 − Ch{A^c ∩ B}/Ch{B} and Ch{A ∩ B}/Ch{B} is a reasonable value for the conditional chance. Based on the maximum uncertainty principle, we have the following conditional chance measure.

Definition 3.24 (Li and Liu [106]) Let (Θ, P, Cr) × (Ω, A, Pr) be a chance space and A, B two events. Then the conditional chance measure of A given B is defined by

Ch{A|B} = Ch{A ∩ B}/Ch{B}        if Ch{A ∩ B}/Ch{B} < 0.5,
           1 − Ch{A^c ∩ B}/Ch{B}  if Ch{A^c ∩ B}/Ch{B} < 0.5,     (3.67)
           0.5                     otherwise,

provided that Ch{B} > 0.

Remark 3.6: It follows immediately from the definition of conditional chance that

1 − Ch{A^c ∩ B}/Ch{B} ≤ Ch{A|B} ≤ Ch{A ∩ B}/Ch{B}. (3.68)

Furthermore, it is clear that the conditional chance measure obeys the maximum uncertainty principle.
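Definition 3.24 is straightforward to implement once the three values Ch{A ∩ B}, Ch{A^c ∩ B} and Ch{B} are known. A sketch (the function name and interface are ours, not from the text):

```python
def conditional_chance(ch_a_and_b: float, ch_ac_and_b: float, ch_b: float) -> float:
    """Conditional chance Ch{A|B} per (3.67), given Ch{A∩B}, Ch{A^c∩B}, Ch{B} > 0."""
    if ch_b <= 0:
        raise ValueError("Ch{B} must be positive")
    if ch_a_and_b / ch_b < 0.5:
        return ch_a_and_b / ch_b
    if ch_ac_and_b / ch_b < 0.5:
        return 1.0 - ch_ac_and_b / ch_b
    return 0.5

# Self-duality (Theorem 3.47): Ch{A|B} + Ch{A^c|B} = 1.  The second call
# swaps the roles of A and A^c.  (Values chosen to satisfy the chance
# subadditivity Ch{B} <= Ch{A∩B} + Ch{A^c∩B}.)
lhs = conditional_chance(0.2, 0.5, 0.6)   # Ch{A∩B}/Ch{B} = 1/3 < 0.5
rhs = conditional_chance(0.5, 0.2, 0.6)   # = 1 - 1/3 = 2/3
```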
Remark 3.7: Let X and Y be events in the credibility space. Then the conditional chance measure of X × Ω given Y × Ω is

Ch{X × Ω|Y × Ω} = Cr{X ∩ Y}/Cr{Y}        if Cr{X ∩ Y}/Cr{Y} < 0.5,
                   1 − Cr{X^c ∩ Y}/Cr{Y}  if Cr{X^c ∩ Y}/Cr{Y} < 0.5,
                   0.5                     otherwise,

which is just the conditional credibility of X given Y.

Remark 3.8: Let X and Y be events in the probability space. Then the conditional chance measure of Θ × X given Θ × Y is

Ch{Θ × X|Θ × Y} = Pr{X ∩ Y}/Pr{Y}

which is just the conditional probability of X given Y.

Example 3.26: Let ξ and η be two hybrid variables. Then we have

Ch{ξ = x|η = y} = Ch{ξ = x, η = y}/Ch{η = y}        if Ch{ξ = x, η = y}/Ch{η = y} < 0.5,
                   1 − Ch{ξ ≠ x, η = y}/Ch{η = y}    if Ch{ξ ≠ x, η = y}/Ch{η = y} < 0.5,
                   0.5                                otherwise,

provided that Ch{η = y} > 0.

Theorem 3.47 (Li and Liu [106]) Conditional chance measure is a type of uncertain measure. That is, conditional chance measure is normal, increasing, self-dual and countably subadditive.

Proof: At first, the conditional chance measure Ch{·|B} is normal, i.e.,

Ch{Θ × Ω|B} = 1 − Ch{∅}/Ch{B} = 1.

For any events A1 and A2 with A1 ⊂ A2, if

Ch{A1 ∩ B}/Ch{B} ≤ Ch{A2 ∩ B}/Ch{B} < 0.5,

then

Ch{A1|B} = Ch{A1 ∩ B}/Ch{B} ≤ Ch{A2 ∩ B}/Ch{B} = Ch{A2|B}.
If

Ch{A1 ∩ B}/Ch{B} ≤ 0.5 ≤ Ch{A2 ∩ B}/Ch{B},

then Ch{A1|B} ≤ 0.5 ≤ Ch{A2|B}. If

0.5 < Ch{A1 ∩ B}/Ch{B} ≤ Ch{A2 ∩ B}/Ch{B},

then we have

Ch{A1|B} = (1 − Ch{A1^c ∩ B}/Ch{B}) ∨ 0.5 ≤ (1 − Ch{A2^c ∩ B}/Ch{B}) ∨ 0.5 = Ch{A2|B}.

This means that Ch{·|B} is increasing. For any event A, if

Ch{A ∩ B}/Ch{B} ≥ 0.5 and Ch{A^c ∩ B}/Ch{B} ≥ 0.5,

then we have Ch{A|B} + Ch{A^c|B} = 0.5 + 0.5 = 1 immediately. Otherwise, without loss of generality, suppose

Ch{A ∩ B}/Ch{B} < 0.5 < Ch{A^c ∩ B}/Ch{B};

then we have

Ch{A|B} + Ch{A^c|B} = Ch{A ∩ B}/Ch{B} + (1 − Ch{A ∩ B}/Ch{B}) = 1.

That is, Ch{·|B} is self-dual. Finally, for any countable sequence {Ai} of events, if Ch{Ai|B} < 0.5 for all i, it follows from the countable subadditivity of chance measure that

Ch{∪_{i=1}^∞ Ai | B} ≤ Ch{(∪_{i=1}^∞ Ai) ∩ B}/Ch{B} ≤ Σ_{i=1}^∞ Ch{Ai ∩ B}/Ch{B} ≤ Σ_{i=1}^∞ Ch{Ai|B}.

Suppose there is one term greater than 0.5, say

Ch{A1|B} ≥ 0.5 and Ch{Ai|B} < 0.5 for i = 2, 3, · · ·

If Ch{∪i Ai|B} = 0.5, then we immediately have

Ch{∪_{i=1}^∞ Ai | B} ≤ Σ_{i=1}^∞ Ch{Ai|B}.

If Ch{∪i Ai|B} > 0.5, we may prove the above inequality by the following facts:

A1^c ∩ B ⊂ (∪_{i=2}^∞ (Ai ∩ B)) ∪ (∩_{i=1}^∞ Ai^c ∩ B),
Ch{A1^c ∩ B} ≤ Σ_{i=2}^∞ Ch{Ai ∩ B} + Ch{∩_{i=1}^∞ Ai^c ∩ B},

Ch{∪_{i=1}^∞ Ai | B} = 1 − Ch{∩_{i=1}^∞ Ai^c ∩ B}/Ch{B},

Σ_{i=1}^∞ Ch{Ai|B} ≥ 1 − Ch{A1^c ∩ B}/Ch{B} + Σ_{i=2}^∞ Ch{Ai ∩ B}/Ch{B}.

If there are at least two terms greater than 0.5, then the countable subadditivity is clearly true. Thus Ch{·|B} is countably subadditive. Hence Ch{·|B} is an uncertain measure.

Definition 3.25 (Li and Liu [106]) The conditional chance distribution Φ: ℜ → [0, 1] of a hybrid variable ξ given B is defined by

Φ(x|B) = Ch{ξ ≤ x|B} (3.69)
provided that Ch{B} > 0.

Example 3.27: Let ξ and η be hybrid variables. Then the conditional chance distribution of ξ given η = y is

Φ(x|η = y) = Ch{ξ ≤ x, η = y}/Ch{η = y}        if Ch{ξ ≤ x, η = y}/Ch{η = y} < 0.5,
              1 − Ch{ξ > x, η = y}/Ch{η = y}    if Ch{ξ > x, η = y}/Ch{η = y} < 0.5,
              0.5                                otherwise,

provided that Ch{η = y} > 0.

Definition 3.26 (Li and Liu [106]) The conditional chance density function φ of a hybrid variable ξ given B is a nonnegative function such that

Φ(x|B) = ∫_{−∞}^x φ(y|B)dy, ∀x ∈ ℜ, (3.70)

∫_{−∞}^{+∞} φ(y|B)dy = 1 (3.71)

where Φ(x|B) is the conditional chance distribution of ξ given B.
Definition 3.27 (Li and Liu [106]) Let ξ be a hybrid variable. Then the conditional expected value of ξ given B is defined by

E[ξ|B] = ∫_0^{+∞} Ch{ξ ≥ r|B}dr − ∫_{−∞}^0 Ch{ξ ≤ r|B}dr (3.72)

provided that at least one of the two integrals is finite. Following conditional chance and conditional expected value, we also have conditional variance, conditional moments, conditional critical values, conditional entropy as well as conditional convergence.
3.15 Hybrid Process

Definition 3.28 (Liu [133]) Let T be an index set, and (Θ, P, Cr) × (Ω, A, Pr) a chance space. A hybrid process is a measurable function from T × (Θ, P, Cr) × (Ω, A, Pr) to the set of real numbers, i.e., for each t ∈ T and any Borel set B of real numbers, the set

{(θ, ω) ∈ Θ × Ω | X(t, θ, ω) ∈ B} (3.73)

is an event.

That is, a hybrid process X(t, θ, ω) is a function of three variables such that the function X(t∗, θ, ω) is a hybrid variable for each t∗. For each fixed (θ∗, ω∗), the function X(t, θ∗, ω∗) is called a sample path of the hybrid process. A hybrid process is said to be sample-continuous if the sample path is continuous for almost all (θ, ω).

Definition 3.29 (Liu [133]) A hybrid process Xt is said to have independent increments if

Xt1 − Xt0, Xt2 − Xt1, · · · , Xtk − Xtk−1 (3.74)

are independent hybrid variables for any times t0 < t1 < · · · < tk. A hybrid process Xt is said to have stationary increments if, for any given t > 0, the Xs+t − Xs are identically distributed hybrid variables for all s > 0.

Example 3.28: Let Xt be a fuzzy process and let Yt be a stochastic process. Then Xt + Yt is a hybrid process.

Hybrid Renewal Process

Definition 3.30 (Liu [133]) Let ξ1, ξ2, · · · be iid positive hybrid variables. Define S0 = 0 and Sn = ξ1 + ξ2 + · · · + ξn for n ≥ 1. Then the hybrid process

Nt = max_{n≥0} { n | Sn ≤ t } (3.75)
is called a hybrid renewal process.
Suppose that ξ1, ξ2, · · · denote the interarrival times of successive events. Then Sn can be regarded as the waiting time until the occurrence of the nth event, and Nt is the number of renewals in (0, t]. Each sample path of Nt is a right-continuous and increasing step function taking only nonnegative integer values. Furthermore, the size of each jump of Nt is always 1. In other words, Nt has at most one renewal at each time. In particular, Nt does not jump at time 0. Since Nt ≥ n if and only if Sn ≤ t, we have

Ch{Nt ≥ n} = Ch{Sn ≤ t}. (3.76)
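In the degenerate case where the credibility component plays no role, Ch reduces to Pr and (3.76) leads to the classical identity E[Nt] = Σ_{n≥1} Pr{Sn ≤ t}. For exponential interarrival times with rate λ, Sn is Erlang-distributed and the series sums to λt exactly; a quick numeric sketch (the values of λ and t are arbitrary illustrative choices):

```python
import math

def erlang_cdf(n: int, lam: float, t: float) -> float:
    """Pr{S_n <= t} for S_n = sum of n iid Exp(lam) interarrival times:
    1 - sum_{k<n} e^{-lam t} (lam t)^k / k!."""
    lt = lam * t
    return 1.0 - sum(math.exp(-lt) * lt**k / math.factorial(k) for k in range(n))

lam, t = 2.0, 3.0
# Truncating the series at n = 60 is harmless here: Pr{S_60 <= 3} is negligible.
expected_renewals = sum(erlang_cdf(n, lam, t) for n in range(1, 60))
# For a Poisson process, E[N_t] = lam * t = 6; the truncated series agrees.
```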
Theorem 3.48 (Liu [133]) Let Nt be a hybrid renewal process. Then we have

E[Nt] = Σ_{n=1}^∞ Ch{Sn ≤ t}. (3.77)

Proof: Since Nt takes only nonnegative integer values, we have

E[Nt] = ∫_0^∞ Ch{Nt ≥ r}dr = Σ_{n=1}^∞ ∫_{n−1}^n Ch{Nt ≥ r}dr = Σ_{n=1}^∞ Ch{Nt ≥ n} = Σ_{n=1}^∞ Ch{Sn ≤ t}.
The theorem is proved.

D Process

Definition 3.31 (Liu [133]) Let Bt be a Brownian motion, and let Ct be a C process. Then

Dt = (Bt, Ct) (3.78)

is called a D process. The D process is said to be standard if both Bt and Ct are standard.

Definition 3.32 (Liu [133]) Let Bt be a standard Brownian motion, and let Ct be a standard C process. Then the hybrid process

Dt∗ = et + σ1 Bt + σ2 Ct (3.79)

is called a scalar D process. The parameter e is called the drift coefficient, σ1 is called the random diffusion coefficient, and σ2 is called the fuzzy diffusion coefficient.

Definition 3.33 (Liu [133]) Let Bt be a standard Brownian motion, and let Ct be a standard C process. Then the hybrid process

Gt = exp(et + σ1 Bt + σ2 Ct) (3.80)

is called a geometric D process.
Figure 3.2: A Sample Path of Standard D Process

A Hybrid Stock Model

Liu [133] assumed that stock price follows a geometric D process, and presented a hybrid stock model in which the bond price Xt and the stock price Yt are determined by

Xt = X0 exp(rt),
Yt = Y0 exp(et + σ1 Bt + σ2 Ct) (3.81)

where r is the riskless interest rate, e is the stock drift, σ1 is the random stock diffusion, σ2 is the fuzzy stock diffusion, Bt is a standard Brownian motion, and Ct is a standard C process. For further development of hybrid stock models, the interested reader may consult Gao [51].
3.16 Hybrid Calculus

Let Dt be a standard D process, and dt an infinitesimal time interval. Then

dDt = Dt+dt − Dt = (dBt, dCt)

is a hybrid process.

Definition 3.34 (Liu [133]) Let Xt = (Yt, Zt) where Yt and Zt are scalar hybrid processes, and let Dt = (Bt, Ct) be a standard D process. For any partition of the closed interval [a, b] with a = t1 < t2 < · · · < tk+1 = b, the mesh is written as

Δ = max_{1≤i≤k} |ti+1 − ti|.

Then the hybrid integral of Xt with respect to Dt is

∫_a^b Xt dDt = lim_{Δ→0} Σ_{i=1}^k [Yti (Bti+1 − Bti) + Zti (Cti+1 − Cti)] (3.82)
provided that the limit exists in mean square and is a hybrid variable.

Remark 3.9: The hybrid integral may also be written as follows,

∫_a^b Xt dDt = ∫_a^b (Yt dBt + Zt dCt). (3.83)
Example 3.29: Let Bt be a standard Brownian motion, and Ct a standard C process. Then

∫_0^s (σ1 dBt + σ2 dCt) = σ1 Bs + σ2 Cs

where σ1 and σ2 are constants, random variables, fuzzy variables, or hybrid variables.

Example 3.30: Let Bt be a standard Brownian motion, and Ct a standard C process. Then

∫_0^s (Bt dBt + Ct dCt) = (Bs² − s + Cs²)/2.

Example 3.31: Let Bt be a standard Brownian motion, and Ct a standard C process. Then for any partition 0 = t1 < t2 < · · · < tk+1 = s, write

ΔBi = Bti+1 − Bti,  ΔCi = Cti+1 − Cti

and obtain

Bs Cs = Σ_{i=1}^k (Bti+1 Cti+1 − Bti Cti)
      = Σ_{i=1}^k Bti ΔCi + Σ_{i=1}^k Cti ΔBi + Σ_{i=1}^k ΔBi ΔCi
      → ∫_0^s Bt dCt + ∫_0^s Ct dBt + 0

as Δ → 0. That is,

∫_0^s (Ct dBt + Bt dCt) = Bs Cs.
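The telescoping argument above is pathwise: the cross term Σ ΔBi ΔCi vanishes as the mesh shrinks. This can be checked numerically with smooth stand-in sample paths (the particular paths below are chosen only for illustration):

```python
import math

# Stand-in sample paths on [0, s]; any smooth functions work for this identity.
B = lambda t: math.sin(t)
C = lambda t: t * t

s, k = 2.0, 100_000
dt = s / k
lhs = 0.0
for i in range(k):
    t = i * dt
    dB = B(t + dt) - B(t)
    dC = C(t + dt) - C(t)
    lhs += C(t) * dB + B(t) * dC   # Riemann-Stieltjes sum for ∫C dB + ∫B dC

# Integration by parts: the sum converges to B(s)C(s) - B(0)C(0) = B(s)C(s).
```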
Theorem 3.49 (Liu [133]) Let Bt be a standard Brownian motion, Ct a standard C process, and h(t, b, c) a twice continuously differentiable function. Define Xt = h(t, Bt, Ct). Then we have the following chain rule

dXt = ∂h/∂t (t, Bt, Ct) dt + ∂h/∂b (t, Bt, Ct) dBt + ∂h/∂c (t, Bt, Ct) dCt + (1/2) ∂²h/∂b² (t, Bt, Ct) dt. (3.84)
Proof: Since the function h is twice continuously differentiable, by using the Taylor series expansion, the infinitesimal increment of Xt has a second-order approximation

ΔXt = ∂h/∂t Δt + ∂h/∂b ΔBt + ∂h/∂c ΔCt
    + (1/2) ∂²h/∂t² (Δt)² + (1/2) ∂²h/∂b² (ΔBt)² + (1/2) ∂²h/∂c² (ΔCt)²
    + ∂²h/∂t∂b ΔtΔBt + ∂²h/∂t∂c ΔtΔCt + ∂²h/∂b∂c ΔBtΔCt

where all partial derivatives are evaluated at (t, Bt, Ct). Since we can ignore the terms (Δt)², (ΔCt)², ΔtΔBt, ΔtΔCt, ΔBtΔCt and replace (ΔBt)² with Δt, the chain rule is obtained because it makes

Xs = X0 + ∫_0^s ∂h/∂t dt + ∫_0^s ∂h/∂b dBt + ∫_0^s ∂h/∂c dCt + (1/2) ∫_0^s ∂²h/∂b² dt

for any s ≥ 0.

Remark 3.10: The infinitesimal increments dBt and dCt in (3.84) may be replaced with the derived D process

dYt = ut dt + v1t dBt + v2t dCt (3.85)

where ut and v2t are absolutely integrable hybrid processes, and v1t is a square integrable hybrid process, thus producing

dh(t, Yt) = ∂h/∂t (t, Yt) dt + ∂h/∂b (t, Yt) dYt + (1/2) ∂²h/∂b² (t, Yt) v1t² dt. (3.86)
Remark 3.11: Assume that B1t, B2t, · · · , Bmt are standard Brownian motions, C1t, C2t, · · · , Cnt are standard C processes, and h(t, b1, b2, · · · , bm, c1, c2, · · · , cn) is a twice continuously differentiable function. Define Xt = h(t, B1t, B2t, · · · , Bmt, C1t, C2t, · · · , Cnt). You [244] proved the following chain rule

dXt = ∂h/∂t dt + Σ_{i=1}^m ∂h/∂bi dBit + Σ_{j=1}^n ∂h/∂cj dCjt + (1/2) Σ_{i=1}^m ∂²h/∂bi² dt. (3.87)

Example 3.32: Applying the chain rule, we obtain the following formulas,

d(Bt Ct) = Ct dBt + Bt dCt,
d(t Bt Ct) = Bt Ct dt + t Ct dBt + t Bt dCt,
d(t² + Bt² + Ct²) = 2t dt + 2Bt dBt + 2Ct dCt + dt.
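The extra (1/2) ∂²h/∂b² dt term in (3.84) arises because (ΔBt)² accumulates to Δt while (ΔCt)² does not. This is visible even in a crude discretization where the Brownian increments are ±√Δt coin flips, so that the quadratic variation is exactly t. A sketch for h(t, b, c) = b², for which the chain rule gives d(Bt²) = 2Bt dBt + dt:

```python
import random

random.seed(0)
s, k = 1.0, 100_000
dt = s / k
sqdt = dt ** 0.5

B, sum_2BdB = 0.0, 0.0
for _ in range(k):
    dB = sqdt if random.random() < 0.5 else -sqdt   # +/- sqrt(dt) increment
    sum_2BdB += 2.0 * B * dB
    B += dB

# d(B_t^2) = 2 B_t dB_t + dt: the accumulated stochastic sum plus the
# correction term s reproduces B_s^2 (exactly here, by telescoping,
# since (dB)^2 = dt for +/- sqrt(dt) increments).
reconstructed = sum_2BdB + s
```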
3.17 Hybrid Differential Equation
A hybrid differential equation is a type of differential equation driven by both Brownian motion and C process.

Definition 3.35 (Liu [133]) Suppose Bt is a standard Brownian motion, Ct is a standard C process, and f, g1, g2 are some given functions. Then

dXt = f(t, Xt)dt + g1(t, Xt)dBt + g2(t, Xt)dCt (3.88)

is called a hybrid differential equation. A solution is a hybrid process Xt that satisfies (3.88) identically in t.

Remark 3.12: Note that there is no precise definition for the terms dXt, dt, dBt and dCt in the hybrid differential equation (3.88). The mathematically meaningful form is the hybrid integral equation

Xs = X0 + ∫_0^s f(t, Xt)dt + ∫_0^s g1(t, Xt)dBt + ∫_0^s g2(t, Xt)dCt. (3.89)

However, the differential form is more convenient for us. This is the main reason why we accept the differential form.

Example 3.33: Let Bt be a standard Brownian motion, and let ã and b̃ be two fuzzy variables. Then the hybrid differential equation

dXt = ã dt + b̃ dBt

has a solution Xt = ã t + b̃ Bt. The hybrid differential equation

dXt = ã Xt dt + b̃ Xt dBt

has a solution

Xt = exp((ã − b̃²/2) t + b̃ Bt).

Example 3.34: Let Ct be a standard C process, and let ξ and η be two random variables. Then the hybrid differential equation

dXt = ξ dt + η dCt

has a solution Xt = ξt + ηCt.
The hybrid differential equation

dXt = ξ Xt dt + η Xt dCt

has a solution Xt = exp(ξt + ηCt).

Example 3.35: Let Bt be a standard Brownian motion, and Ct a standard C process. Then the hybrid differential equation

dXt = a dt + b dBt + c dCt

has a solution Xt = at + bBt + cCt, which is just a scalar D process.

Example 3.36: Let Bt be a standard Brownian motion, and Ct a standard C process. Then the hybrid differential equation

dXt = a Xt dt + b Xt dBt + c Xt dCt

has a solution

Xt = exp((a − b²/2) t + b Bt + c Ct)

which is just a geometric D process.
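Fixing one realization of the random coefficients and one sample path of Ct, the equation dXt = ξXt dt + ηXt dCt of Example 3.34 can be integrated by a forward Euler scheme and compared with the closed-form solution exp(ξt + ηCt). Everything below (the path t ↦ sin t standing in for Ct, the values of ξ and η) is an illustrative assumption, not part of the model:

```python
import math

xi, eta = 0.3, 0.5          # one fixed realization of the random coefficients
C = lambda t: math.sin(t)   # stand-in sample path of the C process (C(0) = 0)

s, k = 1.0, 100_000
dt = s / k
X = 1.0
for i in range(k):
    t = i * dt
    X *= 1.0 + xi * dt + eta * (C(t + dt) - C(t))   # forward Euler step

closed_form = math.exp(xi * s + eta * C(s))   # solution from Example 3.34
# X and closed_form agree up to the discretization error of the scheme.
```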
Chapter 4

Uncertainty Theory

A classical measure is essentially a set function satisfying the nonnegativity and countable additivity axioms. Classical measure theory, developed by Émile Borel and Henri Lebesgue around 1900, has been widely applied in both theory and practice. However, the additivity axiom of classical measure theory has been challenged by many mathematicians. The earliest challenge was from the theory of capacities by Choquet [22], in which monotonicity and continuity axioms were assumed. Sugeno [214] generalized classical measure theory to fuzzy measure theory by replacing the additivity axiom with monotonicity and semicontinuity axioms. In order to deal with general uncertainty, "self-duality" plus "countable subadditivity" is much more important than "continuity" and "semicontinuity". For this reason, Liu [132] founded an uncertainty theory that is a branch of mathematics based on normality, monotonicity, self-duality, and countable subadditivity axioms. Uncertainty theory provides the commonness of probability theory, credibility theory and chance theory.

The emphasis in this chapter is mainly on uncertain measure, uncertainty space, uncertain variable, uncertainty distribution, expected value, variance, moments, independence, identical distribution, critical values, entropy, distance, convergence almost surely, convergence in measure, convergence in mean, convergence in distribution, conditional uncertainty, uncertain process, canonical process, uncertain calculus, and uncertain differential equation.
4.1 Uncertainty Space
Let Γ be a nonempty set, and let L be a σ-algebra over Γ. Each element Λ ∈ L is called an event. In order to present an axiomatic definition of uncertain measure, it is necessary to assign to each event Λ a number M{Λ} which indicates the level that Λ will occur. In order to ensure that the number
M{Λ} has certain mathematical properties, Liu [132] proposed the following four axioms:

Axiom 1. (Normality) M{Γ} = 1.

Axiom 2. (Monotonicity) M{Λ1} ≤ M{Λ2} whenever Λ1 ⊂ Λ2.

Axiom 3. (Self-Duality) M{Λ} + M{Λ^c} = 1 for any event Λ.

Axiom 4. (Countable Subadditivity) For every countable sequence of events {Λi}, we have

M{∪_{i=1}^∞ Λi} ≤ Σ_{i=1}^∞ M{Λi}. (4.1)

Remark 4.1: Pathology occurs if the self-duality axiom is not assumed. For example, we define a set function that takes value 1 for each set. Then it satisfies all axioms but self-duality. Is it not strange if such a set function serves as a measure?

Remark 4.2: Pathology occurs if subadditivity is not assumed. For example, suppose that a universal set contains 3 elements. We define a set function that takes value 0 for each singleton, and 1 for each set with at least 2 elements. Then such a set function satisfies all axioms but subadditivity. Is it not strange if such a set function serves as a measure?

Remark 4.3: Pathology occurs if the countable subadditivity axiom is replaced with a finite subadditivity axiom. For example, assume the universal set consists of all real numbers. We define a set function that takes value 0 if the set is bounded, 0.5 if both the set and its complement are unbounded, and 1 if the complement of the set is bounded. Then such a set function is finitely subadditive but not countably subadditive. Is it not strange if such a set function serves as a measure?

Definition 4.1 (Liu [132]) The set function M is called an uncertain measure if it satisfies the normality, monotonicity, self-duality, and countable subadditivity axioms.

Example 4.1: Probability, credibility and chance measures are instances of uncertain measure.

Example 4.2: Let Pr be a probability measure, Cr a credibility measure, and a a number in [0, 1]. Then

M{Λ} = a Pr{Λ} + (1 − a) Cr{Λ} (4.2)

is an uncertain measure.
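On a finite space, checking a candidate set function against the normality, monotonicity, self-duality and subadditivity axioms is mechanical. A sketch using the eight values of Example 4.3 below on Γ = {γ1, γ2, γ3} (the frozenset encoding is ours):

```python
# The eight event values from Example 4.3, keyed by subsets of {1, 2, 3}.
gamma = frozenset({1, 2, 3})
M = {frozenset(): 0.0, frozenset({1}): 0.6, frozenset({2}): 0.3,
     frozenset({3}): 0.2, frozenset({1, 2}): 0.8, frozenset({1, 3}): 0.7,
     frozenset({2, 3}): 0.4, gamma: 1.0}

events = list(M)
normal = M[gamma] == 1.0
monotone = all(M[a] <= M[b] for a in events for b in events if a <= b)
self_dual = all(abs(M[a] + M[gamma - a] - 1.0) < 1e-12 for a in events)
# On a finite space, finite subadditivity over all pairs suffices.
subadditive = all(M[a | b] <= M[a] + M[b] + 1e-12 for a in events for b in events)
```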
Example 4.3: Let Γ = {γ1, γ2, γ3}. For this case, there are only 8 events. Define

M{γ1} = 0.6, M{γ2} = 0.3, M{γ3} = 0.2,
M{γ1, γ2} = 0.8, M{γ1, γ3} = 0.7, M{γ2, γ3} = 0.4,
M{∅} = 0, M{Γ} = 1.

Then M is an uncertain measure because it satisfies the four axioms.

Example 4.4: Let Γ = [−1, 1] with Borel algebra L and Lebesgue measure π. Then the set function

M{Λ} = π{Λ}          if π{Λ} < 0.5,
        1 − π{Λ^c}    if π{Λ^c} < 0.5,
        0.5           otherwise
is an uncertain measure.

Theorem 4.1 Suppose that M is an uncertain measure. Then we have

M{∅} = 0, (4.3)

0 ≤ M{Λ} ≤ 1 (4.4)

for any event Λ.

Proof: It follows from the normality and self-duality axioms that M{∅} = 1 − M{Γ} = 1 − 1 = 0. It follows from the monotonicity axiom that 0 ≤ M{Λ} ≤ 1 because ∅ ⊂ Λ ⊂ Γ.

Theorem 4.2 Suppose that M is an uncertain measure. Then for any events Λ1 and Λ2, we have
M{Λ1} ∨ M{Λ2} ≤ M{Λ1 ∪ Λ2} ≤ M{Λ1} + M{Λ2}. (4.5)

Proof: The left-hand inequality follows from the monotonicity axiom and the right-hand inequality follows from the countable subadditivity axiom immediately.

Theorem 4.3 Let Γ = {γ1, γ2, · · · }. If
M is an uncertain measure, then

M{γi} + M{γj} ≤ 1 ≤ Σ_{k=1}^∞ M{γk} (4.6)

for any i and j.
Proof: Since M is increasing and self-dual, we have, for any i and j,

M{γi} + M{γj} ≤ M{Γ\{γj}} + M{γj} = 1.

Since Γ = ∪k {γk} and M is countably subadditive, we have

1 = M{Γ} = M{∪_{k=1}^∞ {γk}} ≤ Σ_{k=1}^∞ M{γk}.

The theorem is proved.

Uncertainty Null-Additivity Theorem

Null-additivity is a direct deduction from subadditivity. We first prove a more general theorem.

Theorem 4.4 Let {Λi} be a decreasing sequence of events with M{Λi} → 0 as i → ∞. Then for any event Λ, we have

lim_{i→∞} M{Λ ∪ Λi} = lim_{i→∞} M{Λ\Λi} = M{Λ}. (4.7)
Proof: It follows from the monotonicity and countable subadditivity axioms that

M{Λ} ≤ M{Λ ∪ Λi} ≤ M{Λ} + M{Λi}

for each i. Thus we get M{Λ ∪ Λi} → M{Λ} by using M{Λi} → 0. Since (Λ\Λi) ⊂ Λ ⊂ ((Λ\Λi) ∪ Λi), we have

M{Λ\Λi} ≤ M{Λ} ≤ M{Λ\Λi} + M{Λi}.

Hence M{Λ\Λi} → M{Λ} by using M{Λi} → 0.

Remark 4.4: It follows from the above theorem that the uncertain measure is null-additive, i.e., M{Λ1 ∪ Λ2} = M{Λ1} + M{Λ2} if either M{Λ1} = 0 or M{Λ2} = 0. In other words, the uncertain measure remains unchanged if the event is enlarged or reduced by an event with measure zero.

Uncertainty Asymptotic Theorem

Theorem 4.5 (Uncertainty Asymptotic Theorem) For any events Λ1, Λ2, · · · , we have

lim_{i→∞} M{Λi} > 0, if Λi ↑ Γ, (4.8)

lim_{i→∞} M{Λi} < 1, if Λi ↓ ∅. (4.9)
Proof: Assume Λi ↑ Γ. Since Γ = ∪i Λi, it follows from the countable subadditivity axiom that

1 = M{Γ} ≤ Σ_{i=1}^∞ M{Λi}.

Since M{Λi} is increasing with respect to i, we have lim_{i→∞} M{Λi} > 0. If Λi ↓ ∅, then Λi^c ↑ Γ. It follows from the first inequality and the self-duality axiom that

lim_{i→∞} M{Λi} = 1 − lim_{i→∞} M{Λi^c} < 1.

The theorem is proved.

Example 4.5: Assume Γ is the set of real numbers. Let α be a number with 0 < α ≤ 0.5. Define a set function as follows,

M{Λ} = 0        if Λ = ∅,
        α        if Λ is upper bounded,
        0.5      if both Λ and Λ^c are upper unbounded,     (4.10)
        1 − α    if Λ^c is upper bounded,
        1        if Λ = Γ.

It is easy to verify that M is an uncertain measure. Write Λi = (−∞, i] for i = 1, 2, · · · Then Λi ↑ Γ and lim_{i→∞} M{Λi} = α. Furthermore, we have Λi^c ↓ ∅ and lim_{i→∞} M{Λi^c} = 1 − α.

Uncertainty Space

Definition 4.2 (Liu [132]) Let Γ be a nonempty set, L a σ-algebra over Γ, and M an uncertain measure. Then the triplet (Γ, L, M) is called an uncertainty space.
4.2 Uncertain Variables

Definition 4.3 (Liu [132]) An uncertain variable is a measurable function from an uncertainty space (Γ, L, M) to the set of real numbers, i.e., for any Borel set B of real numbers, the set

{ξ ∈ B} = {γ ∈ Γ | ξ(γ) ∈ B} (4.11)

is an event.

Example 4.6: Random variable, fuzzy variable and hybrid variable are instances of uncertain variable.
Definition 4.4 An uncertain variable ξ on the uncertainty space (Γ, L, M) is said to be
(a) nonnegative if M{ξ < 0} = 0;
(b) positive if M{ξ ≤ 0} = 0;
(c) continuous if M{ξ = x} is a continuous function of x;
(d) simple if there exists a finite sequence {x1, x2, · · · , xm} such that

M{ξ ≠ x1, ξ ≠ x2, · · · , ξ ≠ xm} = 0; (4.12)

(e) discrete if there exists a countable sequence {x1, x2, · · · } such that

M{ξ ≠ x1, ξ ≠ x2, · · · } = 0. (4.13)

It is clear that 0 ≤ M{ξ = x} ≤ 1, and there is at most one point x0 such that M{ξ = x0} > 0.5. For a continuous uncertain variable, we always have 0 ≤ M{ξ = x} ≤ 0.5.

Definition 4.5 Let ξ1 and ξ2 be uncertain variables defined on the uncertainty space (Γ, L, M). We say ξ1 = ξ2 if ξ1(γ) = ξ2(γ) for almost all γ ∈ Γ.

Uncertain Vector

Definition 4.6 An n-dimensional uncertain vector is a measurable function from an uncertainty space (Γ, L, M) to the set of n-dimensional real vectors, i.e., for any Borel set B of
183
Section 4.2 - Uncertain Variables
ℜ^n, the set

{ξ ∈ B} = {γ ∈ Γ | ξ(γ) ∈ B} (4.14)

is an event.

Theorem 4.6 The vector (ξ1, ξ2, · · · , ξn) is an uncertain vector if and only if ξ1, ξ2, · · · , ξn are uncertain variables.

Proof: Write ξ = (ξ1, ξ2, · · · , ξn). Suppose that ξ is an uncertain vector. For any Borel set B of ℜ, the set B × ℜ^{n−1} is a Borel set of ℜ^n. Thus the set

{ξ1 ∈ B} = {ξ ∈ B × ℜ^{n−1}}

is an event, and ξ1 is an uncertain variable. A similar process may prove that ξ2, · · · , ξn are uncertain variables. Conversely, suppose that ξ1, ξ2, · · · , ξn are uncertain variables, and define

B = { B ⊂ ℜ^n | {ξ ∈ B} is an event }.

The vector ξ = (ξ1, ξ2, · · · , ξn) is proved to be an uncertain vector if we can prove that B contains all Borel sets of ℜ^n. First, the class B contains all open intervals of ℜ^n because

{ξ ∈ Π_{i=1}^n (ai, bi)} = ∩_{i=1}^n {ξi ∈ (ai, bi)}

is an event. Next, the class B is a σ-algebra of ℜ^n: if B ∈ B, then {ξ ∈ B} is an event and so is its complement {ξ ∈ B^c}, which means B^c ∈ B; if Bi ∈ B for i = 1, 2, · · · , then

{ξ ∈ ∪_{i=1}^∞ Bi} = ∪_{i=1}^∞ {ξ ∈ Bi}

is an event. This means that ∪i Bi ∈ B. Since the smallest σ-algebra containing all open intervals of ℜ^n is just the Borel algebra of ℜ^n, the class B contains all Borel sets of ℜ^n. The theorem is proved.

Definition 4.7 Let f: ℜ^n → ℜ be a function, and ξ1, ξ2, · · · , ξn uncertain variables on the uncertainty space (Γ, L, M). Then ξ = f(ξ1, ξ2, · · · , ξn) is defined by

ξ(γ) = f(ξ1(γ), ξ2(γ), · · · , ξn(γ)), ∀γ ∈ Γ. (4.15)
Example 4.7: Let ξ1 and ξ2 be two uncertain variables. Then the sum ξ = ξ1 + ξ2 is an uncertain variable defined by

ξ(γ) = ξ1(γ) + ξ2(γ), ∀γ ∈ Γ.

The product ξ = ξ1 ξ2 is also an uncertain variable defined by

ξ(γ) = ξ1(γ) · ξ2(γ), ∀γ ∈ Γ.
The reader may wonder whether ξ = f(ξ1, ξ2, · · · , ξn) defined by (4.15) is an uncertain variable. The following theorem answers this question.

Theorem 4.7 Let ξ be an n-dimensional uncertain vector, and f: ℜ^n → ℜ a measurable function. Then f(ξ) is an uncertain variable.
Proof: Assume that ξ is an uncertain vector on the uncertainty space (Γ, L, M). For any Borel set B of ℜ, since f is a measurable function, f^{−1}(B) is a Borel set of ℜ^n. Thus the set

{f(ξ) ∈ B} = {ξ ∈ f^{−1}(B)}

is an event. Hence f(ξ) is an uncertain variable.
4.3 Identification Function
As we know, a random variable may be characterized by a probability density function, and a fuzzy variable may be described by a membership function. This section will introduce an identification function to characterize an uncertain variable.

Definition 4.8 An uncertain variable ξ is said to have an identification function (λ, ρ) if (i) λ(x) is a nonnegative function and ρ(x) is a nonnegative and integrable function such that

sup_{x∈ℝ} λ(x) + ∫_ℝ ρ(x)dx = 1;    (4.16)

(ii) for any Borel set B of real numbers, we have

M{ξ ∈ B} = (1/2) ( sup_{x∈B} λ(x) + sup_{x∈ℝ} λ(x) − sup_{x∈B^c} λ(x) ) + ∫_B ρ(x)dx.    (4.17)
Remark 4.5: It is not true that all uncertain variables have their own identification functions.

Remark 4.6: The uncertain variable with identification function (λ, ρ) is essentially a fuzzy variable if sup_{x∈ℝ} λ(x) = 1. For this case, λ is a membership function and ρ = 0 a.e.

Remark 4.7: The uncertain variable with identification function (λ, ρ) is essentially a random variable if ∫_ℝ ρ(x)dx = 1. For this case, ρ is a probability density function and λ ≡ 0.

Remark 4.8: Let ξ be an uncertain variable with identification function (λ, ρ). If λ(x) is a continuous function, then we have

M{ξ = x} = λ(x)/2,  ∀x ∈ ℝ.    (4.18)
Theorem 4.8 Suppose λ(x) is a nonnegative function and ρ(x) is a nonnegative and integrable function satisfying (4.16). Then there is an uncertain variable ξ such that (4.17) holds.

Proof: Let ℝ be the universal set. For each Borel set B of real numbers, we define a set function

M{B} = (1/2) ( sup_{x∈B} λ(x) + sup_{x∈ℝ} λ(x) − sup_{x∈B^c} λ(x) ) + ∫_B ρ(x)dx.

It is clear that M is normal, increasing, self-dual, and countably subadditive. That is, the set function M is indeed an uncertain measure. Now we define an uncertain variable ξ as the identity function from the uncertainty space (ℝ, A, M) to ℝ. We may verify that ξ meets (4.17). The theorem is proved.
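Formula (4.17) can be checked numerically. The sketch below is illustrative, not from the book: the particular λ and ρ are our own choices, set up so that sup λ = 0.6 and ∫ρ dx = 0.4, making (4.16) hold. It discretizes ℝ on a grid and verifies normality and self-duality of the resulting set function.

```python
import numpy as np

xs = np.linspace(-5.0, 5.0, 200001)
dx = xs[1] - xs[0]
lam = np.maximum(0.0, 0.6 - 0.6 * np.abs(xs))        # sup lambda = 0.6 at x = 0
rho = np.where(np.abs(xs - 2.0) <= 0.5, 0.4, 0.0)    # integrates to 0.4

def measure(mask):
    """M{xi in B} per (4.17), with the Borel set B given as a grid mask."""
    sup_B = lam[mask].max(initial=0.0)
    sup_Bc = lam[~mask].max(initial=0.0)
    integral = np.where(mask, rho, 0.0).sum() * dx    # Riemann sum over B
    return 0.5 * (sup_B + lam.max() - sup_Bc) + integral

B = xs >= 0.0                                        # the event {xi >= 0}
assert abs(measure(np.ones_like(xs, dtype=bool)) - 1.0) < 1e-3   # normality
assert abs(measure(B) + measure(~B) - 1.0) < 1e-3                # self-duality
```

Self-duality holds for every B because the two sup-terms over B and B^c swap and cancel, which is exactly what the proof of Theorem 4.8 relies on.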
4.4 Uncertainty Distribution
Definition 4.9 (Liu [132]) The uncertainty distribution Φ: ℝ → [0, 1] of an uncertain variable ξ is defined by

Φ(x) = M{ γ ∈ Γ | ξ(γ) ≤ x }.    (4.19)

Theorem 4.9 An uncertainty distribution is an increasing function such that

0 ≤ lim_{x→−∞} Φ(x) < 1,  0 < lim_{x→+∞} Φ(x) ≤ 1.    (4.20)

Proof: It is obvious that an uncertainty distribution Φ is an increasing function, and the inequalities (4.20) follow from the uncertainty asymptotic theorem immediately.

Definition 4.10 A continuous uncertain variable is said to be (a) singular if its uncertainty distribution is a singular function; (b) absolutely continuous if its uncertainty distribution is absolutely continuous.

Definition 4.11 (Liu [132]) The uncertainty density function φ: ℝ → [0, +∞) of an uncertain variable ξ is a function such that

Φ(x) = ∫_{−∞}^x φ(y)dy,  ∀x ∈ ℝ,    (4.21)

∫_{−∞}^{+∞} φ(y)dy = 1,    (4.22)

where Φ is the uncertainty distribution of ξ.
Example 4.8: The uncertainty density function may not exist even if the uncertainty distribution is continuous and differentiable a.e. Suppose f is the Cantor function, and set

Φ(x) = 0 if x < 0;  Φ(x) = f(x) if 0 ≤ x ≤ 1;  Φ(x) = 1 if x > 1.    (4.23)

Then Φ is an increasing and continuous function, and is an uncertainty distribution. Note that Φ′(x) = 0 almost everywhere, and

∫_{−∞}^{+∞} Φ′(x)dx = 0 ≠ 1.
Thus the uncertainty density function does not exist.

Theorem 4.10 Let ξ be an uncertain variable whose uncertainty density function φ exists. Then we have

M{ξ ≤ x} = ∫_{−∞}^x φ(y)dy,  M{ξ ≥ x} = ∫_x^{+∞} φ(y)dy.    (4.24)

Proof: The first part follows immediately from the definition. In addition, by the self-duality of uncertain measure, we have

M{ξ ≥ x} = 1 − M{ξ < x} = ∫_{−∞}^{+∞} φ(y)dy − ∫_{−∞}^x φ(y)dy = ∫_x^{+∞} φ(y)dy.
The theorem is proved.

Joint Uncertainty Distribution

Definition 4.12 Let (ξ1, ξ2, ···, ξn) be an uncertain vector. Then the joint uncertainty distribution Φ: ℝⁿ → [0, 1] is defined by

Φ(x1, x2, ···, xn) = M{ ξ1 ≤ x1, ξ2 ≤ x2, ···, ξn ≤ xn }.

Definition 4.13 The joint uncertainty density function φ: ℝⁿ → [0, +∞) of an uncertain vector (ξ1, ξ2, ···, ξn) is a function such that

Φ(x1, x2, ···, xn) = ∫_{−∞}^{x1} ∫_{−∞}^{x2} ··· ∫_{−∞}^{xn} φ(y1, y2, ···, yn) dy1 dy2 ··· dyn

holds for all (x1, x2, ···, xn) ∈ ℝⁿ, and

∫_{−∞}^{+∞} ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} φ(y1, y2, ···, yn) dy1 dy2 ··· dyn = 1,

where Φ is the joint uncertainty distribution of (ξ1, ξ2, ···, ξn).
4.5 Expected Value
Expected value is the average value of an uncertain variable in the sense of uncertain measure, and represents the size of the uncertain variable.

Definition 4.14 (Liu [132]) Let ξ be an uncertain variable. Then the expected value of ξ is defined by

E[ξ] = ∫_0^{+∞} M{ξ ≥ r}dr − ∫_{−∞}^0 M{ξ ≤ r}dr    (4.25)

provided that at least one of the two integrals is finite.

Theorem 4.11 Let ξ be an uncertain variable whose uncertainty density function φ exists. If the Lebesgue integral ∫_{−∞}^{+∞} xφ(x)dx is finite, then we have

E[ξ] = ∫_{−∞}^{+∞} xφ(x)dx.    (4.26)
Proof: It follows from the definition of expected value operator and the Fubini theorem that

E[ξ] = ∫_0^{+∞} M{ξ ≥ r}dr − ∫_{−∞}^0 M{ξ ≤ r}dr
= ∫_0^{+∞} [ ∫_r^{+∞} φ(x)dx ] dr − ∫_{−∞}^0 [ ∫_{−∞}^r φ(x)dx ] dr
= ∫_0^{+∞} [ ∫_0^x φ(x)dr ] dx − ∫_{−∞}^0 [ ∫_x^0 φ(x)dr ] dx
= ∫_0^{+∞} xφ(x)dx + ∫_{−∞}^0 xφ(x)dx
= ∫_{−∞}^{+∞} xφ(x)dx.
The theorem is proved.

Theorem 4.12 Let ξ be an uncertain variable with uncertainty distribution Φ. If

lim_{x→−∞} Φ(x) = 0,  lim_{x→+∞} Φ(x) = 1

and the Lebesgue–Stieltjes integral ∫_{−∞}^{+∞} x dΦ(x) is finite, then we have

E[ξ] = ∫_{−∞}^{+∞} x dΦ(x).    (4.27)

Proof: Since the Lebesgue–Stieltjes integral ∫_{−∞}^{+∞} x dΦ(x) is finite, we immediately have

lim_{y→+∞} ∫_0^y x dΦ(x) = ∫_0^{+∞} x dΦ(x),  lim_{y→−∞} ∫_y^0 x dΦ(x) = ∫_{−∞}^0 x dΦ(x)

and

lim_{y→+∞} ∫_y^{+∞} x dΦ(x) = 0,  lim_{y→−∞} ∫_{−∞}^y x dΦ(x) = 0.

It follows from

∫_y^{+∞} x dΦ(x) ≥ y ( lim_{z→+∞} Φ(z) − Φ(y) ) = y(1 − Φ(y)) ≥ 0  for y > 0,
∫_{−∞}^y x dΦ(x) ≤ y ( Φ(y) − lim_{z→−∞} Φ(z) ) = yΦ(y) ≤ 0  for y < 0

that

lim_{y→+∞} y(1 − Φ(y)) = 0,  lim_{y→−∞} yΦ(y) = 0.

Let 0 = x0 < x1 < x2 < ··· < xn = y be a partition of [0, y]. Then we have

Σ_{i=0}^{n−1} xi (Φ(x_{i+1}) − Φ(xi)) → ∫_0^y x dΦ(x)

and

Σ_{i=0}^{n−1} (1 − Φ(x_{i+1}))(x_{i+1} − xi) → ∫_0^y M{ξ ≥ r}dr

as max{|x_{i+1} − xi| : i = 0, 1, ···, n−1} → 0. Since

Σ_{i=0}^{n−1} xi (Φ(x_{i+1}) − Φ(xi)) − Σ_{i=0}^{n−1} (1 − Φ(x_{i+1}))(x_{i+1} − xi) = y(Φ(y) − 1) → 0

as y → +∞, this fact implies that

∫_0^{+∞} M{ξ ≥ r}dr = ∫_0^{+∞} x dΦ(x).

A similar way may prove that

∫_{−∞}^0 M{ξ ≤ r}dr = −∫_{−∞}^0 x dΦ(x).
It follows that the equation (4.27) holds.

Theorem 4.13 Let ξ be an uncertain variable with finite expected value. Then for any real numbers a and b, we have

E[aξ + b] = aE[ξ] + b.    (4.28)
Proof: Step 1: We first prove that E[ξ + b] = E[ξ] + b for any real number b. If b ≥ 0, we have

E[ξ + b] = ∫_0^{+∞} M{ξ + b ≥ r}dr − ∫_{−∞}^0 M{ξ + b ≤ r}dr
= ∫_0^{+∞} M{ξ ≥ r − b}dr − ∫_{−∞}^0 M{ξ ≤ r − b}dr
= E[ξ] + ∫_0^b ( M{ξ ≥ r − b} + M{ξ < r − b} ) dr
= E[ξ] + b.

If b < 0, then we have

E[ξ + b] = E[ξ] − ∫_b^0 ( M{ξ ≥ r − b} + M{ξ < r − b} ) dr = E[ξ] + b.

Step 2: We prove E[aξ] = aE[ξ]. If a = 0, then the equation E[aξ] = aE[ξ] holds trivially. If a > 0, we have

E[aξ] = ∫_0^{+∞} M{aξ ≥ r}dr − ∫_{−∞}^0 M{aξ ≤ r}dr
= ∫_0^{+∞} M{ξ ≥ r/a}dr − ∫_{−∞}^0 M{ξ ≤ r/a}dr
= a ∫_0^{+∞} M{ξ ≥ t}dt − a ∫_{−∞}^0 M{ξ ≤ t}dt
= aE[ξ].

If a < 0, we have

E[aξ] = ∫_0^{+∞} M{aξ ≥ r}dr − ∫_{−∞}^0 M{aξ ≤ r}dr
= ∫_0^{+∞} M{ξ ≤ r/a}dr − ∫_{−∞}^0 M{ξ ≥ r/a}dr
= a ∫_0^{+∞} M{ξ ≥ t}dt − a ∫_{−∞}^0 M{ξ ≤ t}dt
= aE[ξ].
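Theorem 4.13 can be checked on a small example. The following sketch is not from the book — the space, the measure values, and the variable are illustrative. It computes E[·] via the two-part integral (4.25), which reduces to a finite sum when the variable takes finitely many values, and then verifies E[aξ + b] = aE[ξ] + b.

```python
from fractions import Fraction

GAMMA = ("g1", "g2", "g3")
M = {frozenset(): Fraction(0), frozenset(GAMMA): Fraction(1)}
for s in [("g1",), ("g2",), ("g3",), ("g1", "g2"), ("g1", "g3"), ("g2", "g3")]:
    M[frozenset(s)] = Fraction(1, 2)

def expected(var):
    """E[var] = int_0^inf M{var>=r}dr - int_{-inf}^0 M{var<=r}dr.
    Both integrands are step functions here, so the integrals are sums
    over the intervals between adjacent attained values (0 included)."""
    vals = sorted(set(var.values()) | {0})
    total = Fraction(0)
    for lo, hi in zip(vals, vals[1:]):
        if lo >= 0:   # piece of int_0^inf M{var >= r} dr over (lo, hi]
            ev = frozenset(g for g in GAMMA if var[g] >= hi)
            total += (hi - lo) * M[ev]
        else:         # piece of -int_{-inf}^0 M{var <= r} dr over [lo, hi)
            ev = frozenset(g for g in GAMMA if var[g] <= lo)
            total -= (hi - lo) * M[ev]
    return total

xi = {"g1": -1, "g2": 1, "g3": 2}
a, b = Fraction(3), Fraction(-2)
axb = {g: a * v + b for g, v in xi.items()}
assert expected(xi) == Fraction(1, 2)
assert expected(axb) == a * expected(xi) + b   # Theorem 4.13
```

Note that linearity here holds for a single variable under any uncertain measure; linearity across different variables needs the independence of Section 4.8.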
Finally, for any real numbers a and b, it follows from Steps 1 and 2 that the theorem holds.

Theorem 4.14 Let f be a convex function on [a, b], and ξ an uncertain variable that takes values in [a, b] and has expected value e. Then

E[f(ξ)] ≤ ((b − e)/(b − a)) f(a) + ((e − a)/(b − a)) f(b).    (4.29)

Proof: For each γ ∈ Γ, we have a ≤ ξ(γ) ≤ b and

ξ(γ) = ((b − ξ(γ))/(b − a)) a + ((ξ(γ) − a)/(b − a)) b.

It follows from the convexity of f that

f(ξ(γ)) ≤ ((b − ξ(γ))/(b − a)) f(a) + ((ξ(γ) − a)/(b − a)) f(b).

Taking expected values on both sides, we obtain the inequality.
4.6 Variance
The variance of an uncertain variable provides a measure of the spread of the distribution around its expected value. A small variance indicates that the uncertain variable is tightly concentrated around its expected value; a large variance indicates a wide spread around the expected value.

Definition 4.15 (Liu [132]) Let ξ be an uncertain variable with finite expected value e. Then the variance of ξ is defined by V[ξ] = E[(ξ − e)²].

Theorem 4.15 If ξ is an uncertain variable with finite expected value, and a and b are real numbers, then V[aξ + b] = a²V[ξ].

Proof: It follows from the definition of variance that

V[aξ + b] = E[(aξ + b − aE[ξ] − b)²] = a²E[(ξ − E[ξ])²] = a²V[ξ].

Theorem 4.16 Let ξ be an uncertain variable with expected value e. Then V[ξ] = 0 if and only if M{ξ = e} = 1.

Proof: If V[ξ] = 0, then E[(ξ − e)²] = 0. Note that

E[(ξ − e)²] = ∫_0^{+∞} M{(ξ − e)² ≥ r}dr,

which implies M{(ξ − e)² ≥ r} = 0 for any r > 0. Hence we have M{(ξ − e)² = 0} = 1.
That is, M{ξ = e} = 1. Conversely, if M{ξ = e} = 1, then we have M{(ξ − e)² = 0} = 1 and M{(ξ − e)² ≥ r} = 0 for any r > 0. Thus

V[ξ] = ∫_0^{+∞} M{(ξ − e)² ≥ r}dr = 0.

The theorem is proved.

Theorem 4.17 Let ξ be an uncertain variable that takes values in [a, b] and has expected value e. Then

V[ξ] ≤ (e − a)(b − e).    (4.30)
Proof: It follows from Theorem 4.14 immediately by defining f(x) = (x − e)².
4.7 Moments
Definition 4.16 (Liu [132]) Let ξ be an uncertain variable. Then for any positive integer k, (a) the expected value E[ξ^k] is called the kth moment; (b) the expected value E[|ξ|^k] is called the kth absolute moment; (c) the expected value E[(ξ − E[ξ])^k] is called the kth central moment; (d) the expected value E[|ξ − E[ξ]|^k] is called the kth absolute central moment. Note that the first central moment is always 0, the first moment is just the expected value, and the second central moment is just the variance.

Theorem 4.18 Let ξ be a nonnegative uncertain variable, and k a positive number. Then the kth moment is

E[ξ^k] = k ∫_0^{+∞} r^{k−1} M{ξ ≥ r}dr.    (4.31)

Proof: It follows from the nonnegativity of ξ that

E[ξ^k] = ∫_0^∞ M{ξ^k ≥ x}dx = ∫_0^∞ M{ξ ≥ r} d(r^k) = k ∫_0^∞ r^{k−1} M{ξ ≥ r}dr.
The theorem is proved.

Theorem 4.19 Let ξ be an uncertain variable, and t a positive number. If E[|ξ|^t] < ∞, then

lim_{x→∞} x^t M{|ξ| ≥ x} = 0.    (4.32)

Conversely, if (4.32) holds for some positive number t, then E[|ξ|^s] < ∞ for any 0 ≤ s < t.
Proof: It follows from the definition of expected value operator that

E[|ξ|^t] = ∫_0^{+∞} M{|ξ|^t ≥ r}dr < ∞.

Thus we have

lim_{x→∞} ∫_{x^t/2}^{+∞} M{|ξ|^t ≥ r}dr = 0.

The equation (4.32) is proved by the following relation:

∫_{x^t/2}^{+∞} M{|ξ|^t ≥ r}dr ≥ ∫_{x^t/2}^{x^t} M{|ξ|^t ≥ r}dr ≥ (1/2) x^t M{|ξ| ≥ x}.

Conversely, if (4.32) holds, then there exists a number a such that x^t M{|ξ| ≥ x} ≤ 1 for all x ≥ a. Thus we have

E[|ξ|^s] = ∫_0^a M{|ξ|^s ≥ r}dr + ∫_a^{+∞} M{|ξ|^s ≥ r}dr
= ∫_0^a M{|ξ|^s ≥ r}dr + ∫_a^{+∞} s r^{s−1} M{|ξ| ≥ r}dr
≤ ∫_0^a M{|ξ|^s ≥ r}dr + s ∫_a^{+∞} r^{s−t−1}dr
< +∞

by ∫_a^{+∞} r^p dr < ∞ for any p < −1.
The theorem is proved.

Theorem 4.20 Let ξ be an uncertain variable that takes values in [a, b] and has expected value e. Then for any positive integer k, the kth absolute moment and kth absolute central moment satisfy the following inequalities:

E[|ξ|^k] ≤ ((b − e)/(b − a)) |a|^k + ((e − a)/(b − a)) |b|^k,    (4.33)

E[|ξ − e|^k] ≤ ((b − e)/(b − a)) (e − a)^k + ((e − a)/(b − a)) (b − e)^k.    (4.34)

Proof: It follows from Theorem 4.14 immediately by defining f(x) = |x|^k and f(x) = |x − e|^k.
4.8 Independence
Definition 4.17 (Liu [132]) The uncertain variables ξ1 , ξ2 , · · · , ξn are said to be independent if " n # n X X E fi (ξi ) = E[fi (ξi )] (4.35) i=1
i=1
for any measurable functions f1 , f2 , · · · , fn provided that the expected values exist and are finite. What are the sufficient conditions for independence of uncertain variables? This is an open problem. Theorem 4.21 If ξ and η are independent uncertain variables with finite expected values, then we have E[aξ + bη] = aE[ξ] + bE[η]
(4.36)
for any real numbers a and b. Proof: The theorem follows from the definition by defining f1 (x) = ax and f2 (x) = bx. Theorem 4.22 Suppose that ξ1 , ξ2 , · · · , ξn are independent uncertain variables, and f1 , f2 , · · · , fn are measurable functions. Then the uncertain variables f1 (ξ1 ), f2 (ξ2 ), · · · , fn (ξn ) are independent. Proof: The theorem follows from the definition because the compound of measurable functions is also measurable.
4.9
Identical Distribution
This section introduces the concept of identical distribution of uncertain variables. Definition 4.18 The uncertain variables ξ and η are identically distributed if M{ξ ∈ B} = M{η ∈ B} (4.37) for any Borel set B of real numbers. Theorem 4.23 Let ξ and η be identically distributed uncertain variables, and f : < → < a measurable function. Then f (ξ) and f (η) are identically distributed uncertain variables.
194
Chapter 4 - Uncertainty Theory
Proof: For any Borel set B of real numbers, we have
M{f (ξ) ∈ B} = M{ξ ∈ f −1 (B)} = M{η ∈ f −1 (B)} = M{f (η) ∈ B}. Hence f (ξ) and f (η) are identically distributed uncertain variables. Theorem 4.24 If ξ and η are identically distributed uncertain variables, then they have the same uncertainty distribution. Proof: Since ξ and η are identically distributed uncertain variables, we have Thus ξ and η have the same uncertainty distribution.
M{ξ ∈ (−∞, x]} = M{η ∈ (−∞, x]} for any x.
Theorem 4.25 If ξ and η are identically distributed uncertain variables whose uncertainty density functions exist, then they have the same uncertainty density function. Proof: It follows from Theorem 4.24 immediately.
4.10
Critical Values
Definition 4.19 (Liu [132]) Let ξ be an uncertain variable, and α ∈ (0, 1]. Then ξsup (α) = sup r M {ξ ≥ r} ≥ α (4.38) is called the α-optimistic value to ξ, and ξinf (α) = inf r M {ξ ≤ r} ≥ α
(4.39)
is called the α-pessimistic value to ξ. Theorem 4.26 Let ξ be an uncertain variable and α a number between 0 and 1. We have (a) if c ≥ 0, then (cξ)sup (α) = cξsup (α) and (cξ)inf (α) = cξinf (α); (b) if c < 0, then (cξ)sup (α) = cξinf (α) and (cξ)inf (α) = cξsup (α). Proof: (a) If c = 0, then the part (a) is obvious. In the case of c > 0, we have (cξ)sup (α) = sup{r M{cξ ≥ r} ≥ α} = c sup{r/c |
M{ξ ≥ r/c} ≥ α}
= cξsup (α). A similar way may prove (cξ)inf (α) = cξinf (α). In order to prove the part (b), it suffices to prove that (−ξ)sup (α) = −ξinf (α) and (−ξ)inf (α) = −ξsup (α). In fact, we have (−ξ)sup (α) = sup{r M{−ξ ≥ r} ≥ α} = − inf{−r |
M{ξ ≤ −r} ≥ α}
= −ξinf (α). Similarly, we may prove that (−ξ)inf (α) = −ξsup (α). The theorem is proved.
195
Section 4.11 - Entropy
Theorem 4.27 Let ξ be an uncertain variable. Then we have (a) if α > 0.5, then ξinf (α) ≥ ξsup (α); (b) if α ≤ 0.5, then ξinf (α) ≤ ξsup (α). ¯ Proof: Part (a): Write ξ(α) = (ξinf (α) + ξsup (α))/2. If ξinf (α) < ξsup (α), then we have ¯ ¯ 1 ≥ M{ξ < ξ(α)} + M{ξ > ξ(α)} ≥ α + α > 1. A contradiction proves ξinf (α) ≥ ξsup (α). Part (b): Assume that ξinf (α) > ξsup (α). It follows from the definition ¯ of ξinf (α) that M{ξ ≤ ξ(α)} < α. Similarly, it follows from the definition of ¯ ξsup (α) that M{ξ ≥ ξ(α)} < α. Thus ¯ ¯ 1 ≤ M{ξ ≤ ξ(α)} + M{ξ ≥ ξ(α)} < α + α ≤ 1.
A contradiction proves ξinf (α) ≤ ξsup (α). The theorem is verified. Theorem 4.28 Let ξ be an uncertain variable. Then ξsup (α) is a decreasing function of α, and ξinf (α) is an increasing function of α. Proof: It follows from the definition immediately.
4.11
Entropy
This section provides a definition of entropy to characterize the uncertainty of uncertain variables resulting from information deficiency. Definition 4.20 (Liu [132]) Suppose that ξ is a discrete uncertain variable taking values in {x1 , x2 , · · · }. Then its entropy is defined by H[ξ] =
∞ X
S(M{ξ = xi })
(4.40)
i=1
where S(t) = −t ln t − (1 − t) ln(1 − t). Example 4.9: Suppose that ξ is a discrete uncertain variable taking values in {x1 , x2 , · · · }. If there exists some index k such that M{ξ = xk } = 1, and 0 otherwise, then its entropy H[ξ] = 0. Example 4.10: Suppose that ξ is a simple uncertain variable taking values in {x1 , x2 , · · · , xn }. If M{ξ = xi } = 0.5 for all i = 1, 2, · · · , n, then its entropy H[ξ] = n ln 2. Theorem 4.29 Suppose that ξ is a discrete uncertain variable taking values in {x1 , x2 , · · · }. Then H[ξ] ≥ 0 (4.41) and equality holds if and only if ξ is essentially a deterministic/crisp number.
196
Chapter 4 - Uncertainty Theory
Proof: The nonnegativity is clear. In addition, H[ξ] = 0 if and only if
M{ξ = xi } = 0 or 1 for each i. That is, there exists one and only one index k such that M{ξ = xk } = 1, i.e., ξ is essentially a deterministic/crisp number.
This theorem states that the entropy of an uncertain variable reaches its minimum 0 when the uncertain variable degenerates to a deterministic/crisp number. In this case, there is no uncertainty. Theorem 4.30 Suppose that ξ is a simple uncertain variable taking values in {x1 , x2 , · · · , xn }. Then H[ξ] ≤ n ln 2 and equality holds if and only if
(4.42)
M{ξ = xi } = 0.5 for all i = 1, 2, · · · , n.
Proof: Since the function S(t) reaches its maximum ln 2 at t = 0.5, we have H[ξ] =
n X
S(M{ξ = xi }) ≤ n ln 2
i=1
and equality holds if and only if
M{ξ = xi } = 0.5 for all i = 1, 2, · · · , n.
This theorem states that the entropy of an uncertain variable reaches its maximum when the uncertain variable is an equipossible one. In this case, there is no preference among all the values that the uncertain variable will take.
4.12
Distance
Definition 4.21 (Liu [132]) The distance between uncertain variables ξ and η is defined as d(ξ, η) = E[|ξ − η|].
(4.43)
Theorem 4.31 Let ξ, η, τ be uncertain variables, and let d(·, ·) be the distance. Then we have (a) (Nonnegativity) d(ξ, η) ≥ 0; (b) (Identification) d(ξ, η) = 0 if and only if ξ = η; (c) (Symmetry) d(ξ, η) = d(η, ξ); (d) (Triangle Inequality) d(ξ, η) ≤ 2d(ξ, τ ) + 2d(η, τ ). Proof: The parts (a), (b) and (c) follow immediately from the definition. Now we prove the part (d). It follows from the countable subadditivity axiom
197
Section 4.13 - Inequalities
that Z
+∞
M {|ξ − η| ≥ r} dr
+∞
M {|ξ − τ | + |τ − η| ≥ r} dr
+∞
M {{|ξ − τ | ≥ r/2} ∪ {|τ − η| ≥ r/2}} dr
+∞
(M{|ξ − τ | ≥ r/2} + M{|τ − η| ≥ r/2}) dr
+∞
M{|ξ − τ | ≥ r/2}dr +
d(ξ, η) = 0
Z ≤ 0
Z ≤ 0
Z ≤ 0
Z = 0
Z
+∞
0
M{|τ − η| ≥ r/2}dr
= 2E[|ξ − τ |] + 2E[|τ − η|] = 2d(ξ, τ ) + 2d(τ, η).
Example 4.11: Let Γ = {γ1 , γ2 , γ3 }. Define M{∅} = 0, M{Γ} = 1 and M{Λ} = 1/2 for any subset Λ (excluding ∅ and Γ). We set uncertain variables ξ, η and τ as follows, ( ξ(γ) =
1, if γ 6= γ3 0, otherwise,
( η(γ) =
−1, if γ 6= γ1 0, otherwise,
τ (γ) ≡ 0.
It is easy to verify that d(ξ, τ ) = d(τ, η) = 1/2 and d(ξ, η) = 3/2. Thus
d(ξ, η) =
4.13
3 (d(ξ, τ ) + d(τ, η)). 2
Inequalities
Theorem 4.32 (Liu [132]) Let ξ be an uncertain variable, and f a nonnegative function. If f is even and increasing on [0, ∞), then for any given number t > 0, we have (ξ)] . M{|ξ| ≥ t} ≤ E[f f (t)
(4.44)
198
Chapter 4 - Uncertainty Theory
Proof: It is clear that M{|ξ| ≥ f −1 (r)} is a monotone decreasing function of r on [0, ∞). It follows from the nonnegativity of f (ξ) that Z
+∞
M{f (ξ) ≥ r}dr
+∞
M{|ξ| ≥ f −1 (r)}dr
f (t)
M{|ξ| ≥ f −1 (r)}dr
f (t)
dr · M{|ξ| ≥ f −1 (f (t))}
E[f (ξ)] = 0
Z = 0
Z ≥ 0
Z ≥ 0
= f (t) · M{|ξ| ≥ t} which proves the inequality. Theorem 4.33 (Liu [132], Markov Inequality) Let ξ be an uncertain variable. Then for any given numbers t > 0 and p > 0, we have p
] M{|ξ| ≥ t} ≤ E[|ξ| . tp
(4.45)
Proof: It is a special case of Theorem 4.32 when f (x) = |x|p . Theorem 4.34 (Liu [132], Chebyshev Inequality) Let ξ be an uncertain variable whose variance V [ξ] exists. Then for any given number t > 0, we have
M {|ξ − E[ξ]| ≥ t} ≤ Vt[ξ] . 2
(4.46)
Proof: It is a special case of Theorem 4.32 when the uncertain variable ξ is replaced with ξ − E[ξ], and f (x) = x2 . Theorem 4.35 (Liu [132], H¨ older’s Inequality) Let p and q be positive real numbers with 1/p + 1/q = 1, and let ξ and η be independent uncertain variables with E[|ξ|p ] < ∞ and E[|η|q ] < ∞. Then we have p p E[|ξη|] ≤ p E[|ξ|p ] q E[|η|q ]. (4.47) Proof: The inequality holds trivially if at least one of ξ and η is zero a.s. Now we assume√E[|ξ|p ] > 0 and E[|η|q ] > 0. It is easy to prove that the function √ f (x, y) = p x q y is a concave function on D = {(x, y) : x ≥ 0, y ≥ 0}. Thus for any point (x0 , y0 ) with x0 > 0 and y0 > 0, there exist two real numbers a and b such that f (x, y) − f (x0 , y0 ) ≤ a(x − x0 ) + b(y − y0 ),
∀(x, y) ∈ D.
199
Section 4.14 - Convergence Concepts
Letting x0 = E[|ξ|p ], y0 = E[|η|q ], x = |ξ|p and y = |η|q , we have f (|ξ|p , |η|q ) − f (E[|ξ|p ], E[|η|q ]) ≤ a(|ξ|p − E[|ξ|p ]) + b(|η|q − E[|η|q ]). Taking the expected values on both sides, we obtain E[f (|ξ|p , |η|q )] ≤ f (E[|ξ|p ], E[|η|q ]). Hence the inequality (4.47) holds. Theorem 4.36 (Liu [132], Minkowski Inequality) Let p be a real number with p ≥ 1, and let ξ and η be independent uncertain variables with E[|ξ|p ] < ∞ and E[|η|p ] < ∞. Then we have p p p p E[|ξ + η|p ] ≤ p E[|ξ|p ] + p E[|η|p ]. (4.48) Proof: The inequality holds trivially if at least one of ξ and η is zero a.s. Now we assume √ E[|ξ|p ] > 0 and E[|η|p ] > 0. It is easy to prove that the function √ f (x, y) = ( p x + p y)p is a concave function on D = {(x, y) : x ≥ 0, y ≥ 0}. Thus for any point (x0 , y0 ) with x0 > 0 and y0 > 0, there exist two real numbers a and b such that f (x, y) − f (x0 , y0 ) ≤ a(x − x0 ) + b(y − y0 ),
∀(x, y) ∈ D.
Letting x0 = E[|ξ|p ], y0 = E[|η|p ], x = |ξ|p and y = |η|p , we have f (|ξ|p , |η|p ) − f (E[|ξ|p ], E[|η|p ]) ≤ a(|ξ|p − E[|ξ|p ]) + b(|η|p − E[|η|p ]). Taking the expected values on both sides, we obtain E[f (|ξ|p , |η|p )] ≤ f (E[|ξ|p ], E[|η|p ]). Hence the inequality (4.48) holds. Theorem 4.37 (Liu [132], Jensen’s Inequality) Let ξ be an uncertain variable, and f : < → < a convex function. If E[ξ] and E[f (ξ)] are finite, then f (E[ξ]) ≤ E[f (ξ)].
(4.49)
Especially, when f (x) = |x|p and p ≥ 1, we have |E[ξ]|p ≤ E[|ξ|p ]. Proof: Since f is a convex function, for each y, there exists a number k such that f (x) − f (y) ≥ k · (x − y). Replacing x with ξ and y with E[ξ], we obtain f (ξ) − f (E[ξ]) ≥ k · (ξ − E[ξ]). Taking the expected values on both sides, we have E[f (ξ)] − f (E[ξ]) ≥ k · (E[ξ] − E[ξ]) = 0 which proves the inequality.
200
4.14
Chapter 4 - Uncertainty Theory
Convergence Concepts
We have the following four convergence concepts of uncertain sequence: convergence almost surely (a.s.), convergence in measure, convergence in mean, and convergence in distribution. Table 4.1: Relationship among Convergence Concepts Convergence in Mean
Convergence
⇒
in Measure
⇒
Convergence in Distribution
Definition 4.22 (Liu [132]) Suppose that ξ, ξ1 , ξ2 , · · · are uncertain variables defined on the uncertainty space (Γ, L, M). The sequence {ξi } is said to be convergent a.s. to ξ if there exists an event Λ with M{Λ} = 1 such that lim |ξi (γ) − ξ(γ)| = 0
i→∞
(4.50)
for every γ ∈ Λ. In that case we write ξi → ξ, a.s. Definition 4.23 (Liu [132]) Suppose that ξ, ξ1 , ξ2 , · · · are uncertain variables. We say that the sequence {ξi } converges in measure to ξ if lim
i→∞
M {|ξi − ξ| ≥ ε} = 0
(4.51)
for every ε > 0. Definition 4.24 (Liu [132]) Suppose that ξ, ξ1 , ξ2 , · · · are uncertain variables with finite expected values. We say that the sequence {ξi } converges in mean to ξ if lim E[|ξi − ξ|] = 0. (4.52) i→∞
In addition, the sequence {ξi } is said to converge in mean square to ξ if lim E[|ξi − ξ|2 ] = 0.
i→∞
(4.53)
Definition 4.25 (Liu [132]) Suppose that Φ, Φ1 , Φ2 , · · · are the uncertainty distributions of uncertain variables ξ, ξ1 , ξ2 , · · · , respectively. We say that {ξi } converges in distribution to ξ if Φi → Φ at any continuity point of Φ. Theorem 4.26 (Liu [132]) Suppose that ξ, ξ1 , ξ2 , · · · are uncertain variables. If {ξi } converges in mean to ξ, then {ξi } converges in measure to ξ. Proof: It follows from the Markov inequality that for any given number ε > 0, we have M{|ξi − ξ| ≥ ε} ≤ E[|ξiε− ξ|] → 0 as i → ∞. Thus {ξi } converges in measure to ξ. The theorem is proved.
Section 4.15 - Conditional Uncertainty
201
Theorem 4.27 (Liu [132]) Suppose ξ, ξ1 , ξ2 , · · · are uncertain variables. If {ξi } converges in measure to ξ, then {ξi } converges in distribution to ξ. Proof: Let x be a given continuity point of the uncertainty distribution Φ. On the one hand, for any y > x, we have {ξi ≤ x} = {ξi ≤ x, ξ ≤ y} ∪ {ξi ≤ x, ξ > y} ⊂ {ξ ≤ y} ∪ {|ξi − ξ| ≥ y − x}. It follows from the countable subadditivity axiom that Φi (x) ≤ Φ(y) + M{|ξi − ξ| ≥ y − x}. Since {ξi } converges in measure to ξ, we have M{|ξi − ξ| ≥ y − x} → 0 as i → ∞. Thus we obtain lim supi→∞ Φi (x) ≤ Φ(y) for any y > x. Letting y → x, we get lim sup Φi (x) ≤ Φ(x). (4.54) i→∞
On the other hand, for any z < x, we have {ξ ≤ z} = {ξi ≤ x, ξ ≤ z} ∪ {ξi > x, ξ ≤ z} ⊂ {ξi ≤ x} ∪ {|ξi − ξ| ≥ x − z} which implies that Φ(z) ≤ Φi (x) + M{|ξi − ξ| ≥ x − z}. Since M{|ξi − ξ| ≥ x − z} → 0, we obtain Φ(z) ≤ lim inf i→∞ Φi (x) for any z < x. Letting z → x, we get Φ(x) ≤ lim inf Φi (x). i→∞
(4.55)
It follows from (4.54) and (4.55) that Φi (x) → Φ(x). The theorem is proved.
4.15
Conditional Uncertainty
We consider the uncertain measure of an event A after it has been learned that some other event B has occurred. This new uncertain measure of A is called the conditional uncertain measure of A given B. In order to define a conditional uncertain measure M{A|B}, at first we have to enlarge M{A ∩ B} because M{A ∩ B} < 1 for all events whenever M{B} < 1. It seems that we have no alternative but to divide M{A ∩ B} by M{B}. Unfortunately, M{A∩B}/M{B} is not always an uncertain measure. However, the value M{A|B} should not be greater than M{A ∩ B}/M{B} (otherwise the normality will be lost), i.e., {A ∩ B} M{A|B} ≤ MM . {B}
(4.56)
202
Chapter 4 - Uncertainty Theory
On the other hand, in order to preserve the self-duality, we should have ∩ B} M{A|B} = 1 − M{Ac |B} ≥ 1 − M{A M{B} . c
(4.57)
Furthermore, since (A ∩ B) ∪ (Ac ∩ B) = B, we have M{B} ≤ M{A ∩ B} + M{Ac ∩ B} by using the countable subadditivity axiom. Thus
M{Ac ∩ B} ≤ M{A ∩ B} ≤ 1. (4.58) M{B} M{B} Hence any numbers between 1− M{Ac ∩B}/M{B} and M{A∩B}/M{B} are 0≤1−
reasonable values that the conditional uncertain measure may take. Based on the maximum uncertainty principle, we have the following conditional uncertain measure.
Definition 4.28 (Liu [132]) Let (Γ, L, M) be an uncertainty space, and A, B ∈ L. Then the conditional uncertain measure of A given B is defined by M{A ∩ B} , if M{A ∩ B} < 0.5 M{B} M{B} c M{A|B} = 1 − M{A ∩ B} , if M{Ac ∩ B} < 0.5 (4.59) M {B} M {B}
provided that
0.5,
otherwise
M{B} > 0.
It follows immediately from the definition of conditional uncertain measure that M{Ac ∩ B} ≤ M{A|B} ≤ M{A ∩ B} . 1− (4.60) M{B} M{B} Furthermore, the conditional uncertain measure obeys the maximum uncertainty principle, and takes values as close to 0.5 as possible. Remark 4.9: Conditional uncertain measure coincides with conditional probability, conditional credibility, and conditional chance. Theorem 4.38 (Liu [132]) Let (Γ, L, M) be an uncertainty space, and B an event with M{B} > 0. Then M{·|B} defined by (4.59) is an uncertain measure, and (Γ, L, M{·|B}) is an uncertainty space. Proof: It is sufficient to prove that M{·|B} satisfies the normality, monotonicity, self-duality and countable subadditivity axioms. At first, it satisfies the normality axiom, i.e., c M{∅} = 1. ∩ B} =1− M{Γ|B} = 1 − M{Γ M{B} M{B}
203
Section 4.15 - Conditional Uncertainty
For any events A1 and A2 with A1 ⊂ A2 , if
M{A1 ∩ B} ≤ M{A2 ∩ B} < 0.5, M{B} M{B} then
M{A2 ∩ B} 1 ∩ B} M{A1 |B} = M{A M{B} ≤ M{B} = M{A2 |B}.
If
M{A1 ∩ B} ≤ 0.5 ≤ M{A2 ∩ B} , M{B} M{B} then M{A1 |B} ≤ 0.5 ≤ M{A2 |B}. If M{A1 ∩ B} ≤ M{A2 ∩ B} , 0.5 < M{B} M{B} then we have M{A1 |B} = 1 −
M{Ac1 ∩ B} ∨ 0.5 ≤ 1 − M{Ac2 ∩ B} ∨ 0.5 = M{A |B}. 2 M{B} M{B} This means that M{·|B} satisfies the monotonicity axiom. For any event A, if M{A ∩ B} ≥ 0.5, M{Ac ∩ B} ≥ 0.5, M{B} M{B} c then we have M{A|B} + M{A |B} = 0.5 + 0.5 = 1 immediately. Otherwise,
without loss of generality, suppose
M{A ∩ B} < 0.5 < M{Ac ∩ B} , M{B} M{B} then we have M {A ∩ B} M {A ∩ B} M{A|B} + M{A |B} = M{B} + 1 − M{B} = 1. That is, M{·|B} satisfies the self-duality axiom. Finally, for any countable sequence {Ai } of events, if M{Ai |B} < 0.5 for all i, it follows from the c
countable subadditivity axiom that (∞ ) ∞ X [ (∞ ) M Ai ∩ B M{Ai ∩ B} X ∞ [ i=1 i=1 M Ai ∩ B ≤ ≤ = M{Ai |B}. M{B} M{B} i=1 i=1 Suppose there is one term greater than 0.5, say
M{A1 |B} ≥ 0.5, M{Ai |B} < 0.5,
i = 2, 3, · · ·
204 If
Chapter 4 - Uncertainty Theory
M{∪i Ai |B} = 0.5, then we immediately have M
(∞ [
) Ai ∩ B
≤
i=1
∞ X
M{Ai |B}.
i=1
If M{∪i Ai |B} > 0.5, we may prove the above inequality by the following facts: ! ∞ ∞ [ \ Ac1 ∩ B ⊂ (Ai ∩ B) ∪ Aci ∩ B , i=2
M
{Ac1
∩ B} ≤
∞ X
i=1
M{Ai ∩ B} + M
(∞ \
i=2
M
(∞ [
∩B
,
i=1
M
) Ai |B
) Aci
=1−
i=1
(∞ \
) Aci
i=1
∩B
M{B}
,
∞ X
M{Ai ∩ B} M {Ac1 ∩ B} i=2 M{Ai |B} ≥ 1 − M{B} + M{B} . i=1
∞ X
If there are at least two terms greater than 0.5, then the countable subadditivity is clearly true. Thus M{·|B} satisfies the countable subadditivity axiom. Hence M{·|B} is an uncertain measure. Furthermore, (Γ, L, M{·|B}) is an uncertainty space. Example 4.12: Let ξ and η be two uncertain variables. Then we have
M {ξ = x|η = y} =
provided that
M{ξ = x, η = y} , M{η = y} M{ξ 6= x, η = y} , 1− M{η = y} 0.5,
M{ξ = x, η = y} < 0.5 M{η = y} M{ξ 6= x, η = y} < 0.5 if M{η = y} if
otherwise
M{η = y} > 0.
Definition 4.29 (Liu [132]) The conditional uncertainty distribution Φ: < → [0, 1] of an uncertain variable ξ given B is defined by Φ(x|B) = M {ξ ≤ x|B} provided that
M{B} > 0.
(4.61)
205
Section 4.16 - Uncertain Process
Example 4.13: Let ξ and η be uncertain variables. Then the conditional uncertainty distribution of ξ given η = y is
Φ(x|η = y) =
provided that
M{ξ ≤ x, η = y} < 0.5 M{η = y} M{ξ > x, η = y} < 0.5 if M{η = y}
M{ξ ≤ x, η = y} , M{η = y} M{ξ > x, η = y} , 1− M{η = y}
if
0.5,
otherwise
M{η = y} > 0.
Definition 4.30 (Liu [132]) The conditional uncertainty density function φ of an uncertain variable ξ given B is a nonnegative function such that Z x Φ(x|B) = φ(y|B)dy, ∀x ∈ <, (4.62) −∞
Z
+∞
φ(y|B)dy = 1
(4.63)
−∞
where Φ(x|B) is the conditional uncertainty distribution of ξ given B. Definition 4.31 (Liu [132]) Let ξ be an uncertain variable. Then the conditional expected value of ξ given B is defined by Z E[ξ|B] = 0
+∞
M{ξ ≥ r|B}dr −
Z
0
−∞
M{ξ ≤ r|B}dr
(4.64)
provided that at least one of the two integrals is finite. Following conditional uncertain measure and conditional expected value, we also have conditional variance, conditional moments, conditional critical values, conditional entropy as well as conditional convergence.
4.16
Uncertain Process
Definition 4.32 (Liu [133]) Let T be an index set and let (Γ, L, M) be an uncertainty space. An uncertain process is a measurable function from T × (Γ, L, M) to the set of real numbers, i.e., for each t ∈ T and any Borel set B of real numbers, the set {γ ∈ Γ X(t, γ) ∈ B} (4.65) is an event.
206
Chapter 4 - Uncertainty Theory
That is, an uncertain process Xt (γ) is a function of two variables such that the function Xt∗ (γ) is an uncertain variable for each t∗ . For each fixed γ ∗ , the function Xt (γ ∗ ) is called a sample path of the uncertain process. An uncertain process Xt (γ) is said to be sample-continuous if the sample path is continuous for almost all γ. Definition 4.33 (Liu [133]) An uncertain process Xt is said to have independent increments if Xt1 − Xt0 , Xt2 − Xt1 , · · · , Xtk − Xtk−1
(4.66)
are independent uncertain variables for any times t0 < t1 < · · · < tk . An uncertain process Xt is said to have stationary increments if, for any given t > 0, the increments Xs+t −Xs are identically distributed uncertain variables for all s > 0. Uncertain Renewal Process Definition 4.34 (Liu [133]) Let ξ1 , ξ2 , · · · be iid positive uncertain variables. Define S0 = 0 and Sn = ξ1 + ξ2 + · · · + ξn for n ≥ 1. Then the uncertain process Nt = max n Sn ≤ t (4.67) n≥0
is called an uncertain renewal process. If ξ1 , ξ2 , · · · denote the interarrival times of successive events. Then Sn can be regarded as the waiting time until the occurrence of the nth event, and Nt is the number of renewals in (0, t]. Each sample path of Nt is a right-continuous and increasing step function taking only nonnegative integer values. Furthermore, the size of each jump of Nt is always 1. In other words, Nt has at most one renewal at each time. In particular, Nt does not jump at time 0. Since Nt ≥ n is equivalent to Sn ≤ t, we immediately have
M{Nt ≥ n} = M{Sn ≤ t}.
(4.68)
Theorem 4.39 (Liu [133]) Let Nt be an uncertain renewal process. Then we have
E[Nt] = Σ_{n=1}^∞ M{Sn ≤ t}.    (4.69)

Proof: Since Nt takes only nonnegative integer values, we have
E[Nt] = ∫_0^∞ M{Nt ≥ r}dr = Σ_{n=1}^∞ ∫_{n−1}^n M{Nt ≥ r}dr = Σ_{n=1}^∞ M{Nt ≥ n} = Σ_{n=1}^∞ M{Sn ≤ t}.
The theorem is proved.
Canonical Process

Definition 4.35 (Liu [133]) An uncertain process Wt is said to be a canonical process if
(i) W0 = 0 and Wt is sample-continuous,
(ii) Wt has stationary and independent increments,
(iii) W1 is an uncertain variable with expected value 0 and variance 1.

Theorem 4.40 (Existence Theorem) There is a canonical process.

Proof: In fact, standard Brownian motion and the standard C process are instances of canonical processes.

Example 4.14: Let Bt be a standard Brownian motion, and Ct a standard C process. Then for each number a ∈ [0, 1], we may verify that
Wt = aBt + (1 − a)Ct    (4.70)
is a canonical process.

Theorem 4.41 Let Wt be a canonical process. Then E[Wt] = 0 for any t.

Proof: Let f(t) = E[Wt]. Then for any times t1 and t2, by using the property of stationary and independent increments, we obtain
f(t1 + t2) = E[Wt1+t2] = E[Wt1+t2 − Wt2 + Wt2 − W0] = E[Wt1] + E[Wt2] = f(t1) + f(t2)
which implies that there is a constant e such that f(t) = et. Since f(1) = E[W1] = 0, we get e = 0 and hence f(t) = 0 for any t. The theorem is proved.

Theorem 4.42 Let Wt be a canonical process. If for any t1 and t2, we have V[Wt1+t2] = V[Wt1] + V[Wt2], then V[Wt] = t for any t.

Proof: Let f(t) = V[Wt]. Then for any times t1 and t2, by using the property of stationary and independent increments as well as the variance condition, we obtain
f(t1 + t2) = V[Wt1+t2] = V[Wt1+t2 − Wt2 + Wt2 − W0] = V[Wt1] + V[Wt2] = f(t1) + f(t2)
which implies that there is a constant σ such that f(t) = σ²t. It follows from f(1) = 1 that σ² = 1 and f(t) = t. Hence the theorem is proved.
Theorem 4.43 Let Wt be a canonical process. If for any t1 and t2, we have √V[Wt1+t2] = √V[Wt1] + √V[Wt2], then V[Wt] = t² for any t.

Proof: Let f(t) = √V[Wt]. Then for any times t1 and t2, by using the property of stationary and independent increments as well as the variance condition, we obtain
f(t1 + t2) = √V[Wt1+t2] = √V[Wt1+t2 − Wt2 + Wt2 − W0] = √V[Wt1] + √V[Wt2] = f(t1) + f(t2)
which implies that there is a constant σ such that f(t) = σt. It follows from f(1) = 1 that σ = 1 and f(t) = t, i.e., V[Wt] = t². The theorem is verified.

Definition 4.36 For any partition of the closed interval [0, t] with 0 = t1 < t2 < · · · < tk+1 = t, the mesh is written as
∆ = max_{1≤i≤k} |ti+1 − ti|.
Let α > 0 be a real number. Then the α-variation of the uncertain process Wt is
lim_{∆→0} Σ_{i=1}^k |Wti+1 − Wti|^α    (4.71)
provided that the limit exists in mean square and is an uncertain process. Especially, the α-variation is called total variation if α = 1, and squared variation if α = 2.

Definition 4.37 (Liu [133]) Let Wt be a canonical process. Then et + σWt is called a derived canonical process, and the uncertain process
Gt = exp(et + σWt)    (4.72)
is called a geometric canonical process.

An Uncertain Stock Model

Assume that the stock price follows a geometric canonical process. Then we have an uncertain stock model in which the bond price Xt and the stock price Yt are determined by
Xt = X0 exp(rt),  Yt = Y0 exp(et + σWt)    (4.73)
where r is the riskless interest rate, e is the stock drift, σ is the stock diffusion, and Wt is a canonical process.
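As a quick illustration of model (4.73), the sketch below simulates one bond path and one stock path, taking a standard Brownian motion for Wt (one admissible canonical process by Theorem 4.40); the values of X0, Y0, r, e and σ are arbitrary illustrative choices, not values from the text.

```python
import math
import random

random.seed(1)
X0, Y0 = 100.0, 30.0        # initial bond and stock prices (illustrative)
r, e, sigma = 0.05, 0.02, 0.3
steps, T = 1000, 1.0
dt = T / steps

W = 0.0
bond, stock = [X0], [Y0]
for i in range(1, steps + 1):
    t = i * dt
    W += random.gauss(0.0, math.sqrt(dt))          # Brownian increment of W_t
    bond.append(X0 * math.exp(r * t))              # X_t = X0 exp(rt)
    stock.append(Y0 * math.exp(e * t + sigma * W)) # Y_t = Y0 exp(et + sigma W_t)

print(bond[-1], stock[-1])
```

Note that the bond path is deterministic while the stock path varies from run to run through Wt.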
4.17 Uncertain Calculus
Let Wt be a canonical process, and dt an infinitesimal time interval. Then
dWt = Wt+dt − Wt    (4.74)
is an uncertain process with E[dWt] = 0 and dt² ≤ E[dWt²] ≤ dt.

Definition 4.38 (Liu [133]) Let Xt be an uncertain process and let Wt be a canonical process. For any partition of the closed interval [a, b] with a = t1 < t2 < · · · < tk+1 = b, the mesh is written as
∆ = max_{1≤i≤k} |ti+1 − ti|.
Then the uncertain integral of Xt with respect to Wt is
∫_a^b Xt dWt = lim_{∆→0} Σ_{i=1}^k Xti · (Wti+1 − Wti)    (4.75)
provided that the limit exists in mean square and is an uncertain variable.

Example 4.15: Let Wt be a canonical process. Then for any partition 0 = t1 < t2 < · · · < tk+1 = s, we have
∫_0^s dWt = lim_{∆→0} Σ_{i=1}^k (Wti+1 − Wti) ≡ Ws − W0 = Ws.
Example 4.16: Let Wt be a canonical process. Then for any partition 0 = t1 < t2 < · · · < tk+1 = s, we have
sWs = Σ_{i=1}^k (ti+1 Wti+1 − ti Wti)
    = Σ_{i=1}^k ti (Wti+1 − Wti) + Σ_{i=1}^k Wti+1 (ti+1 − ti)
    → ∫_0^s t dWt + ∫_0^s Wt dt
as ∆ → 0. It follows that
∫_0^s t dWt = sWs − ∫_0^s Wt dt.
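The summation-by-parts step in Example 4.16 is a purely algebraic identity, so it holds exactly on every sample path before any limit is taken. A sketch using a simulated Brownian path (one instance of a canonical process):

```python
import math
import random

random.seed(2)
k, s = 500, 1.0
t = [i * s / k for i in range(k + 1)]   # partition 0 = t_1 < ... < t_{k+1} = s
W = [0.0]
for i in range(k):
    W.append(W[-1] + random.gauss(0.0, math.sqrt(t[i + 1] - t[i])))

# sum_i t_i (W_{t_{i+1}} - W_{t_i})  ->  integral of t dW_t
sum1 = sum(t[i] * (W[i + 1] - W[i]) for i in range(k))
# sum_i W_{t_{i+1}} (t_{i+1} - t_i)  ->  integral of W_t dt
sum2 = sum(W[i + 1] * (t[i + 1] - t[i]) for i in range(k))

# Telescoping: the two sums together equal s * W_s exactly.
print(sum1 + sum2, s * W[-1])
```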
Theorem 4.44 (Liu [133]) Let Wt be a canonical process, and let h(t, w) be a twice continuously differentiable function. Define Xt = h(t, Wt). Then we have the following chain rule
dXt = (∂h/∂t)(t, Wt)dt + (∂h/∂w)(t, Wt)dWt + (1/2)(∂²h/∂w²)(t, Wt)dWt².    (4.76)

Proof: Since the function h is twice continuously differentiable, by using a Taylor series expansion, the infinitesimal increment of Xt has the second-order approximation
∆Xt = (∂h/∂t)(t, Wt)∆t + (∂h/∂w)(t, Wt)∆Wt + (1/2)(∂²h/∂w²)(t, Wt)(∆Wt)² + (1/2)(∂²h/∂t²)(t, Wt)(∆t)² + (∂²h/∂t∂w)(t, Wt)∆t∆Wt.
Since we can ignore the terms (∆t)² and ∆t∆Wt, the chain rule is proved because it yields
Xs = X0 + ∫_0^s (∂h/∂t)(t, Wt)dt + ∫_0^s (∂h/∂w)(t, Wt)dWt + (1/2)∫_0^s (∂²h/∂w²)(t, Wt)dWt²
for any s ≥ 0.

Remark 4.10: The infinitesimal increment dWt in (4.76) may be replaced with the derived canonical process
dYt = ut dt + vt dWt    (4.77)
where ut is an absolutely integrable uncertain process and vt is a square integrable uncertain process, thus producing
dh(t, Yt) = (∂h/∂t)(t, Yt)dt + (∂h/∂w)(t, Yt)dYt + (1/2)(∂²h/∂w²)(t, Yt)vt²dWt².    (4.78)
Example 4.17: Applying the chain rule, we obtain the following formula
d(tWt) = Wt dt + t dWt.
Hence we have
sWs = ∫_0^s d(tWt) = ∫_0^s Wt dt + ∫_0^s t dWt.
That is,
∫_0^s t dWt = sWs − ∫_0^s Wt dt.
Theorem 4.45 (Liu [133], Integration by Parts) Suppose that Wt is a canonical process and F(t) is an absolutely continuous function. Then
∫_0^s F(t)dWt = F(s)Ws − ∫_0^s Wt dF(t).    (4.79)

Proof: By defining h(t, Wt) = F(t)Wt and using the chain rule, we get
d(F(t)Wt) = Wt dF(t) + F(t)dWt.
Thus
F(s)Ws = ∫_0^s d(F(t)Wt) = ∫_0^s Wt dF(t) + ∫_0^s F(t)dWt
which is just (4.79).
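Formula (4.79) can be checked on a partition of a simulated path. The sketch below takes F(t) = t² (an arbitrary absolutely continuous choice) and a Brownian path for Wt, and compares the left-endpoint sum for ∫F(t)dWt with F(s)Ws minus the Stieltjes sum for ∫Wt dF(t); for these discrete sums the identity is exact by telescoping.

```python
import math
import random

random.seed(3)
k, s = 400, 1.0
F = lambda t: t * t                     # an absolutely continuous F (illustrative)
t = [i * s / k for i in range(k + 1)]
W = [0.0]
for i in range(k):
    W.append(W[-1] + random.gauss(0.0, math.sqrt(s / k)))

lhs = sum(F(t[i]) * (W[i + 1] - W[i]) for i in range(k))              # ~ integral F dW
stieltjes = sum(W[i + 1] * (F(t[i + 1]) - F(t[i])) for i in range(k)) # ~ integral W dF
print(lhs, F(s) * W[-1] - stieltjes)    # the two values agree
```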
4.18 Uncertain Differential Equation
Definition 4.39 (Liu [133]) Suppose Wt is a canonical process, and f and g are some given functions. Then
dXt = f(t, Xt)dt + g(t, Xt)dWt    (4.80)
is called an uncertain differential equation. A solution is an uncertain process Xt that satisfies (4.80) identically in t.

Remark 4.11: Note that there is no precise definition for the terms dXt, dt and dWt in the uncertain differential equation (4.80). The mathematically meaningful form is the uncertain integral equation
Xs = X0 + ∫_0^s f(t, Xt)dt + ∫_0^s g(t, Xt)dWt.    (4.81)
However, the differential form is more convenient for us. This is the main reason why we accept the differential form.

Example 4.18: Let Wt be a canonical process. Then the uncertain differential equation
dXt = a dt + b dWt
has a solution Xt = at + bWt.
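Example 4.18 can be checked against the integral form (4.81): with f ≡ a and g ≡ b, an Euler-type discretization reproduces Xt = at + bWt exactly. A sketch with a simulated Brownian path for Wt and illustrative constants a, b:

```python
import math
import random

random.seed(4)
a, b = 0.7, 1.3          # illustrative constants
k, s = 300, 2.0
dt = s / k

W, X = 0.0, 0.0
for _ in range(k):
    dW = random.gauss(0.0, math.sqrt(dt))
    X += a * dt + b * dW  # Euler step for dX = a dt + b dW
    W += dW

print(X, a * s + b * W)   # the accumulated X equals a*s + b*W_s
```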
Appendix A
Measurable Sets

Algebra and σ-algebra are very important and fundamental concepts in measure theory.

Definition A.1 Let Ω be a nonempty set. A collection A is called an algebra over Ω if the following conditions hold: (a) Ω ∈ A; (b) if A ∈ A, then Aᶜ ∈ A; (c) if Ai ∈ A for i = 1, 2, · · · , n, then ∪_{i=1}^n Ai ∈ A. If condition (c) is replaced with closure under countable union, then A is called a σ-algebra over Ω.

Example A.1: Assume that Ω is a nonempty set. Then {∅, Ω} is the smallest σ-algebra over Ω, and the power set P (all subsets of Ω) is the largest σ-algebra over Ω.

Example A.2: Let A be the set of all finite disjoint unions of intervals of the form (−∞, a], (a, b], (b, ∞) and ∅. Then A is an algebra over ℜ, but not a σ-algebra, because Ai = (0, (i − 1)/i] ∈ A for all i but
∪_{i=1}^∞ Ai = (0, 1) ∉ A.
Theorem A.1 A σ-algebra A is closed under difference, countable union, countable intersection, upper limit, lower limit, and limit. That is,
A2 \ A1 ∈ A;  ∪_{i=1}^∞ Ai ∈ A;  ∩_{i=1}^∞ Ai ∈ A;    (A.1)
lim sup_{i→∞} Ai = ∩_{k=1}^∞ ∪_{i=k}^∞ Ai ∈ A;    (A.2)
lim inf_{i→∞} Ai = ∪_{k=1}^∞ ∩_{i=k}^∞ Ai ∈ A;    (A.3)
lim_{i→∞} Ai ∈ A.    (A.4)
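Over a finite Ω the conditions of Definition A.1 can be checked by brute force. The helper `is_algebra` below is a hypothetical illustration (for a finite Ω, closure under finite unions is all that condition (c) requires):

```python
from itertools import chain, combinations

def powerset(omega):
    """All subsets of a finite set, as frozensets."""
    s = list(omega)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

def is_algebra(omega, coll):
    """Check conditions (a)-(c) of Definition A.1 over a finite Omega."""
    omega = frozenset(omega)
    coll = {frozenset(a) for a in coll}
    if omega not in coll:                 # (a) Omega belongs to the collection
        return False
    for a in coll:
        if omega - a not in coll:         # (b) closed under complement
            return False
    for a in coll:
        for b in coll:
            if a | b not in coll:         # (c) closed under (finite) union
                return False
    return True

omega = {1, 2, 3}
print(is_algebra(omega, [frozenset(), frozenset(omega)]))   # True: smallest sigma-algebra
print(is_algebra(omega, powerset(omega)))                   # True: largest sigma-algebra
print(is_algebra(omega, [frozenset(), frozenset(omega), frozenset({1})]))  # False: {2,3} missing
```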
Theorem A.2 The intersection of any collection of σ-algebras is a σ-algebra. Furthermore, for any nonempty class C, there is a unique minimal σ-algebra containing C. Theorem A.3 (Monotone Class Theorem) Assume that A0 is an algebra over Ω, and C is a monotone class of subsets of Ω (if Ai ∈ C and Ai ↑ A or Ai ↓ A, then A ∈ C). If A0 ⊂ C and σ(A0 ) is the smallest σ-algebra containing A0 , then σ(A0 ) ⊂ C.
Definition A.2 Let Ω be a nonempty set, and A a σ-algebra over Ω. Then (Ω, A) is called a measurable space, and any element of A is called a measurable set.

Theorem A.4 The smallest σ-algebra B containing all open intervals is called the Borel algebra of ℜ, and any element of B is called a Borel set.
The Borel algebra B is a special σ-algebra over ℜ. It follows from Theorem A.2 that the Borel algebra is unique. It has been proved that open sets, closed sets, the set of rational numbers, the set of irrational numbers, and countable sets of real numbers are all Borel sets.

Example A.3: We divide the interval [0, 1] into three equal open intervals and choose the middle one, i.e., (1/3, 2/3). Then we divide each of the remaining two intervals into three equal open intervals, and choose the middle one in each case, i.e., (1/9, 2/9) and (7/9, 8/9). We repeat this process and obtain Dij for j = 1, 2, · · · , 2^(i−1) and i = 1, 2, · · · Note that {Dij} is a sequence of mutually disjoint open intervals. Without loss of generality, suppose Di1 < Di2 < · · · < Di,2^(i−1) for i = 1, 2, · · · Define the set
D = ∪_{i=1}^∞ ∪_{j=1}^{2^(i−1)} Dij.    (A.5)
Then C = [0, 1] \ D is called the Cantor set. In other words, x ∈ C if and only if x can be expressed in ternary form using only digits 0 and 2, i.e.,
x = Σ_{i=1}^∞ ai/3^i    (A.6)
where ai = 0 or 2 for i = 1, 2, · · · The Cantor set is closed, uncountable, and a Borel set.

Let Ω1, Ω2, · · · , Ωn be any sets (not necessarily subsets of the same space). The product Ω = Ω1 × Ω2 × · · · × Ωn is the set of all ordered n-tuples of the form (x1, x2, · · · , xn), where xi ∈ Ωi for i = 1, 2, · · · , n.
Definition A.3 Let Ai be σ-algebras over Ωi, i = 1, 2, · · · , n, respectively. Write Ω = Ω1 × Ω2 × · · · × Ωn. A measurable rectangle in Ω is a set A = A1 × A2 × · · · × An, where Ai ∈ Ai for i = 1, 2, · · · , n. The smallest σ-algebra containing all measurable rectangles of Ω is called the product σ-algebra, denoted by A = A1 × A2 × · · · × An.

Note that the product σ-algebra A is the smallest σ-algebra containing measurable rectangles, rather than the product of A1, A2, · · · , An. The product σ-algebra may be easily extended to the countably infinite case by defining a measurable rectangle as a set of the form A = A1 × A2 × · · · where Ai ∈ Ai for all i and Ai = Ωi for all but finitely many i. The smallest σ-algebra containing all measurable rectangles of Ω = Ω1 × Ω2 × · · · is called the product σ-algebra, denoted by
A = A1 × A2 × · · ·
Appendix B
Classical Measures

This appendix introduces the concepts of classical measure, measure space, Lebesgue measure, and product measure.

Definition B.1 Let Ω be a nonempty set, and A a σ-algebra over Ω. A classical measure π is a set function on A satisfying
Axiom 1. (Nonnegativity) π{A} ≥ 0 for any A ∈ A;
Axiom 2. (Countable Additivity) For every countable sequence of mutually disjoint measurable sets {Ai}, we have
π{∪_{i=1}^∞ Ai} = Σ_{i=1}^∞ π{Ai}.    (B.1)
Example B.1: Length, area, and volume are instances of the measure concept.

Definition B.2 Let Ω be a nonempty set, A a σ-algebra over Ω, and π a measure on A. Then the triplet (Ω, A, π) is called a measure space.

It has been proved that there is a unique measure π on the Borel algebra of ℜ such that π{(a, b]} = b − a for any interval (a, b] of ℜ.

Definition B.3 The measure π on the Borel algebra of ℜ such that
π{(a, b]} = b − a,  ∀(a, b]
is called the Lebesgue measure.

Theorem B.1 (Product Measure Theorem) Let (Ωi, Ai, πi), i = 1, 2, · · · , n be measure spaces such that πi{Ωi} < ∞ for i = 1, 2, · · · , n. Write
Ω = Ω1 × Ω2 × · · · × Ωn,  A = A1 × A2 × · · · × An.
Then there is a unique measure π on A such that
π{A1 × A2 × · · · × An} = π1{A1} × π2{A2} × · · · × πn{An}    (B.2)
for every measurable rectangle A1 × A2 × · · · × An. The measure π is called the product of π1, π2, · · · , πn, denoted by π = π1 × π2 × · · · × πn. The triplet (Ω, A, π) is called the product measure space.

Theorem B.2 (Infinite Product Measure Theorem) Assume that (Ωi, Ai, πi) are measure spaces such that πi{Ωi} = 1 for i = 1, 2, · · · Write
Ω = Ω1 × Ω2 × · · · ,  A = A1 × A2 × · · ·
Then there is a unique measure π on A such that
π{A1 × · · · × An × Ωn+1 × Ωn+2 × · · · } = π1{A1} × π2{A2} × · · · × πn{An}    (B.3)
for any measurable rectangle A1 × · · · × An × Ωn+1 × Ωn+2 × · · · and all n = 1, 2, · · · The measure π is called the infinite product, denoted by π = π1 × π2 × · · · The triplet (Ω, A, π) is called the infinite product measure space.
Appendix C
Measurable Functions

This appendix introduces the concepts of measurable function, simple function, step function, absolutely continuous function, singular function, and Cantor function.

Definition C.1 A function f from (Ω, A) to the set of real numbers is said to be measurable if
f⁻¹(B) = {ω ∈ Ω | f(ω) ∈ B} ∈ A    (C.1)
for any Borel set B of real numbers. If Ω is a Borel set, then A is always assumed to be the Borel algebra over Ω.

Theorem C.1 The function f is measurable from (Ω, A) to ℜ if and only if f⁻¹(I) ∈ A for any open interval I of ℜ.

Definition C.2 A function f: ℜ → ℜ is said to be continuous if for any given x ∈ ℜ and ε > 0, there exists a δ > 0 such that |f(y) − f(x)| < ε whenever |y − x| < δ.

Example C.1: Any continuous function f is measurable, because f⁻¹(I) is an open set (not necessarily an interval) of ℜ for any open interval I of ℜ.

Example C.2: A monotone function f from ℜ to ℜ is measurable because f⁻¹(I) is an interval for any interval I.

Example C.3: A function is said to be simple if it takes a finite set of values. A function is said to be a step function if it takes a countably infinite set of values. Generally speaking, a step (or simple) function is not necessarily measurable, except when it can be written as f(x) = ai if x ∈ Ai, where the Ai are measurable sets, i = 1, 2, · · · , respectively.
Example C.4: Let f be a measurable function from (Ω, A) to ℜ. Then its positive part and negative part
f⁺(ω) = f(ω) if f(ω) ≥ 0, and 0 otherwise;  f⁻(ω) = −f(ω) if f(ω) ≤ 0, and 0 otherwise
are measurable functions, because
{ω | f⁺(ω) > t} = {ω | f(ω) > t} ∪ {ω | f(ω) ≤ 0} if t < 0,
{ω | f⁻(ω) > t} = {ω | f(ω) < −t} ∪ {ω | f(ω) ≥ 0} if t < 0.

Example C.5: Let f1 and f2 be measurable functions from (Ω, A) to ℜ. Then f1 ∨ f2 and f1 ∧ f2 are measurable functions, because
{ω | f1(ω) ∨ f2(ω) > t} = {ω | f1(ω) > t} ∪ {ω | f2(ω) > t},
{ω | f1(ω) ∧ f2(ω) > t} = {ω | f1(ω) > t} ∩ {ω | f2(ω) > t}.

Theorem C.2 Let {fi} be a sequence of measurable functions from (Ω, A) to ℜ. Then the following functions are measurable:
sup_{1≤i<∞} fi(ω);  inf_{1≤i<∞} fi(ω);  lim sup_{i→∞} fi(ω);  lim inf_{i→∞} fi(ω).    (C.2)
Especially, if lim_{i→∞} fi(ω) exists, then it is also a measurable function.

Theorem C.3 Let f be a nonnegative measurable function from (Ω, A) to ℜ. Then there exists an increasing sequence {hi} of nonnegative simple measurable functions such that
lim_{i→∞} hi(ω) = f(ω),  ∀ω ∈ Ω.    (C.3)
Furthermore, the functions hi may be defined as follows,
hi(ω) = (k − 1)/2^i  if (k − 1)/2^i ≤ f(ω) < k/2^i, k = 1, 2, · · · , i·2^i;  hi(ω) = i  if i ≤ f(ω)    (C.4)
for i = 1, 2, · · ·

Definition C.3 A function f: ℜ → ℜ is said to be Lipschitz continuous if there is a positive number K such that
|f(y) − f(x)| ≤ K|y − x|,  ∀x, y ∈ ℜ.    (C.5)
Definition C.4 A function f: ℜ → ℜ is said to be absolutely continuous if for any given ε > 0, there exists a small number δ > 0 such that
Σ_{i=1}^m |f(yi) − f(xi)| < ε    (C.6)
for every finite disjoint class {(xi, yi), i = 1, 2, · · · , m} of bounded open intervals for which
Σ_{i=1}^m |yi − xi| < δ.    (C.7)

Definition C.5 A continuous and increasing function f: ℜ → ℜ is said to be singular if f is not a constant and its derivative f′ = 0 almost everywhere.

Example C.6: Let C be the Cantor set. We define a function g on C as follows,
g(Σ_{i=1}^∞ ai/3^i) = Σ_{i=1}^∞ ai/2^{i+1}    (C.8)
where ai = 0 or 2 for i = 1, 2, · · · Then g(x) is an increasing function and g(C) = [0, 1]. The Cantor function is defined on [0, 1] as follows,
f(x) = sup{g(y) | y ∈ C, y ≤ x}.    (C.9)
It is clear that the Cantor function f(x) is increasing and satisfies
f(0) = 0,  f(1) = 1,  f(x) = g(x),  ∀x ∈ C.
Moreover, f(x) is a continuous function and f′(x) = 0 almost everywhere. Thus the Cantor function f is a singular function.
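Formulas (C.8)-(C.9) suggest a direct algorithm for the Cantor function: read the ternary digits of x up to the first digit 1, map each digit 2 to the binary digit 1, and interpret the result in binary. A sketch of this standard construction (stated here as an assumption consistent with (C.8)-(C.9)):

```python
def cantor(x, depth=40):
    """Cantor function on [0, 1] via the ternary-to-binary digit mapping of (C.8)-(C.9)."""
    if x >= 1.0:
        return 1.0
    result, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)
        x -= digit
        if digit == 1:
            result += scale            # f is constant on the removed middle interval
            break
        result += scale * (digit // 2) # ternary digit 0 -> bit 0, digit 2 -> bit 1
        scale /= 2
    return result

print(cantor(0.25))  # ternary 0.0202... maps to binary 0.0101... = 1/3
```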
Appendix D
Lebesgue Integral

This appendix introduces the Lebesgue integral, Lebesgue-Stieltjes integral, monotone convergence theorem, Lebesgue dominated convergence theorem, and Fubini theorem.

Definition D.1 Let h(x) be a nonnegative simple measurable function defined by
h(x) = ci if x ∈ Ai, i = 1, 2, · · · , m
where A1, A2, · · · , Am are Borel sets. Then the Lebesgue integral of h on a Borel set A is
∫_A h(x)dx = Σ_{i=1}^m ci π{A ∩ Ai}.    (D.1)

Definition D.2 Let f(x) be a nonnegative measurable function on the Borel set A, and {hi(x)} a sequence of nonnegative simple measurable functions such that hi(x) ↑ f(x) as i → ∞. Then the Lebesgue integral of f on A is
∫_A f(x)dx = lim_{i→∞} ∫_A hi(x)dx.    (D.2)

Definition D.3 Let f(x) be a measurable function on the Borel set A, and define
f⁺(x) = f(x) if f(x) > 0, and 0 otherwise;  f⁻(x) = −f(x) if f(x) < 0, and 0 otherwise.
Then the Lebesgue integral of f on A is
∫_A f(x)dx = ∫_A f⁺(x)dx − ∫_A f⁻(x)dx    (D.3)
provided that at least one of ∫_A f⁺(x)dx and ∫_A f⁻(x)dx is finite.
Definition D.4 Let f(x) be a measurable function on the Borel set A. If both ∫_A f⁺(x)dx and ∫_A f⁻(x)dx are finite, then the function f is said to be integrable on A.

Theorem D.1 (Monotone Convergence Theorem) Let {fi} be an increasing sequence of measurable functions on A. If there is an integrable function g such that fi(x) ≥ g(x) for all i, then we have
lim_{i→∞} ∫_A fi(x)dx = ∫_A lim_{i→∞} fi(x)dx.    (D.4)

Example D.1: The condition fi ≥ g cannot be removed in the monotone convergence theorem. For example, let fi(x) = 0 if x ≤ i and −1 otherwise. Then fi(x) ↑ 0 everywhere on ℜ. However,
∫_ℜ lim_{i→∞} fi(x)dx = 0 ≠ −∞ = lim_{i→∞} ∫_ℜ fi(x)dx.
Theorem D.2 (Lebesgue Dominated Convergence Theorem) Let {fi} be a sequence of measurable functions on A whose limit lim_{i→∞} fi(x) exists a.e. If there is an integrable function g such that |fi(x)| ≤ g(x) for any i, then we have
lim_{i→∞} ∫_A fi(x)dx = ∫_A lim_{i→∞} fi(x)dx.    (D.5)

Example D.2: The condition |fi| ≤ g in the Lebesgue dominated convergence theorem cannot be removed. Let A = (0, 1), fi(x) = i if x ∈ (0, 1/i) and 0 otherwise. Then fi(x) → 0 everywhere on A. However,
∫_A lim_{i→∞} fi(x)dx = 0 ≠ 1 = lim_{i→∞} ∫_A fi(x)dx.
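Example D.2 can be made concrete numerically: every integral ∫fi equals 1, yet at each fixed x the values fi(x) vanish for all large i, so the pointwise limit is 0. A small sketch:

```python
def f(i, x):
    """f_i from Example D.2 on A = (0, 1)."""
    return i if 0 < x < 1.0 / i else 0

# The Lebesgue integral of f_i is i * length((0, 1/i)) = 1 for every i ...
integrals = [i * (1.0 / i) for i in range(1, 100)]

# ... but at any fixed x the values f_i(x) are eventually 0, so the
# pointwise limit function is 0 and its integral is 0, not 1.
x = 0.25
tail = [f(i, x) for i in range(5, 15)]
print(integrals[0], tail)   # 1.0 and a tail of zeros
```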
Theorem D.3 (Fubini Theorem) Let f(x, y) be an integrable function on ℜ². Then we have
(a) f(x, y) is an integrable function of x for almost all y;
(b) f(x, y) is an integrable function of y for almost all x;
(c) ∫∫_{ℜ²} f(x, y)dxdy = ∫_ℜ [∫_ℜ f(x, y)dy] dx = ∫_ℜ [∫_ℜ f(x, y)dx] dy.
Theorem D.4 Let Φ be an increasing and right-continuous function on ℜ. Then there exists a unique measure π on the Borel algebra of ℜ such that
π{(a, b]} = Φ(b) − Φ(a)    (D.6)
for all a and b with a < b. Such a measure is called the Lebesgue-Stieltjes measure corresponding to Φ.
Definition D.5 Let Φ(x) be an increasing and right-continuous function on ℜ, and let h(x) be a nonnegative simple measurable function, i.e.,
h(x) = ci if x ∈ Ai, i = 1, 2, · · · , m.
Then the Lebesgue-Stieltjes integral of h on the Borel set A is
∫_A h(x)dΦ(x) = Σ_{i=1}^m ci π{A ∩ Ai}    (D.7)
where π is the Lebesgue-Stieltjes measure corresponding to Φ.

Definition D.6 Let f(x) be a nonnegative measurable function on the Borel set A, and let {hi(x)} be a sequence of nonnegative simple measurable functions such that hi(x) ↑ f(x) as i → ∞. Then the Lebesgue-Stieltjes integral of f on A is
∫_A f(x)dΦ(x) = lim_{i→∞} ∫_A hi(x)dΦ(x).    (D.8)

Definition D.7 Let f(x) be a measurable function on the Borel set A, and define
f⁺(x) = f(x) if f(x) > 0, and 0 otherwise;  f⁻(x) = −f(x) if f(x) < 0, and 0 otherwise.
Then the Lebesgue-Stieltjes integral of f on A is
∫_A f(x)dΦ(x) = ∫_A f⁺(x)dΦ(x) − ∫_A f⁻(x)dΦ(x)    (D.9)
provided that at least one of ∫_A f⁺(x)dΦ(x) and ∫_A f⁻(x)dΦ(x) is finite.
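Definition D.5 is directly computable when Φ and the sets Ai are simple. The sketch below takes Φ(x) = x² (an illustrative choice), uses π{(a, b]} = Φ(b) − Φ(a) from Theorem D.4, and integrates a two-valued simple function over A = (0, 2]:

```python
def phi(x):
    return x * x     # increasing and right-continuous on [0, inf)

def ls_measure(a, b):
    """Lebesgue-Stieltjes measure of (a, b] corresponding to phi (Theorem D.4)."""
    return phi(b) - phi(a)

# Simple function: h = 3 on A1 = (0, 1], h = 5 on A2 = (1, 2]
c = [3.0, 5.0]
A = [(0.0, 1.0), (1.0, 2.0)]

# (D.7): integral of h over A = (0, 2] is sum_i c_i * pi{A ∩ A_i}
integral = sum(ci * ls_measure(a, b) for ci, (a, b) in zip(c, A))
print(integral)   # 3*(1 - 0) + 5*(4 - 1) = 18
```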
Appendix E
Euler-Lagrange Equation

Let
L(φ) = ∫_a^b F(x, φ(x), φ′(x))dx,    (E.1)
where F is a known function with continuous first and second partial derivatives. If L has an extremum (maximum or minimum) at φ(x), then
∂F/∂φ(x) − (d/dx)(∂F/∂φ′(x)) = 0    (E.2)
which is called the Euler-Lagrange equation. If φ′(x) is not involved, then the Euler-Lagrange equation reduces to
∂F/∂φ(x) = 0.    (E.3)
Note that the Euler-Lagrange equation is only a necessary condition for the existence of an extremum. However, if the existence of an extremum is clear and there exists only one solution to the Euler-Lagrange equation, then this solution must be the curve for which the extremum is achieved.
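As a sanity check of (E.2), take F(x, φ, φ′) = (φ′)² on [0, 1] with φ(0) = 0 and φ(1) = 1. The Euler-Lagrange equation gives φ″ = 0, i.e., the straight line φ(x) = x. The sketch below discretizes the functional and confirms that the straight line beats a perturbed curve with the same endpoints:

```python
import math

def functional(phi_vals, n):
    """Discretized L(phi) = integral of (phi')^2 dx on [0, 1]."""
    dx = 1.0 / n
    return sum(((phi_vals[i + 1] - phi_vals[i]) / dx) ** 2 * dx for i in range(n))

n = 200
xs = [i / n for i in range(n + 1)]
straight = [x for x in xs]                                  # Euler-Lagrange solution phi'' = 0
perturbed = [x + 0.1 * math.sin(math.pi * x) for x in xs]   # same endpoints, bent

print(functional(straight, n), functional(perturbed, n))
# the straight line gives the smaller value, as the Euler-Lagrange equation predicts
```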
Appendix F
Maximum Uncertainty Principle

An event has no uncertainty if its measure is 1 (or 0), because we may believe that the event occurs (or does not occur). An event is the most uncertain if its measure is 0.5, because the event and its complement may then be regarded as "equally likely". In practice, if there is no information about the measure of an event, we should assign 0.5 to it. Sometimes only partial information is available, in which case the value of the measure may only be specified within some range. What value should the measure take? For safety purposes, we should assign it the value as close to 0.5 as possible. This is the maximum uncertainty principle proposed by Liu [132].

Maximum Uncertainty Principle: For any event, if there are multiple reasonable values that a measure may take, then the value as close to 0.5 as possible is assigned to the event.

Perhaps the reader would like to ask what values are reasonable. The answer is problem-dependent. At the least, the values should ensure that all axioms about the measure are satisfied, and should be consistent with the given information.

Example F.1: Let Λ be an event. Suppose that, based on some given information, the measure value M{Λ} lies in the interval [a, b]. By using the maximum uncertainty principle, we should assign
M{Λ} = a if 0.5 < a ≤ b;  M{Λ} = 0.5 if a ≤ 0.5 ≤ b;  M{Λ} = b if a ≤ b < 0.5.
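The assignment in Example F.1 is simply the point of [a, b] closest to 0.5. A minimal sketch:

```python
def assign_measure(a, b):
    """Maximum uncertainty principle on an interval [a, b] of admissible values."""
    assert 0.0 <= a <= b <= 1.0
    return min(max(0.5, a), b)   # the point of [a, b] closest to 0.5

print(assign_measure(0.6, 0.9))  # 0.6  (0.5 < a <= b)
print(assign_measure(0.2, 0.7))  # 0.5  (a <= 0.5 <= b)
print(assign_measure(0.1, 0.3))  # 0.3  (a <= b < 0.5)
```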
Appendix G
Uncertainty Relations

Probability theory is a branch of mathematics based on the normality, nonnegativity, and countable additivity axioms. In fact, those three axioms may be replaced with four axioms: normality, monotonicity, self-duality, and countable additivity. Thus all of the probability, credibility, chance, and uncertain measures meet the normality, monotonicity, and self-duality axioms. The essential difference among those measures is how the measure of a union is determined. For any mutually disjoint events {Ai} with sup_i π{Ai} < 0.5, if π satisfies the countable additivity axiom, i.e.,
π{∪_{i=1}^∞ Ai} = Σ_{i=1}^∞ π{Ai},    (G.1)
then π is a probability measure; if π satisfies the maximality axiom, i.e.,
π{∪_{i=1}^∞ Ai} = sup_{1≤i<∞} π{Ai},    (G.2)
then π is a credibility measure; if π satisfies the countable subadditivity axiom, i.e.,
π{∪_{i=1}^∞ Ai} ≤ Σ_{i=1}^∞ π{Ai},    (G.3)
then π is an uncertain measure. Since additivity and maximality are special cases of subadditivity, probability and credibility are special cases of chance measure, and all three of them are in the category of uncertain measure. This fact also implies that random variables and fuzzy variables are special cases of hybrid variables, and all three of them are instances of uncertain variables.
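The union rules (G.1)-(G.3) can be contrasted on a toy family of disjoint events (illustrative numbers below 0.5): both the additive and the maximal union value respect the subadditivity bound, matching the claim that probability and credibility are special cases.

```python
# Measures of mutually disjoint events, all below 0.5 (illustrative numbers).
values = [0.1, 0.2, 0.15]

union_probability = sum(values)   # (G.1): countable additivity
union_credibility = max(values)   # (G.2): maximality
bound = sum(values)               # (G.3): subadditivity upper bound

# Both special cases respect the subadditivity bound of an uncertain measure.
print(union_probability, union_credibility, bound)
```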
[Figure: nested regions labeled Probability Model, Credibility Model, Hybrid Model, and Uncertainty Model.]

Figure G.1: Relations among Uncertainties

Some Questions

Is the real world as extreme as the probability model? Is the real world as extreme as the credibility model? Is the real world as extreme as the hybrid model? In many cases, the answer is negative. Perhaps the uncertainty model is more realistic.
Bibliography [1] Alefeld G, Herzberger J, Introduction to Interval Computations, Academic Press, New York, 1983. [2] Atanassov KT, Intuitionistic Fuzzy Sets: Theory and Applications, PhysicaVerlag, Heidelberg, 1999. [3] Bamber D, Goodman IR, Nguyen HT, Extension of the concept of propositional deduction from classical logic to probability: An overview of probability-selection approaches, Information Sciences, Vol.131, 195-250, 2001. [4] Bedford T, and Cooke MR, Probabilistic Risk Analysis, Cambridge University Press, 2001. [5] Bandemer H, and Nather W, Fuzzy Data Analysis, Kluwer, Dordrecht, 1992. [6] Bellman RE, and Zadeh LA, Decision making in a fuzzy environment, Management Science, Vol.17, 141-164, 1970. [7] Bhandari D, and Pal NR, Some new information measures of fuzzy sets, Information Sciences, Vol.67, 209-228, 1993. [8] Black F, and Scholes M, The pricing of option and corporate liabilities, Journal of Political Economy, Vol.81, 637-654, 1973. [9] Bouchon-Meunier B, Mesiar R, Ralescu DA, Linear non-additive setfunctions, International Journal of General Systems, Vol.33, No.1, 89-98, 2004. [10] Buckley JJ, Possibility and necessity in optimization, Fuzzy Sets and Systems, Vol.25, 1-13, 1988. [11] Buckley JJ, Stochastic versus possibilistic programming, Fuzzy Sets and Systems, Vol.34, 173-177, 1990. [12] Cadenas JM, and Verdegay JL, Using fuzzy numbers in linear programming, IEEE Transactions on Systems, Man and Cybernetics–Part B, Vol.27, No.6, 1016-1022, 1997. [13] Campos L, and Gonz´ alez, A, A subjective approach for ranking fuzzy numbers, Fuzzy Sets and Systems, Vol.29, 145-153, 1989. [14] Campos L, and Verdegay JL, Linear programming problems and ranking of fuzzy numbers, Fuzzy Sets and Systems, Vol.32, 1-11, 1989. [15] Campos FA, Villar J, and Jimenez M, Robust solutions using fuzzy chance constraints, Engineering Optimization, Vol.38, No.6, 627-645, 2006.
[16] Carlsson C, Fuller R, and Majlender P, A possibilistic approach to selecting portfolios with highest utility score, Fuzzy Sets and Systems, Vol.131, No.1, 13-21, 2002. [17] Chanas S, and Kuchta D, Multiobjective programming in optimization of interval objective functions—a generalized approach, European Journal of Operational Research, Vol.94, 594-598, 1996. [18] Chen A, and Ji ZW, Path finding under uncertainty, Journal of Advance Transportation, Vol.39, No.1, 19-37, 2005. [19] Chen SJ, and Hwang CL, Fuzzy Multiple Attribute Decision Making: Methods and Applications, Springer-Verlag, Berlin, 1992. [20] Chen Y, Fung RYK, Yang J, Fuzzy expected value modelling approach for determining target values of engineering characteristics in QFD, International Journal of Production Research, Vol.43, No.17, 3583-3604, 2005. [21] Chen Y, Fung RYK, Tang JF, Rating technical attributes in fuzzy QFD by integrating fuzzy weighted average method and fuzzy expected value operator, European Journal of Operational Research, Vol.174, No.3, 1553-1566, 2006. [22] Choquet G, Theory of capacities, Annals de l’Institute Fourier, Vol.5, 131295, 1954. [23] Dai W, Reflection principle of Liu process, http://orsc.edu.cn/process/ 071110.pdf. [24] Das B, Maity K, Maiti A, A two warehouse supply-chain model under possibility/necessity/credibility measures, Mathematical and Computer Modelling, Vol.46, No.3-4, 398-409, 2007. [25] De Cooman G, Possibility theory I-III, International Journal of General Systems, Vol.25, 291-371, 1997. [26] De Luca A, and Termini S, A definition of nonprobabilistic entropy in the setting of fuzzy sets theory, Information and Control, Vol.20, 301-312, 1972. [27] Dempster AP, Upper and lower probabilities induced by a multivalued mapping, Ann. Math. Stat., Vol.38, No.2, 325-339, 1967. [28] Dubois D, and Prade H, Operations on fuzzy numbers, International Journal of System Sciences, Vol.9, 613-626, 1978. 
[29] Dubois D, and Prade H, Fuzzy Sets and Systems, Theory and Applications, Academic Press, New York, 1980. [30] Dubois D, and Prade H, Twofold fuzzy sets: An approach to the representation of sets with fuzzy boundaries based on possibility and necessity measures, The Journal of Fuzzy Mathematics, Vol.3, No.4, 53-76, 1983. [31] Dubois D, and Prade H, Fuzzy logics and generalized modus ponens revisited, Cybernetics and Systems, Vol.15, 293-331, 1984. [32] Dubois D, and Prade H, Fuzzy cardinality and the modeling of imprecise quantification, Fuzzy Sets and Systems, Vol.16, 199-230, 1985. [33] Dubois D, and Prade H, A note on measures of specificity for fuzzy sets, International Journal of General Systems, Vol.10, 279-283, 1985.
[34] Dubois D, and Prade H, The mean value of a fuzzy number, Fuzzy Sets and Systems, Vol.24, 279-300, 1987. [35] Dubois D, and Prade H, Twofold fuzzy sets and rough sets — some issues in knowledge representation, Fuzzy Sets and Systems, Vol.23, 3-18, 1987. [36] Dubois D, and Prade H, Possibility Theory: An Approach to Computerized Processing of Uncertainty, Plenum, New York, 1988. [37] Dubois D, and Prade H, Rough fuzzy sets and fuzzy rough sets, International Journal of General Systems, Vol.17, 191-200, 1990. [38] Dunyak J, Saad IW, and Wunsch D, A theory of independent fuzzy probability for system reliability, IEEE Transactions on Fuzzy Systems, Vol.7, No.3, 286-294, 1999. [39] Esogbue AO, and Liu B, Reservoir operations optimization via fuzzy criterion decision processes, Fuzzy Optimization and Decision Making, Vol.5, No.3, 289-305, 2006. [40] Feng X, and Liu YK, Measurability criteria for fuzzy random vectors, Fuzzy Optimization and Decision Making, Vol.5, No.3, 245-253, 2006. [41] Feng Y, Yang LX, A two-objective fuzzy k-cardinality assignment problem, Journal of Computational and Applied Mathematics, Vol.197, No.1, 233-244, 2006. [42] Fung RYK, Chen YZ, Chen L, A fuzzy expected value-based goal programing model for product planning using quality function deployment, Engineering Optimization, Vol.37, No.6, 633-647, 2005. [43] Gao J, and Liu B, New primitive chance measures of fuzzy random event, International Journal of Fuzzy Systems, Vol.3, No.4, 527-531, 2001. [44] Gao J, Liu B, and Gen M, A hybrid intelligent algorithm for stochastic multilevel programming, IEEJ Transactions on Electronics, Information and Systems, Vol.124-C, No.10, 1991-1998, 2004. [45] Gao J, and Liu B, Fuzzy multilevel programming with a hybrid intelligent algorithm, Computers & Mathematics with Applications, Vol.49, 1539-1548, 2005. [46] Gao J, and Lu M, Fuzzy quadratic minimum spanning tree problem, Applied Mathematics and Computation, Vol.164, No.3, 773-788, 2005. 
[47] Gao J, and Feng X, A hybrid intelligent algorithm for fuzzy dynamic inventory problem, Journal of Information and Computing Science, Vol.1, No.4, 235-244, 2006.
[48] Gao J, Credibilistic game with fuzzy information, Journal of Uncertain Systems, Vol.1, No.1, 74-80, 2007.
[49] Gao J, and Zhou J, Uncertain Process Online, http://orsc.edu.cn/process.
[50] Gao J, Credibilistic option pricing: a new model, http://orsc.edu.cn/process/071124.pdf.
[51] Gao X, Option pricing formula for hybrid stock model with randomness and fuzziness, http://orsc.edu.cn/process/080112.pdf.
[52] Gil MA, Lopez-Diaz M, and Ralescu DA, Overview on the development of fuzzy random variables, Fuzzy Sets and Systems, Vol.157, No.19, 2546-2557, 2006.
[53] González A, A study of the ranking function approach through mean values, Fuzzy Sets and Systems, Vol.35, 29-41, 1990.
[54] Guan J, and Bell DA, Evidence Theory and its Applications, North-Holland, Amsterdam, 1991.
[55] Guo R, Zhao R, Guo D, and Dunne T, Random fuzzy variable modeling on repairable system, Journal of Uncertain Systems, Vol.1, No.3, 222-234, 2007.
[56] Guo R, Guo D, Thiart C, and Li X, Bivariate credibility-copulas, Journal of Uncertain Systems, Vol.1, No.4, 303-314, 2007.
[57] Hansen E, Global Optimization Using Interval Analysis, Marcel Dekker, New York, 1992.
[58] He Y, and Xu J, A class of random fuzzy programming model and its application to vehicle routing problem, World Journal of Modelling and Simulation, Vol.1, No.1, 3-11, 2005.
[59] Heilpern S, The expected value of a fuzzy number, Fuzzy Sets and Systems, Vol.47, 81-86, 1992.
[60] Higashi M, and Klir GJ, On measures of fuzziness and fuzzy complements, International Journal of General Systems, Vol.8, 169-180, 1982.
[61] Higashi M, and Klir GJ, Measures of uncertainty and information based on possibility distributions, International Journal of General Systems, Vol.9, 43-58, 1983.
[62] Hisdal E, Conditional possibilities independence and noninteraction, Fuzzy Sets and Systems, Vol.1, 283-297, 1978.
[63] Hisdal E, Logical Structures for Representation of Knowledge and Uncertainty, Physica-Verlag, Heidelberg, 1998.
[64] Hong DH, Renewal process with T-related fuzzy inter-arrival times and fuzzy rewards, Information Sciences, Vol.176, No.16, 2386-2395, 2006.
[65] Inuiguchi M, and Ramík J, Possibilistic linear programming: A brief review of fuzzy mathematical programming and a comparison with stochastic programming in portfolio selection problem, Fuzzy Sets and Systems, Vol.111, No.1, 3-28, 2000.
[66] Ishibuchi H, and Tanaka H, Multiobjective programming in optimization of the interval objective function, European Journal of Operational Research, Vol.48, 219-225, 1990.
[67] Jaynes ET, Information theory and statistical mechanics, Physical Review, Vol.106, No.4, 620-630, 1957.
[68] Ji XY, and Shao Z, Model and algorithm for bilevel Newsboy problem with fuzzy demands and discounts, Applied Mathematics and Computation, Vol.172, No.1, 163-174, 2006.
[69] Ji XY, and Iwamura K, New models for shortest path problem with fuzzy arc lengths, Applied Mathematical Modelling, Vol.31, 259-269, 2007.
[70] John RI, Type 2 fuzzy sets: An appraisal of theory and applications, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.6, No.6, 563-576, 1998.
[71] Kacprzyk J, and Esogbue AO, Fuzzy dynamic programming: Main developments and applications, Fuzzy Sets and Systems, Vol.81, 31-45, 1996.
[72] Kacprzyk J, Multistage Fuzzy Control, Wiley, Chichester, 1997.
[73] Karnik NN, Mendel JM, and Liang Q, Type-2 fuzzy logic systems, IEEE Transactions on Fuzzy Systems, Vol.7, No.6, 643-658, 1999.
[74] Karnik NN, Mendel JM, and Liang Q, Centroid of a type-2 fuzzy set, Information Sciences, Vol.132, 195-220, 2001.
[75] Kaufmann A, Introduction to the Theory of Fuzzy Subsets, Vol.I, Academic Press, New York, 1975.
[76] Kaufmann A, and Gupta MM, Introduction to Fuzzy Arithmetic: Theory and Applications, Van Nostrand Reinhold, New York, 1985.
[77] Kaufmann A, and Gupta MM, Fuzzy Mathematical Models in Engineering and Management Science, 2nd ed., North-Holland, Amsterdam, 1991.
[78] Ke H, and Liu B, Project scheduling problem with stochastic activity duration times, Applied Mathematics and Computation, Vol.168, No.1, 342-353, 2005.
[79] Ke H, and Liu B, Project scheduling problem with mixed uncertainty of randomness and fuzziness, European Journal of Operational Research, Vol.183, No.1, 135-147, 2007.
[80] Ke H, and Liu B, Fuzzy project scheduling problem and its hybrid intelligent algorithm, Technical Report, 2005.
[81] Klement EP, Puri ML, and Ralescu DA, Limit theorems for fuzzy random variables, Proceedings of the Royal Society of London Series A, Vol.407, 171-182, 1986.
[82] Klir GJ, and Folger TA, Fuzzy Sets, Uncertainty, and Information, Prentice-Hall, Englewood Cliffs, NJ, 1988.
[83] Klir GJ, and Yuan B, Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice-Hall, New Jersey, 1995.
[84] Knopfmacher J, On measures of fuzziness, Journal of Mathematical Analysis and Applications, Vol.49, 529-534, 1975.
[85] Kosko B, Fuzzy entropy and conditioning, Information Sciences, Vol.40, 165-174, 1986.
[86] Kruse R, and Meyer KD, Statistics with Vague Data, D. Reidel Publishing Company, Dordrecht, 1987.
[87] Kwakernaak H, Fuzzy random variables–I: Definitions and theorems, Information Sciences, Vol.15, 1-29, 1978.
[88] Kwakernaak H, Fuzzy random variables–II: Algorithms and examples for the discrete case, Information Sciences, Vol.17, 253-278, 1979.
[89] Lai YJ, and Hwang CL, Fuzzy Multiple Objective Decision Making: Methods and Applications, Springer-Verlag, New York, 1994.
[90] Lee ES, Fuzzy multiple level programming, Applied Mathematics and Computation, Vol.120, 79-90, 2001.
[91] Lee KH, First Course on Fuzzy Theory and Applications, Springer-Verlag, Berlin, 2005.
[92] Lertworasirkul S, Fang SC, Joines JA, and Nuttle HLW, Fuzzy data envelopment analysis (DEA): a possibility approach, Fuzzy Sets and Systems, Vol.139, No.2, 379-394, 2003.
[93] Li HL, and Yu CS, A fuzzy multiobjective program with quasiconcave membership functions and fuzzy coefficients, Fuzzy Sets and Systems, Vol.109, No.1, 59-81, 2000.
[94] Li J, Xu J, and Gen M, A class of multiobjective linear programming model with fuzzy random coefficients, Mathematical and Computer Modelling, Vol.44, Nos.11-12, 1097-1113, 2006.
[95] Li P, and Liu B, Entropy of credibility distributions for fuzzy variables, IEEE Transactions on Fuzzy Systems, Vol.16, No.1, 123-129, 2008.
[96] Li SM, Ogura Y, and Nguyen HT, Gaussian processes and martingales for fuzzy valued random variables with continuous parameter, Information Sciences, Vol.133, 7-21, 2001.
[97] Li SM, Ogura Y, and Kreinovich V, Limit Theorems and Applications of Set-Valued and Fuzzy Set-Valued Random Variables, Kluwer, Boston, 2002.
[98] Li SQ, Zhao RQ, and Tang WS, Fuzzy random homogeneous Poisson process and compound Poisson process, Journal of Information and Computing Science, Vol.1, No.4, 207-224, 2006.
[99] Li X, and Liu B, The independence of fuzzy variables with applications, International Journal of Natural Sciences & Technology, Vol.1, No.1, 95-100, 2006.
[100] Li X, and Liu B, A sufficient and necessary condition for credibility measures, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.14, No.5, 527-535, 2006.
[101] Li X, and Liu B, New independence definition of fuzzy random variable and random fuzzy variable, World Journal of Modelling and Simulation, Vol.2, No.5, 338-342, 2006.
[102] Li X, and Liu B, Maximum entropy principle for fuzzy variables, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.15, Supp.2, 43-52, 2007.
[103] Li X, and Liu B, Chance measure for hybrid events with fuzziness and randomness, Soft Computing, to be published.
[104] Li X, and Liu B, Moment estimation theorems for various types of uncertain variable, Technical Report, 2007.
[105] Li X, and Liu B, On distance between fuzzy variables, Technical Report, 2007.
[106] Li X, and Liu B, Conditional chance measure for hybrid events, Technical Report, 2007.
[107] Li X, and Liu B, Cross-entropy and generalized entropy for fuzzy variables, Technical Report, 2007.
[108] Li X, Expected value and variance of geometric Liu process, http://orsc.edu.cn/process/071123.pdf.
[109] Liu B, Dependent-chance goal programming and its genetic algorithm based approach, Mathematical and Computer Modelling, Vol.24, No.7, 43-52, 1996.
[110] Liu B, and Esogbue AO, Fuzzy criterion set and fuzzy criterion dynamic programming, Journal of Mathematical Analysis and Applications, Vol.199, No.1, 293-311, 1996.
[111] Liu B, Dependent-chance programming: A class of stochastic optimization, Computers & Mathematics with Applications, Vol.34, No.12, 89-104, 1997.
[112] Liu B, and Iwamura K, Modelling stochastic decision systems using dependent-chance programming, European Journal of Operational Research, Vol.101, No.1, 193-203, 1997.
[113] Liu B, and Iwamura K, Chance constrained programming with fuzzy parameters, Fuzzy Sets and Systems, Vol.94, No.2, 227-237, 1998.
[114] Liu B, and Iwamura K, A note on chance constrained programming with fuzzy coefficients, Fuzzy Sets and Systems, Vol.100, Nos.1-3, 229-233, 1998.
[115] Liu B, Minimax chance constrained programming models for fuzzy decision systems, Information Sciences, Vol.112, Nos.1-4, 25-38, 1998.
[116] Liu B, Dependent-chance programming with fuzzy decisions, IEEE Transactions on Fuzzy Systems, Vol.7, No.3, 354-360, 1999.
[117] Liu B, and Esogbue AO, Decision Criteria and Optimal Inventory Processes, Kluwer, Boston, 1999.
[118] Liu B, Uncertain Programming, Wiley, New York, 1999.
[119] Liu B, Dependent-chance programming in fuzzy environments, Fuzzy Sets and Systems, Vol.109, No.1, 97-106, 2000.
[120] Liu B, Uncertain programming: A unifying optimization theory in various uncertain environments, Applied Mathematics and Computation, Vol.120, Nos.1-3, 227-234, 2001.
[121] Liu B, and Iwamura K, Fuzzy programming with fuzzy decisions and fuzzy simulation-based genetic algorithm, Fuzzy Sets and Systems, Vol.122, No.2, 253-262, 2001.
[122] Liu B, Fuzzy random chance-constrained programming, IEEE Transactions on Fuzzy Systems, Vol.9, No.5, 713-720, 2001.
[123] Liu B, Fuzzy random dependent-chance programming, IEEE Transactions on Fuzzy Systems, Vol.9, No.5, 721-726, 2001.
[124] Liu B, Theory and Practice of Uncertain Programming, Physica-Verlag, Heidelberg, 2002.
[125] Liu B, Toward fuzzy optimization without mathematical ambiguity, Fuzzy Optimization and Decision Making, Vol.1, No.1, 43-63, 2002.
[126] Liu B, and Liu YK, Expected value of fuzzy variable and fuzzy expected value models, IEEE Transactions on Fuzzy Systems, Vol.10, No.4, 445-450, 2002.
[127] Liu B, Random fuzzy dependent-chance programming and its hybrid intelligent algorithm, Information Sciences, Vol.141, Nos.3-4, 259-271, 2002.
[128] Liu B, Inequalities and convergence concepts of fuzzy and rough variables, Fuzzy Optimization and Decision Making, Vol.2, No.2, 87-100, 2003.
[129] Liu B, Uncertainty Theory, Springer-Verlag, Berlin, 2004.
[130] Liu B, A survey of credibility theory, Fuzzy Optimization and Decision Making, Vol.5, No.4, 387-408, 2006.
[131] Liu B, A survey of entropy of fuzzy variables, Journal of Uncertain Systems, Vol.1, No.1, 4-13, 2007.
[132] Liu B, Uncertainty Theory, 2nd ed., Springer-Verlag, Berlin, 2007.
[133] Liu B, Fuzzy process, hybrid process and uncertain process, Journal of Uncertain Systems, Vol.2, No.1, 3-16, 2008.
[134] Liu B, Theory and Practice of Uncertain Programming, 2nd ed., http://orsc.edu.cn/liu/up.pdf.
[135] Liu LZ, and Li YZ, The fuzzy quadratic assignment problem with penalty: New models and genetic algorithm, Applied Mathematics and Computation, Vol.174, No.2, 1229-1244, 2006.
[136] Liu XC, Entropy, distance measure and similarity measure of fuzzy sets and their relations, Fuzzy Sets and Systems, Vol.52, 305-318, 1992.
[137] Liu XW, Measuring the satisfaction of constraints in fuzzy linear programming, Fuzzy Sets and Systems, Vol.122, No.2, 263-275, 2001.
[138] Liu YK, and Liu B, Random fuzzy programming with chance measures defined by fuzzy integrals, Mathematical and Computer Modelling, Vol.36, Nos.4-5, 509-524, 2002.
[139] Liu YK, and Liu B, Fuzzy random programming problems with multiple criteria, Asian Information-Science-Life, Vol.1, No.3, 249-256, 2002.
[140] Liu YK, and Liu B, Fuzzy random variables: A scalar expected value operator, Fuzzy Optimization and Decision Making, Vol.2, No.2, 143-160, 2003.
[141] Liu YK, and Liu B, Expected value operator of random fuzzy variable and random fuzzy expected value models, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.11, No.2, 195-215, 2003.
[142] Liu YK, and Liu B, A class of fuzzy random optimization: Expected value models, Information Sciences, Vol.155, Nos.1-2, 89-102, 2003.
[143] Liu YK, and Liu B, On minimum-risk problems in fuzzy random decision systems, Computers & Operations Research, Vol.32, No.2, 257-283, 2005.
[144] Liu YK, and Liu B, Fuzzy random programming with equilibrium chance constraints, Information Sciences, Vol.170, 363-395, 2005.
[145] Liu YK, Fuzzy programming with recourse, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.13, No.4, 381-413, 2005.
[146] Liu YK, Convergent results about the use of fuzzy simulation in fuzzy optimization problems, IEEE Transactions on Fuzzy Systems, Vol.14, No.2, 295-304, 2006.
[147] Liu YK, and Wang SM, On the properties of credibility critical value functions, Journal of Information and Computing Science, Vol.1, No.4, 195-206, 2006.
[148] Liu YK, and Gao J, The independence of fuzzy variables in credibility theory and its applications, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.15, Supp.2, 1-20, 2007.
[149] Loo SG, Measures of fuzziness, Cybernetica, Vol.20, 201-210, 1977.
[150] Lopez-Diaz M, and Ralescu DA, Tools for fuzzy random variables: Embeddings and measurabilities, Computational Statistics & Data Analysis, Vol.51, No.1, 109-114, 2006.
[151] Lu M, On crisp equivalents and solutions of fuzzy programming with different chance measures, Information: An International Journal, Vol.6, No.2, 125-133, 2003.
[152] Lucas C, and Araabi BN, Generalization of the Dempster-Shafer Theory: A fuzzy-valued measure, IEEE Transactions on Fuzzy Systems, Vol.7, No.3, 255-270, 1999.
[153] Luhandjula MK, Fuzziness and randomness in an optimization framework, Fuzzy Sets and Systems, Vol.77, 291-297, 1996.
[154] Luhandjula MK, and Gupta MM, On fuzzy stochastic optimization, Fuzzy Sets and Systems, Vol.81, 47-55, 1996.
[155] Luhandjula MK, Optimisation under hybrid uncertainty, Fuzzy Sets and Systems, Vol.146, No.2, 187-203, 2004.
[156] Luhandjula MK, Fuzzy stochastic linear programming: Survey and future research directions, European Journal of Operational Research, Vol.174, No.3, 1353-1367, 2006.
[157] Maiti MK, and Maiti MA, Fuzzy inventory model with two warehouses under possibility constraints, Fuzzy Sets and Systems, Vol.157, No.1, 52-73, 2006.
[158] Maleki HR, Tata M, and Mashinchi M, Linear programming with fuzzy variables, Fuzzy Sets and Systems, Vol.109, No.1, 21-33, 2000.
[159] Merton RC, Theory of rational option pricing, Bell Journal of Economics and Management Science, Vol.4, 141-183, 1973.
[160] Mizumoto M, and Tanaka K, Some properties of fuzzy sets of type 2, Information and Control, Vol.31, 312-340, 1976.
[161] Mohammed W, Chance constrained fuzzy goal programming with right-hand side uniform random variable coefficients, Fuzzy Sets and Systems, Vol.109, No.1, 107-110, 2000.
[162] Molchanov IS, Limit Theorems for Unions of Random Closed Sets, Springer-Verlag, Berlin, 1993.
[163] Nahmias S, Fuzzy variables, Fuzzy Sets and Systems, Vol.1, 97-110, 1978.
[164] Negoita CV, and Ralescu D, On fuzzy optimization, Kybernetes, Vol.6, 193-195, 1977.
[165] Negoita CV, and Ralescu D, Simulation, Knowledge-based Computing, and Fuzzy Statistics, Van Nostrand Reinhold, New York, 1987.
[166] Neumaier A, Interval Methods for Systems of Equations, Cambridge University Press, New York, 1990.
[167] Nguyen HT, On conditional possibility distributions, Fuzzy Sets and Systems, Vol.1, 299-309, 1978.
[168] Nguyen HT, Fuzzy sets and probability, Fuzzy Sets and Systems, Vol.90, 129-132, 1997.
[169] Nguyen HT, Kreinovich V, and Zuo Q, Interval-valued degrees of belief: Applications of interval computations to expert systems and intelligent control, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.5, 317-358, 1997.
[170] Nguyen HT, Nguyen NT, and Wang TH, On capacity functionals in interval probabilities, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.5, 359-377, 1997.
[171] Nguyen HT, Kreinovich V, and Shekhter V, On the possibility of using complex values in fuzzy logic for representing inconsistencies, International Journal of Intelligent Systems, Vol.13, 683-714, 1998.
[172] Nguyen HT, Kreinovich V, and Wu BL, Fuzzy/probability similar to fractal/smooth, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.7, 363-370, 1999.
[173] Nguyen HT, and Nguyen NT, On Chu spaces in uncertainty analysis, International Journal of Intelligent Systems, Vol.15, 425-440, 2000.
[174] Nguyen HT, Some mathematical structures for computational information, Information Sciences, Vol.128, 67-89, 2000.
[175] Nguyen VH, Fuzzy stochastic goal programming problems, European Journal of Operational Research, Vol.176, No.1, 77-86, 2007.
[176] Øksendal B, Stochastic Differential Equations, 6th ed., Springer-Verlag, Berlin, 2005.
[177] Pal NR, and Pal SK, Object background segmentation using a new definition of entropy, IEE Proc. E, Vol.136, 284-295, 1989.
[178] Pal NR, and Pal SK, Higher order fuzzy entropy and hybrid entropy of a set, Information Sciences, Vol.61, 211-231, 1992.
[179] Pal NR, Bezdek JC, and Hemasinha R, Uncertainty measures for evidential reasoning I: a review, International Journal of Approximate Reasoning, Vol.7, 165-183, 1992.
[180] Pal NR, Bezdek JC, and Hemasinha R, Uncertainty measures for evidential reasoning II: a new measure, International Journal of Approximate Reasoning, Vol.8, 1-16, 1993.
[181] Pal NR, and Bezdek JC, Measuring fuzzy uncertainty, IEEE Transactions on Fuzzy Systems, Vol.2, 107-118, 1994.
[182] Pawlak Z, Rough sets, International Journal of Information and Computer Sciences, Vol.11, No.5, 341-356, 1982.
[183] Pawlak Z, Rough sets and fuzzy sets, Fuzzy Sets and Systems, Vol.17, 99-102, 1985.
[184] Pawlak Z, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer, Dordrecht, 1991.
[185] Pedrycz W, Optimization schemes for decomposition of fuzzy relations, Fuzzy Sets and Systems, Vol.100, 301-325, 1998.
[186] Peng J, and Liu B, Stochastic goal programming models for parallel machine scheduling problems, Asian Information-Science-Life, Vol.1, No.3, 257-266, 2002.
[187] Peng J, and Liu B, Parallel machine scheduling models with fuzzy processing times, Information Sciences, Vol.166, Nos.1-4, 49-66, 2004.
[188] Peng J, and Liu B, A framework of birandom theory and optimization methods, Information: An International Journal, Vol.9, No.4, 629-640, 2006.
[189] Peng J, and Zhao XD, Some theoretical aspects of the critical values of birandom variable, Journal of Information and Computing Science, Vol.1, No.4, 225-234, 2006.
[190] Peng J, and Liu B, Birandom variables and birandom programming, Computers & Industrial Engineering, Vol.53, No.3, 433-453, 2007.
[191] Peng J, Credibilistic stopping problems for fuzzy stock market, http://orsc.edu.cn/process/071125.pdf.
[192] Puri ML, and Ralescu D, Differentials of fuzzy functions, Journal of Mathematical Analysis and Applications, Vol.91, 552-558, 1983.
[193] Puri ML, and Ralescu D, Fuzzy random variables, Journal of Mathematical Analysis and Applications, Vol.114, 409-422, 1986.
[194] Qin ZF, and Li X, Option pricing formula for fuzzy financial market, Journal of Uncertain Systems, Vol.2, No.1, 17-21, 2008.
[195] Qin ZF, and Liu B, On some special hybrid variables, Technical Report, 2007.
[196] Qin ZF, On analytic functions of complex Liu process, http://orsc.edu.cn/process/071026.pdf.
[197] Raj PA, and Kumer DN, Ranking alternatives with fuzzy weights using maximizing set and minimizing set, Fuzzy Sets and Systems, Vol.105, 365-375, 1999.
[198] Ralescu DA, and Sugeno M, Fuzzy integral representation, Fuzzy Sets and Systems, Vol.84, No.2, 127-133, 1996.
[199] Ralescu AL, and Ralescu DA, Extensions of fuzzy aggregation, Fuzzy Sets and Systems, Vol.86, No.3, 321-330, 1997.
[200] Ramer A, Conditional possibility measures, International Journal of Cybernetics and Systems, Vol.20, 233-247, 1989.
[201] Ramík J, Extension principle in fuzzy optimization, Fuzzy Sets and Systems, Vol.19, 29-35, 1986.
[202] Ramík J, and Rommelfanger H, Fuzzy mathematical programming based on some inequality relations, Fuzzy Sets and Systems, Vol.81, 77-88, 1996.
[203] Saade JJ, Maximization of a function over a fuzzy domain, Fuzzy Sets and Systems, Vol.62, 55-70, 1994.
[204] Sakawa M, Nishizaki I, and Uemura Y, Interactive fuzzy programming for multi-level linear programming problems with fuzzy parameters, Fuzzy Sets and Systems, Vol.109, No.1, 3-19, 2000.
[205] Sakawa M, Nishizaki I, and Uemura Y, Interactive fuzzy programming for two-level linear fractional programming problems with fuzzy parameters, Fuzzy Sets and Systems, Vol.115, 93-103, 2000.
[206] Shafer G, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, 1976.
[207] Shannon CE, The Mathematical Theory of Communication, The University of Illinois Press, Urbana, 1949.
[208] Shao Z, and Ji XY, Fuzzy multi-product constraint newsboy problem, Applied Mathematics and Computation, Vol.180, No.1, 7-15, 2006.
[209] Shih HS, Lai YJ, and Lee ES, Fuzzy approach for multilevel programming problems, Computers and Operations Research, Vol.23, 73-91, 1996.
[210] Shreve SE, Stochastic Calculus for Finance II: Continuous-Time Models, Springer, Berlin, 2004.
[211] Slowinski R, and Teghem J, Fuzzy versus stochastic approaches to multicriteria linear programming under uncertainty, Naval Research Logistics, Vol.35, 673-695, 1988.
[212] Slowinski R, and Vanderpooten D, A generalized definition of rough approximations based on similarity, IEEE Transactions on Knowledge and Data Engineering, Vol.12, No.2, 331-336, 2000.
[213] Steuer RE, Algorithm for linear programming problems with interval objective function coefficients, Mathematics of Operations Research, Vol.6, 333-348, 1981.
[214] Sugeno M, Theory of Fuzzy Integrals and its Applications, Ph.D. Dissertation, Tokyo Institute of Technology, 1974.
[215] Szmidt E, and Kacprzyk J, Distances between intuitionistic fuzzy sets, Fuzzy Sets and Systems, Vol.114, 505-518, 2000.
[216] Szmidt E, and Kacprzyk J, Entropy for intuitionistic fuzzy sets, Fuzzy Sets and Systems, Vol.118, 467-477, 2001.
[217] Tanaka H, and Asai K, Fuzzy linear programming problems with fuzzy numbers, Fuzzy Sets and Systems, Vol.13, 1-10, 1984.
[218] Tanaka H, and Asai K, Fuzzy solutions in fuzzy linear programming problems, IEEE Transactions on Systems, Man and Cybernetics, Vol.14, 325-328, 1984.
[219] Tanaka H, and Guo P, Possibilistic Data Analysis for Operations Research, Physica-Verlag, Heidelberg, 1999.
[220] Torabi H, Davvaz B, and Behboodian J, Fuzzy random events in incomplete probability models, Journal of Intelligent & Fuzzy Systems, Vol.17, No.2, 183-188, 2006.
[221] Wang G, and Liu B, New theorems for fuzzy sequence convergence, Proceedings of the Second International Conference on Information and Management Sciences, Chengdu, China, August 24-30, 2003, pp.100-105.
[222] Wang Z, and Klir GJ, Fuzzy Measure Theory, Plenum Press, New York, 1992.
[223] Yager RR, On measures of fuzziness and negation, Part I: Membership in the unit interval, International Journal of General Systems, Vol.5, 221-229, 1979.
[224] Yager RR, On measures of fuzziness and negation, Part II: Lattices, Information and Control, Vol.44, 236-260, 1980.
[225] Yager RR, A procedure for ordering fuzzy subsets of the unit interval, Information Sciences, Vol.24, 143-161, 1981.
[226] Yager RR, Generalized probabilities of fuzzy events from fuzzy belief structures, Information Sciences, Vol.28, 45-62, 1982.
[227] Yager RR, Measuring tranquility and anxiety in decision making: an application of fuzzy sets, International Journal of General Systems, Vol.8, 139-144, 1982.
[228] Yager RR, Entropy and specificity in a mathematical theory of evidence, International Journal of General Systems, Vol.9, 249-260, 1983.
[229] Yager RR, On ordered weighted averaging aggregation operators in multicriteria decision making, IEEE Transactions on Systems, Man and Cybernetics, Vol.18, 183-190, 1988.
[230] Yager RR, Decision making under Dempster-Shafer uncertainties, International Journal of General Systems, Vol.20, 233-245, 1992.
[231] Yager RR, On the specificity of a possibility distribution, Fuzzy Sets and Systems, Vol.50, 279-292, 1992.
[232] Yager RR, Measures of entropy and fuzziness related to aggregation operators, Information Sciences, Vol.82, 147-166, 1995.
[233] Yager RR, Modeling uncertainty using partial information, Information Sciences, Vol.121, 271-294, 1999.
[234] Yager RR, Decision making with fuzzy probability assessments, IEEE Transactions on Fuzzy Systems, Vol.7, 462-466, 1999.
[235] Yager RR, On the entropy of fuzzy measures, IEEE Transactions on Fuzzy Systems, Vol.8, 453-461, 2000.
[236] Yager RR, On the evaluation of uncertain courses of action, Fuzzy Optimization and Decision Making, Vol.1, 13-41, 2002.
[237] Yang L, and Liu B, On inequalities and critical values of fuzzy random variable, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.13, No.2, 163-175, 2005.
[238] Yang L, and Liu B, A sufficient and necessary condition for chance distribution of birandom variable, Information: An International Journal, Vol.9, No.1, 33-36, 2006.
[239] Yang L, and Liu B, On continuity theorem for characteristic function of fuzzy variable, Journal of Intelligent and Fuzzy Systems, Vol.17, No.3, 325-332, 2006.
[240] Yang L, and Liu B, Chance distribution of fuzzy random variable and laws of large numbers, Technical Report, 2004.
[241] Yang N, and Wen FS, A chance constrained programming approach to transmission system expansion planning, Electric Power Systems Research, Vol.75, Nos.2-3, 171-177, 2005.
[242] Yazenin AV, On the problem of possibilistic optimization, Fuzzy Sets and Systems, Vol.81, 133-140, 1996.
[243] You C, Multidimensional Liu process, differential and integral, Proceedings of the First Intelligent Computing Conference, Lushan, October 10-13, 2007, pp.153-158. (Also available at http://orsc.edu.cn/process/071015.pdf)
[244] You C, Some extensions of Wiener-Liu process and Ito-Liu integral, http://orsc.edu.cn/process/071019.pdf.
[245] Zadeh LA, Fuzzy sets, Information and Control, Vol.8, 338-353, 1965.
[246] Zadeh LA, Outline of a new approach to the analysis of complex systems and decision processes, IEEE Transactions on Systems, Man and Cybernetics, Vol.3, 28-44, 1973.
[247] Zadeh LA, The concept of a linguistic variable and its application to approximate reasoning, Information Sciences, Vol.8, 199-251, 1975.
[248] Zadeh LA, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems, Vol.1, 3-28, 1978.
[249] Zadeh LA, A theory of approximate reasoning, In: J Hayes, D Michie and RM Thrall, eds, Mathematical Frontiers of the Social and Policy Sciences, Westview Press, Boulder, Colorado, 69-129, 1979.
[250] Zhao R, and Liu B, Stochastic programming models for general redundancy optimization problems, IEEE Transactions on Reliability, Vol.52, No.2, 181-191, 2003.
[251] Zhao R, and Liu B, Renewal process with fuzzy interarrival times and rewards, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.11, No.5, 573-586, 2003.
[252] Zhao R, and Liu B, Redundancy optimization problems with uncertainty of combining randomness and fuzziness, European Journal of Operational Research, Vol.157, No.3, 716-735, 2004.
[253] Zhao R, and Liu B, Standby redundancy optimization problems with fuzzy lifetimes, Computers & Industrial Engineering, Vol.49, No.2, 318-338, 2005.
[254] Zhao R, Tang WS, and Yun HL, Random fuzzy renewal process, European Journal of Operational Research, Vol.169, No.1, 189-201, 2006.
[255] Zhao R, and Tang WS, Some properties of fuzzy random renewal process, IEEE Transactions on Fuzzy Systems, Vol.14, No.2, 173-179, 2006.
[256] Zheng Y, and Liu B, Fuzzy vehicle routing model with credibility measure and its hybrid intelligent algorithm, Applied Mathematics and Computation, Vol.176, No.2, 673-683, 2006.
[257] Zhou J, and Liu B, New stochastic models for capacitated location-allocation problem, Computers & Industrial Engineering, Vol.45, No.1, 111-125, 2003.
[258] Zhou J, and Liu B, Analysis and algorithms of bifuzzy systems, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.12, No.3, 357-376, 2004.
[259] Zhou J, and Liu B, Convergence concepts of bifuzzy sequence, Asian Information-Science-Life, Vol.2, No.3, 297-310, 2004.
[260] Zhou J, and Liu B, Modeling capacitated location-allocation problem with fuzzy demands, Computers & Industrial Engineering, Vol.53, No.3, 454-468, 2007.
[261] Zhu Y, and Liu B, Continuity theorems and chance distribution of random fuzzy variable, Proceedings of the Royal Society of London Series A, Vol.460, 2505-2519, 2004.
[262] Zhu Y, and Liu B, Some inequalities of random fuzzy variables with application to moment convergence, Computers & Mathematics with Applications, Vol.50, Nos.5-6, 719-727, 2005.
[263] Zhu Y, and Ji XY, Expected values of functions of fuzzy variables, Journal of Intelligent and Fuzzy Systems, Vol.17, No.5, 471-478, 2006.
[264] Zhu Y, and Liu B, Convergence concepts of random fuzzy sequence, Information: An International Journal, Vol.9, No.6, 845-852, 2006.
[265] Zhu Y, and Liu B, Fourier spectrum of credibility distribution for fuzzy variables, International Journal of General Systems, Vol.36, No.1, 111-123, 2007.
[266] Zhu Y, and Liu B, A sufficient and necessary condition for chance distribution of random fuzzy variables, International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems, Vol.15, Supp.2, 21-28, 2007.
[267] Zhu Y, Fuzzy optimal control with application to portfolio selection, http://orsc.edu.cn/process/080117.pdf.
[268] Zimmermann HJ, Fuzzy Set Theory and its Applications, Kluwer Academic Publishers, Boston, 1985.
List of Frequently Used Symbols

ξ, η, τ: random, fuzzy, hybrid, or uncertain variables
ξ, η, τ: random, fuzzy, hybrid, or uncertain vectors
µ, ν: membership functions
φ, ψ: probability or credibility density functions
Φ, Ψ: probability or credibility distributions
Pr: probability measure
Cr: credibility measure
Ch: chance measure
M: uncertain measure
E: expected value
V: variance
H: entropy
(Γ, L, M): uncertainty space
(Ω, A, Pr): probability space
(Θ, P, Cr): credibility space
(Θ, P, Cr) × (Ω, A, Pr): chance space
∅: empty set
ℜ: set of real numbers
ℜ^n: set of n-dimensional real vectors
∨: maximum operator
∧: minimum operator
ai ↑ a: a1 ≤ a2 ≤ ··· and ai → a
ai ↓ a: a1 ≥ a2 ≥ ··· and ai → a
Ai ↑ A: A1 ⊂ A2 ⊂ ··· and A = A1 ∪ A2 ∪ ···
Ai ↓ A: A1 ⊃ A2 ⊃ ··· and A = A1 ∩ A2 ∩ ···
Index

algebra, 213
Borel algebra, 214
Borel set, 214
Brownian motion, 44
canonical process, 207
Cantor function, 220
Cantor set, 214
chance asymptotic theorem, 137
chance density function, 145
chance distribution, 143
chance measure, 131
chance semicontinuity law, 136
chance space, 129
chance subadditivity theorem, 134
Chebyshev inequality, 33, 108, 158
conditional chance, 165
conditional credibility, 114
conditional membership function, 116
conditional probability, 40
conditional uncertainty, 202
convergence almost surely, 35, 110
convergence in chance, 160
convergence in credibility, 110
convergence in distribution, 36, 111
convergence in mean, 36, 110
convergence in probability, 35
countable additivity axiom, 1
countable subadditivity axiom, 178
credibility asymptotic theorem, 57
credibility density function, 72
credibility distribution, 69
credibility extension theorem, 58
credibility inversion theorem, 65
credibility measure, 53
credibility semicontinuity law, 57
credibility space, 60
credibility subadditivity theorem, 55
critical value, 26, 96, 153, 194
distance, 32, 106, 157, 196
entropy, 28, 100, 156, 195
equipossible fuzzy variable, 68
Euler-Lagrange equation, 224
event, 1, 53, 130, 177
expected value, 14, 80, 146, 187
exponential distribution, 10
exponential membership function, 95
extension principle of Zadeh, 77
Fubini theorem, 222
fuzzy calculus, 124
fuzzy differential equation, 128
fuzzy integral equation, 128
fuzzy process, 119
fuzzy random variable, 129
fuzzy sequence, 110
fuzzy variable, 63
fuzzy vector, 64
hazard rate, 42, 118
Hölder's inequality, 34, 108, 158, 198
hybrid calculus, 171
hybrid differential equation, 174
hybrid integral equation, 174
hybrid process, 169
hybrid sequence, 160
hybrid variable, 137
hybrid vector, 142
identification function, 184
independence, 11, 74, 151, 193
Ito formula, 48
Ito integral, 47
Ito process, 49
Jensen's inequality, 35, 109, 159, 199
Lebesgue integral, 221
Lebesgue measure, 216
Lebesgue-Stieltjes integral, 223
Lebesgue-Stieltjes measure, 222
Lebesgue unit interval, 3
Markov inequality, 33, 108, 158, 198
maximality axiom, 54
maximum entropy principle, 30, 103
maximum uncertainty principle, 225
measurable function, 218
membership function, 65
Minkowski inequality, 34, 109, 159
moment, 25, 95, 151, 191
monotone convergence theorem, 222
monotone class theorem, 214
monotonicity axiom, 53, 178
nonnegativity axiom, 1
normal distribution, 10
normal membership function, 93
normality axiom, 1, 53, 178
optimistic value, see critical value
pessimistic value, see critical value
power set, 213
probability continuity theorem, 2
probability density function, 9
probability distribution, 7
probability measure, 3
probability space, 3
product credibility axiom, 61
product credibility space, 62
product credibility theorem, 61
product probability space, 4
product probability theorem, 4
random fuzzy variable, 129
random sequence, 35
random variable, 4
random vector, 5
renewal process, 43, 119, 169, 206
self-duality axiom, 53, 178
set function, 1
σ-algebra, 213
simple function, 218
singular function, 220
step function, 218
stochastic differential equation, 50
stochastic integral equation, 50
stochastic process, 43
stock model, 124, 171, 208
trapezoidal fuzzy variable, 68
triangular fuzzy variable, 68
uncertain calculus, 209
uncertain differential equation, 211
uncertain integral equation, 211
uncertain measure, 178
uncertain process, 205
Index uncertain sequence, 200 uncertain variable, 181 uncertain vector, 182 uncertainty asymptotic theorem, 180 uncertainty density function, 185 uncertainty distribution, 185 uncertainty space, 181 uniform distribution, 10 variance, 23, 92, 149, 190 Wiener process, 44