1. Basic Probabilistic Tools for Stochastic Modeling
In this chapter, the readers will find a brief summary of the basic probability tools intensively used in this book. A more detailed version including proofs can be found in [JAN 06].
1.1. Probability space and random variables
Given a sample space Ω, the set of all possible events will be denoted by ℱ, which is assumed to have the structure of a σ-field (or σ-algebra). P will represent a probability measure on (Ω, ℱ).
DEFINITION 1.1.– A random variable (r.v.) with values in a topological space (E, ψ) is an application X from Ω to E such that:
[1.1] $X^{-1}(B) \in \mathcal{F}, \quad \forall B \in \psi,$
where X⁻¹(B) is called the inverse image of the set B, defined by:
[1.2] $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\}.$
Particular cases:
a) If (E, ψ) = (ℝ, β), where β is the Borel σ-field of ℝ, X is called a real random variable.
b) If (E, ψ) = (ℝ̄, β̄), where ℝ̄ is the extended real line defined by ℝ̄ = ℝ ∪ {−∞, +∞} and β̄ is the extended Borel σ-field of ℝ̄, that is, the minimal σ-field containing all the elements of β and the extended intervals:
[1.3] $[-\infty, a), \; [-\infty, a], \; (a, +\infty], \; [a, +\infty], \quad a \in \mathbb{R},$
X is called an extended real-valued random variable.
c) If E = ℝⁿ (n > 1) with the product σ-field β⁽ⁿ⁾ of β, X is called an n-dimensional real random variable.
d) If E = ℝ̄ⁿ (n > 1) with the product σ-field β̄⁽ⁿ⁾ of β̄, X is called an extended n-dimensional real random variable.
A random variable X is called discrete or continuous according to whether X takes at most a denumerable or a non-denumerable infinite set of values.
DEFINITION 1.2.– The distribution function of the r.v. X, represented by F_X, is the function from ℝ to [0,1] defined by:
[1.4] $F_X(x) = P(\{\omega : X(\omega) \le x\}).$
Briefly, we write:
[1.5] $F_X(x) = P(X \le x).$
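As a simple illustration (our own sketch, not part of the original text), the distribution function of a fair die can be computed directly from definition [1.4], summing the probabilities of the outcomes in the inverse image {ω : X(ω) ≤ x}:

```python
from fractions import Fraction

# Sample space of a fair die: Omega = {1, ..., 6}, each outcome with
# probability 1/6; here X is simply the identity map X(omega) = omega.
omega = range(1, 7)
p = Fraction(1, 6)

def F(x):
    """Distribution function F_X(x) = P({omega : X(omega) <= x}), as in [1.4]."""
    return sum(p for w in omega if w <= x)

# F is a right-continuous step function increasing from 0 to 1.
print(F(0), F(3), F(6))  # → 0 1/2 1
```

Exact rational arithmetic (`Fraction`) is used so that the step values of F are reproduced without rounding.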
This last definition can be extended to the multi-dimensional case with a r.v. X being an n-dimensional real vector: X = (X1,…, Xn), a measurable application from (Ω, ℱ, P) to (ℝⁿ, β⁽ⁿ⁾).
DEFINITION 1.3.– The distribution function of the r.v. X = (X1,…, Xn), represented by F_X, is the function from ℝⁿ to [0,1] defined by:
[1.6] $F_X(x_1, \ldots, x_n) = P(\{\omega : X_1(\omega) \le x_1, \ldots, X_n(\omega) \le x_n\}).$
Briefly, we write:
[1.7] $F_X(x_1, \ldots, x_n) = P(X_1 \le x_1, \ldots, X_n \le x_n).$
Each component Xi (i = 1,…, n) is itself a one-dimensional real r.v. whose d.f., called the marginal d.f., is given by:
[1.8] $F_{X_i}(x) = F_X(+\infty, \ldots, +\infty, x, +\infty, \ldots, +\infty),$
with x in the i-th position.
The concept of random variable is stable under many mathematical operations; in particular, any Borel function of a r.v. X is also a r.v.
Moreover, if X and Y are two r.v., so are:
[1.9] $X + Y, \quad X - Y, \quad X \cdot Y, \quad \frac{X}{Y},$
provided, in the last case, that Y does not vanish.
Concerning convergence properties, we must mention that if (Xn, n ≥ 1) is a convergent sequence of r.v. – that is, for all ω ∈ Ω, the sequence (Xn(ω)) converges to X(ω) – then the limit X is also a r.v. on Ω. This convergence, which may be called sure convergence, can be weakened to give the concept of almost sure (a.s.) convergence of the given sequence.
DEFINITION 1.4.– The sequence (Xn(ω)) converges a.s. to X(ω) if:
[1.10] $P\left(\left\{\omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\right\}\right) = 1.$
This last notion means that the set where the given sequence does not converge is a null set, that is, a set N belonging to ℱ such that:
[1.11] $P(N) = 0.$
In general, let us remark that, given a null set, it is not true that every subset of it belongs to ℱ; but of course, if such a subset does belong to ℱ, it is clearly a null set. To avoid unnecessary complications, we will assume from here onward that any considered probability space is complete, i.e. all the subsets of a null set also belong to ℱ and thus have probability zero.
1.2. Expectation and independence
Using the concept of integral, it is possible to define the expectation of a random variable X, represented by:
[1.12] $E(X) = \int_\Omega X \, dP,$
provided that this integral exists. The computation of the integral:
[1.13] $\int_\Omega X(\omega) \, dP(\omega)$
can be done using the induced measure μ on (ℝ, β), defined by μ(B) = P(X⁻¹(B)), B ∈ β (so that, by [1.4], μ((−∞, x]) = F_X(x)), and then using the distribution function F_X of X.
Indeed, we can write:
[1.14] $E(X) = \int_{\mathbb{R}} x \, d\mu(x),$
and if F_X is the d.f. of X, it can be shown that:
[1.15] $E(X) = \int_{\mathbb{R}} x \, dF_X(x).$
The last integral is a Lebesgue–Stieltjes integral.
Moreover, if F_X is absolutely continuous with density f_X, we obtain:
[1.16] $E(X) = \int_{\mathbb{R}} x f_X(x) \, dx.$
If g is a Borel function, then we also have (see, e.g., [CHU 00] and [LOÈ 63]):
[1.17] $E(g(X)) = \int_{\mathbb{R}} g(x) \, dF_X(x),$
and, with a density for X:
[1.18] $E(g(X)) = \int_{\mathbb{R}} g(x) f_X(x) \, dx.$
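Relation [1.18] can be illustrated numerically (a minimal sketch of our own; the helper `expectation` and the choice of an exponential density are illustrative assumptions, not part of the text). For X exponential with rate λ, f_X(x) = λe^(−λx) on [0, ∞), and E(g(X)) is approximated by a midpoint-rule integral:

```python
import math

def expectation(g, lam, upper=50.0, n=100000):
    """Midpoint-rule approximation of E(g(X)) = ∫ g(x) f_X(x) dx as in [1.18],
    for the exponential density f_X(x) = lam * exp(-lam * x) on [0, upper]."""
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h          # midpoint of the i-th subinterval
        total += g(x) * lam * math.exp(-lam * x) * h
    return total

lam = 2.0
print(expectation(lambda x: x, lam))      # ≈ E(X)   = 1/lam   = 0.5
print(expectation(lambda x: x * x, lam))  # ≈ E(X^2) = 2/lam^2 = 0.5
```

The truncation at `upper = 50` is harmless here because the exponential tail beyond that point is numerically negligible.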
It is clear that the expectation is a linear operator on integrable functions.
DEFINITION 1.5.– Let a be a real number and r a positive real number; then the expectation:
[1.19] $E\left(|X - a|^r\right)$
is called the absolute moment of X of order r, centered on a.
The moments are said to be centered moments of order r if a = E(X). In particular, for r = 2, we get the variance of X, represented by σ² (or var(X)):
[1.20] $\sigma^2 = \operatorname{var}(X) = E\left((X - E(X))^2\right).$
REMARK 1.1.– From the linearity of the expectation, it is easy to prove that:
[1.21] $\operatorname{var}(X) = E(X^2) - (E(X))^2,$
and so:
[1.22] $\operatorname{var}(X) \le E(X^2);$
and, more generally, since $E\left((X - a)^2\right) = \operatorname{var}(X) + (E(X) - a)^2$, it can be proved that the variance is the smallest moment of order 2, whatever the number a is.
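The minimality property of Remark 1.1 can be checked exactly on a small discrete r.v. (our own sketch; the particular values and probabilities below are arbitrary choices for illustration):

```python
from fractions import Fraction

# An arbitrary discrete r.v. taking values 0, 1, 4.
values = [0, 1, 4]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

def E(g):
    """Expectation of g(X) for the discrete r.v. above."""
    return sum(p * g(v) for v, p in zip(values, probs))

m = E(lambda v: v)                # E(X)
var = E(lambda v: (v - m) ** 2)   # var(X)

for a in [Fraction(-1), Fraction(0), m, Fraction(3)]:
    moment2 = E(lambda v: (v - a) ** 2)
    # E((X-a)^2) = var(X) + (E(X) - a)^2, hence >= var(X), with equality at a = E(X)
    assert moment2 == var + (m - a) ** 2
    assert moment2 >= var
```

Exact rational arithmetic makes the identity hold with equality rather than up to rounding error.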
The set of all real r.v. such that the moment of order r exists is represented by Lr.
The last fundamental concept that we will now introduce in this section is stochastic independence, or more simply independence.
DEFINITION 1.6.– The events A1,…, An (n > 1) are stochastically independent or independent iff:
[1.23] $P\left(\bigcap_{i \in I} A_i\right) = \prod_{i \in I} P(A_i), \quad \forall I \subset \{1, \ldots, n\}, \; I \ne \emptyset.$
For n = 2, relation [1.23] reduces to:
[1.24] $P(A_1 \cap A_2) = P(A_1) P(A_2).$
Let us remark that pairwise independence of the events A1,…, An (n > 2) does not necessarily imply their mutual independence, that is, the stochastic independence of these n events.
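The classical counterexample (our own coding of a standard fact, not from the text) uses two fair coin tosses with the events A = "first coin is heads", B = "second coin is heads", C = "both coins show the same face": each pair satisfies [1.24], but the triple violates [1.23].

```python
from fractions import Fraction
from itertools import product

# Two fair coins: 4 equally likely outcomes ('H','H'), ('H','T'), ...
omega = list(product("HT", repeat=2))
p = Fraction(1, 4)

def P(event):
    """Probability of an event given as a predicate on outcomes."""
    return sum(p for w in omega if event(w))

A = lambda w: w[0] == "H"     # first coin heads
B = lambda w: w[1] == "H"     # second coin heads
C = lambda w: w[0] == w[1]    # both coins equal

# Each pair is independent in the sense of [1.24] ...
assert P(lambda w: A(w) and B(w)) == P(A) * P(B)
assert P(lambda w: A(w) and C(w)) == P(A) * P(C)
assert P(lambda w: B(w) and C(w)) == P(B) * P(C)
# ... but the triple fails [1.23]: P(A∩B∩C) = 1/4, while the product is 1/8.
assert P(lambda w: A(w) and B(w) and C(w)) != P(A) * P(B) * P(C)
```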
From relation [1.23], we find that, for independent r.v. X1,…, Xn:
[1.25] $P(X_1 \le x_1, \ldots, X_n \le x_n) = \prod_{i=1}^n P(X_i \le x_i).$
If the functions F_X, F_{X_1},…, F_{X_n} are the distribution functions of the r.v. X = (X1,…, Xn), X1,…, Xn, we can write the preceding relation as follows:
[1.26] $F_X(x_1, \ldots, x_n) = \prod_{i=1}^n F_{X_i}(x_i).$
It can be shown that this last condition is also sufficient for the independence of the r.v. X1,…, Xn. If these d.f. have densities f_X, f_{X_1},…, f_{X_n}, relation [1.26] is equivalent to:
[1.27] $f_X(x_1, \ldots, x_n) = \prod_{i=1}^n f_{X_i}(x_i).$
If the n real r.v. X1, X2,…, Xn are integrable, a direct consequence of relation [1.26] is the following very important property for the expectation of a product of n independent r.v.:
[1.28] $E\left(\prod_{i=1}^n X_i\right) = \prod_{i=1}^n E(X_i).$
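Relation [1.28] can be checked by simulation (an illustrative Monte Carlo sketch of our own; the uniform distributions chosen below are arbitrary): for two independent uniform r.v., the sample mean of XY should be close to the product of the sample means.

```python
import random

random.seed(1)
n = 200000
xs = [random.uniform(0, 1) for _ in range(n)]  # X ~ uniform(0, 1), E(X) = 0.5
ys = [random.uniform(0, 2) for _ in range(n)]  # Y ~ uniform(0, 2), E(Y) = 1.0

mean_x = sum(xs) / n
mean_y = sum(ys) / n
mean_xy = sum(x * y for x, y in zip(xs, ys)) / n

# For independent samples, E(XY) ≈ E(X) * E(Y) = 0.5.
print(mean_x * mean_y, mean_xy)
```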
The notion of independence gives the possibility of proving the result called the strong law of large numbers, which states that if (Xn, n ≥ 1) is a sequence of integrable independent and identically distributed r.v., then:
[1.29] $\frac{1}{n} \sum_{i=1}^n X_i \xrightarrow{\text{a.s.}} E(X_1), \quad n \to \infty.$
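The strong law of large numbers [1.29] can be visualized with a quick simulation (our own illustration; the uniform(0,1) choice, for which E(X1) = 1/2, is an arbitrary example):

```python
import random

random.seed(0)
N = 100000
xs = [random.random() for _ in range(N)]  # i.i.d. uniform(0, 1) draws

# The running mean (1/n) * sum of the first n draws settles near E(X1) = 0.5.
for n in (100, 10000, 100000):
    print(n, sum(xs[:n]) / n)
```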
The next section will present the most useful distribution functions for stochastic modeling.
DEFINITION 1.7 (SKEWNESS AND KURTOSIS COEFFICIENTS).–
a) The skewness coefficient of Fisher is defined as follows:
$\gamma_1 = \frac{\mu_3}{\sigma^3}, \quad \mu_3 = E\left((X - E(X))^3\right).$
Since the exponent is odd, it follows that:
– γ1 > 0 gives a left dissymmetry, the maximum of the density function being situated to the left and the distribution having a heavy right tail; γ1 = 0 gives a distribution symmetric with respect to the mean;
– γ1 < 0 gives a right dissymmetry, the maximum of the density function being situated to the right and the distribution having a heavy left tail.
b) The kurtosis coefficient, also due to Fisher, is defined as follows:
$\gamma_2 = \frac{\mu_4}{\sigma^4}, \quad \mu_4 = E\left((X - E(X))^4\right).$
Its interpretation refers to the normal distribution, for which its value is 3. Some authors refer instead to the excess kurtosis, given by γ2 − 3, which is of course null in the normal case.
For γ2 > 3, distributions are called leptokurtic: they are more peaked around the mean than in the normal case and have heavy tails.
For γ2 < 3, distributions are called platykurtic: they are flatter around the mean than in the normal case and have lighter tails.
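Both coefficients can be computed exactly for a simple discrete r.v. (our own sketch; the Bernoulli(1/2) choice is an illustrative assumption): it is symmetric, so γ1 = 0, and it is platykurtic, with γ2 = 1 < 3.

```python
from fractions import Fraction
import math

# Bernoulli(1/2): values 0 and 1, each with probability 1/2.
values = [0, 1]
probs = [Fraction(1, 2), Fraction(1, 2)]

def E(g):
    """Expectation of g(X) for the discrete r.v. above."""
    return sum(p * g(v) for v, p in zip(values, probs))

m = E(lambda v: v)                 # E(X) = 1/2
mu2 = E(lambda v: (v - m) ** 2)    # central moments mu_r = E((X - m)^r)
mu3 = E(lambda v: (v - m) ** 3)
mu4 = E(lambda v: (v - m) ** 4)

sigma = math.sqrt(mu2)
gamma1 = mu3 / sigma ** 3          # skewness coefficient
gamma2 = mu4 / sigma ** 4          # kurtosis coefficient
print(gamma1, gamma2)  # → 0.0 1.0
```

The value γ2 = 1 < 3 confirms that the symmetric two-point distribution is platykurtic.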