ELEMENTS OF PROBABILITY THEORYpavl/Lec2_prob.pdf · Elements of Probability Theory † A collection...

$: ELEMENTS OF PROBABILITY THEORYpavl/Lec2_prob.pdf · Elements of Probability Theory † A collection of subsets of a set › is called a ¾{algebra if it contains › and is closed$
ELEMENTS OF PROBABILITY THEORY

Elements of Probability Theory

• A collection of subsets of a set Ω is called a σ–algebra if itcontains Ω and is closed under the operations of takingcomplements and countable unions of its elements.

• A sub-σ–algebra is a collection of subsets of a σ–algebra whichsatisfies the axioms of a σ–algebra.

• A measurable space is a pair (Ω,F) where Ω is a set and F is aσ–algebra of subsets of Ω.

• Let (Ω,F) and (E,G) be two measurable spaces. A functionX : Ω 7→ E such that the event

ω ∈ Ω : X(ω) ∈ A =: X ∈ A

belongs to F for arbitrary A ∈ G is called a measurablefunction or random variable.


• Let (Ω,F) be a measurable space. A function µ : F 7→ [0, 1] iscalled a probability measure if µ(∅) = 1, µ(Ω) = 1 andµ(∪∞k=1Ak) =

∑∞k=1 µ(Ak) for all sequences of pairwise disjoint

sets Ak∞k=1 ∈ F .

• The triplet (Ω,F , µ) is called a probability space.

• Let X be a random variable (measurable function) from(Ω,F , µ) to (E,G). If E is a metric space then we may defineexpectation with respect to the measure µ by

E[X] =∫

Ω

X(ω) dµ(ω).

• More generally, let f : E 7→ R be G–measurable. Then,

E[f(X)] =∫

Ω

f(X(ω)) dµ(ω).


• Let U be a topological space. We will use the notation B(U) todenote the Borel σ–algebra of U : the smallest σ–algebracontaining all open sets of U . Every random variable from aprobability space (Ω,F , µ) to a measurable space (E,B(E))induces a probability measure on E:

µX(B) = PX−1(B) = µ(ω ∈ Ω; X(ω) ∈ B), B ∈ B(E).

The measure µX is called the distribution (or sometimes thelaw) of X.

Example 1 Let I denote a subset of the positive integers. Avector ρ0 = ρ0,i, i ∈ I is a distribution on I if it has nonnegativeentries and its total mass equals 1:

∑i∈I ρ0,i = 1.


• We can use the distribution of a random variable to computeexpectations and probabilities:

E[f(X)] =∫

S

f(x) dµX(x)

and

P[X ∈ G] =∫

G

dµX(x), G ∈ B(E).

• When E = Rd and we can write dµX(x) = ρ(x) dx, then werefer to ρ(x) as the probability density function (pdf), ordensity with respect to Lebesque measure for X.

• When E = Rd then by Lp(Ω;Rd), or sometimes Lp(Ω; µ) oreven simply Lp(µ), we mean the Banach space of measurablefunctions on Ω with norm

‖X‖Lp =(E|X|p

)1/p

.


Example 2 i) Consider the random variable X : Ω 7→ R with pdf

γσ,m(x) := (2πσ)−12 exp

(− (x−m)2

2σ

).

Such an X is termed a Gaussian or normal random variable.The mean is

EX =∫

Rxγσ,m(x) dx = m

and the variance is

E(X −m)2 =∫

R(x−m)2γσ,m(x) dx = σ.

Since the mean and variance specify completely a Gaussianrandom variable on R, the Gaussian is commonly denoted byN (m,σ). The standard normal random variable is N (0, 1).


ii) Let m ∈ Rd and Σ ∈ Rd×d be symmetric and positive definite.The random variable X : Ω 7→ Rd with pdf

γΣ,m(x) :=((2π)ddetΣ

)− 12 exp

(−1

2〈Σ−1(x−m), (x−m)〉

)

is termed a multivariate Gaussian or normal randomvariable. The mean is

E(X) = m (1)

and the covariance matrix is

E((X −m)⊗ (X −m)

)= Σ. (2)

Since the mean and covariance matrix completely specify aGaussian random variable on Rd, the Gaussian is commonlydenoted by N (m,Σ).


Example 3 An exponential random variable T : Ω → R+ with rateλ > 0 satisfies

P(T > t) = e−λt, ∀t > 0.

We write T ∼ exp(λ). The related pdf is

fT (t) = λe−λt, t > 0,

0, t < 0.(3)

Notice that

ET =∫ ∞

−∞tfT (t)dt =

1λ

∫ ∞

0

(λt)e−λtd(λt) =1λ

.

If the times τn = tn+1 − tn are i.i.d random variables withτ0 ∼ exp(λ) then, for t0 = 0,

tn =n−1∑

k=0

τk


and it is possible to show that

P(0 6 tk 6 t < tk+1) =e−λt(λt)k

k!. (4)


• Assume that E|X| < ∞ and let G be a sub–σ–algebra of F .The conditional expectation of X with respect to G isdefined to be the function E[X|G] : Ω 7→ E which isG–measurable and satisfies

∫

G

E[X|G] dµ =∫

G

X dµ ∀G ∈ G.

• We can define E[f(X)|G] and the conditional probabilityP[X ∈ F |G] = E[IF (X)|G], where IF is the indicator function ofF , in a similar manner.

ELEMENTS OF THE THEORY OF STOCHASTICPROCESSES

Definition of a Stochastic Process

• Let T be an ordered set. A stochastic process is a collectionof random variables X = Xt; t ∈ T where, for each fixedt ∈ T , Xt is a random variable from (Ω,F) to (E,G).

• The measurable space Ω,F is called the sample space. Thespace (E,G) is called the state space .

• In this course we will take the set T to be [0, +∞).

• The state space E will usually be Rd equipped with theσ–algebra of Borel sets.

• A stochastic process X may be viewed as a function of botht ∈ T and ω ∈ Ω. We will sometimes write X(t), X(t, ω) orXt(ω) instead of Xt. For a fixed sample point ω ∈ Ω, thefunction Xt(ω) : T 7→ E is called a sample path (realization,trajectory) of the process X.

Definition of a Stochastic Process

• The finite dimensional distributions (fdd) of a stochasticprocess are the Ek–valued random variables(X(t1), X(t2), . . . , X(tk)) for arbitrary positive integer k andarbitrary times ti ∈ T, i ∈ 1, . . . , k.

• We will say that two processes Xt and Yt are equivalent if theyhave same finite dimensional distributions.

• From experiments or numerical simulations we can only obtaininformation about the (fdd) of a process.

Stationary Processes

• A process is called (strictly) stationary if all fdd areinvariant under are time translation: for any integer k andtimes ti ∈ T , the distribution of (X(t1), X(t2), . . . , X(tk)) isequal to that of (X(s + t1), X(s + t2), . . . , X(s + tk)) for any s

such that s + ti ∈ T for all i ∈ 1, . . . , k.• Let Xt be a stationary stochastic process with finite second

moment (i.e. Xt ∈ L2). Stationarity implies that EXt = µ,E((Xt − µ)(Xs − µ)) = C(t− s). The converse is not true.

• A stochastic process Xt ∈ L2 is called second-orderstationary (or stationary in the wide sense) if the firstmoment EXt is a constant and the second moment dependsonly on the difference t− s:

EXt = µ, E((Xt − µ)(Xs − µ)) = C(t− s).


• The function C(t) is called the correlation (or covariance)function of Xt.

• Let Xt ∈ L2 be a mean zero second order stationary process onR which is mean square continuous, i.e.

limt→s

E|Xt −Xs|2 = 0.

• Then the correlation function admits the representation

C(t) =∫ ∞

−∞eitxf(x) dx, t ∈ R.

• the function f(x) is called the spectral density of the processXt.

• In many cases, the experimentally measured quantity is thespectral density (or power spectrum) of the stochastic process.


• Given the correlation function of Xt, and assuming thatC(t) ∈ L1(R), we can calculate the spectral density through itsFourier transform:

f(x) =12π

∫ ∞

−∞e−itxC(t) dt.

• The correlation function of a second order stationary processenables us to associate a time scale to Xt, the correlationtime τcor:

τcor =1

C(0)

∫ ∞

0

C(τ) dτ =∫ ∞

0

E(XτX0)/E(X20 ) dτ.

• The slower the decay of the correlation function, the larger thecorrelation time is. We have to assume sufficiently fast decay ofcorrelations so that the correlation time is finite.


Example 4 Consider a second stationary process with correlationfunction

C(t) = C(0)e−γ|t|.

The spectral density of this process is

f(x) =12π

C(0)∫ ∞

−∞e−itxe−γ|t| dt

= C(0)1π

γ

γ2 + x2.

The correlation time is

τcor =∫ ∞

0

e−γt dt = γ−1.

Gaussian Processes

• The most important class of stochastic processes is that ofGaussian processes:

Definition 5 A Gaussian process is one for which E = Rd andall the finite dimensional distributions are Gaussian.

• A Gaussian process x(t) is characterized by its mean

m(t) := Ex(t)

and the covariance function

C(t, s) = E((

x(t)−m(t))⊗ (

x(s)−m(s)))

.

• Thus, the first two moments of a Gaussian process aresufficient for a complete characterization of the process.

• A corollary of this is that a second order stationary Gaussianprocess is also a stationary process.

Brownian Motion

• The most important continuous time stochastic process isBrownian motion. Brownian motion is a mean zero,continuous (i.e. it has continuous sample paths: for a.e ω ∈ Ωthe function Xt is a continuous function of time) process withindependent Gaussian increments.

• A process Xt has independent increments if for everysequence t0 < t1...tn the random variables

Xt1 −Xt0 , Xt2 −Xt1 , . . . , Xtn −Xtn−1

are independent.

• If, furthermore, for any t1, t2 and Borel set B ⊂ R

P(Xt2+s −Xt1+s ∈ B)

is independent of s, then the process Xt has stationaryindependent increments.

Brownian Motion

Definition 6 i) A one dimensional standard Brownian motionW (t) : R+ → R is a real valued stochastic process with thefollowing properties:

(a) W (0) = 0;

(b) W (t) is continuous;

(c) W (t) has independent increments.

(d) For every t > s > 0 W (t)−W (s) has a Gaussiandistribution with mean 0 and variance t− s. That is, thedensity of the random variable W (t)−W (s) is

g(x; t, s) =(2π(t− s)

)− 12

exp(− x2

2(t− s)

); (5)

Brownian Motion

ii) A d–dimensional standard Brownian motion W (t) : R+ → Rd

is a collection of d independent one dimensional Brownianmotions:

W (t) = (W1(t), . . . ,Wd(t)),

where Wi(t), i = 1, . . . , d are independent one dimensionalBrownian motions. The density of the Gaussian random vectorW (t)−W (s) is thus

g(x; t, s) =(2π(t− s)

)−d/2

exp(− ‖x‖2

2(t− s)

).

Brownian motion is sometimes referred to as the Wiener process .

Brownian Motion

0 1 2 3 4 5−4

−3

−2

−1

0

1

2

3

t

W(t)

Figure 1: Brownian sample paths

Brownian Motion

It is possible to prove rigorously the existence of the Wienerprocess (Brownian motion):

Theorem 1 (Wiener) There exists an almost-surely continuousprocess Wt with independent increments such and W0 = 0, suchthat for each t > the random variable Wt is N (0, t). Furthermore,Wt is almost surely locally Holder continuous with exponent α forany α ∈ (0, 1

2 ).

Notice that Brownian paths are not differentiable.

Brownian Motion

Brownian motion is a Gaussian process. For the d–dimensionalBrownian motion, and for I the d× d dimensional identity, we have(see (1) and (2))

EW (t) = 0 ∀t > 0

andE

((W (t)−W (s))⊗ (W (t)−W (s))

)= (t− s)I. (6)

Moreover,

E(W (t)⊗W (s)

)= min(t, s)I. (7)

Brownian Motion

• From the formula for the Gaussian density g(x, t− s), eqn. (5),we immediately conclude that W (t)−W (s) andW (t + u)−W (s + u) have the same pdf. Consequently,Brownian motion has stationary increments.

• Notice, however, that Brownian motion itself is not astationary process.

• Since W (t) = W (t)−W (0), the pdf of W (t) is

g(x, t) =1√2πt

e−x2/2t.

• We can easily calculate all moments of the Brownian motion:

E(xn(t)) =1√2πt

∫ +∞

−∞xne−x2/2t dx

= 1.3 . . . (n− 1)tn/2, n even,

0, n odd.

The Poisson Process

• Another fundamental continuous time process is the Poissonprocess :

Definition 7 The Poisson process with intensity λ, denoted byN(t), is an integer-valued, continuous time, stochastic processwith independent increments satisfying

P[(N(t)−N(s)) = k] =e−λ(t−s)

(λ(t− s)

)k

k!, t > s > 0, k ∈ N.

• Notice the connection to exponential random variables via (4).

• Both Brownian motion and the Poisson process arehomogeneous (or time-homogeneous): the incrementsbetween successive times s and t depend only on t− s.

The Path Space

• Let (Ω,F , µ) be a probability space, (E, ρ) a metric space andlet T = [0,∞). Let Xt be a stochastic process from (Ω,F , µ)to (E, ρ) with continuous sample paths.

• The above means that for every ω ∈ Ω we have thatXt ∈ CE := C([0,∞); E).

• The space of continuous functions CE is called the path spaceof the stochastic process.

• We can put a metric on E as follows:

ρE(X1, X2) :=∞∑

n=1

12n

max06t6n

min(ρ(X1

t , X2t ), 1

).

• We can then define the Borel sets on CE , using the topologyinduced by this metric, and Xt can be thought of as arandom variable on (Ω,F , µ) with state space (CE ,B(CE)).

The Path Space

• The probability measure PX−1t on (CE ,B(CE)) is called the

law of Xt.• The law of a stochastic process is a probability measure on its

path space.

Example 8 The space of continuous functions CE is the pathspace of Brownian motion (the Wiener process). The law ofBrownian motion, that is the measure that it induces onC([0,∞),Rd), is known as the Wiener measure.

ELEMENTS OF PROBABILITY THEORYpavl/Lec2_prob.pdf · Elements of Probability Theory † A collection...

Documents

Transcript of ELEMENTS OF PROBABILITY THEORYpavl/Lec2_prob.pdf · Elements of Probability Theory † A collection...