
ELEMENTS OF PROBABILITY THEORY


• A collection of subsets of a set Ω is called a σ–algebra if it contains Ω and is closed under the operations of taking complements and countable unions of its elements.

• A sub-σ–algebra is a collection of subsets of a σ–algebra which satisfies the axioms of a σ–algebra.

• A measurable space is a pair (Ω,F) where Ω is a set and F is a σ–algebra of subsets of Ω.

• Let (Ω,F) and (E,G) be two measurable spaces. A function X : Ω → E such that the event

{ω ∈ Ω : X(ω) ∈ A} =: {X ∈ A}

belongs to F for arbitrary A ∈ G is called a measurable function or random variable.


• Let (Ω,F) be a measurable space. A function µ : F → [0, 1] is called a probability measure if µ(∅) = 0, µ(Ω) = 1 and µ(∪_{k=1}^∞ A_k) = ∑_{k=1}^∞ µ(A_k) for all sequences {A_k}_{k=1}^∞ of pairwise disjoint sets in F.

• The triplet (Ω,F, µ) is called a probability space.

• Let X be a random variable (measurable function) from (Ω,F, µ) to (E,G). If E is a metric space then we may define the expectation with respect to the measure µ by

E[X] = ∫_Ω X(ω) dµ(ω).

• More generally, let f : E → R be G–measurable. Then

E[f(X)] = ∫_Ω f(X(ω)) dµ(ω).


• Let U be a topological space. We will use the notation B(U) to denote the Borel σ–algebra of U : the smallest σ–algebra containing all open sets of U . Every random variable from a probability space (Ω,F , µ) to a measurable space (E,B(E)) induces a probability measure on E:

µ_X(B) = (µ ∘ X^{−1})(B) = µ({ω ∈ Ω : X(ω) ∈ B}), B ∈ B(E).

The measure µX is called the distribution (or sometimes the law) of X.

Example 1 Let I denote a subset of the positive integers. A vector ρ_0 = {ρ_{0,i}, i ∈ I} is a distribution on I if it has nonnegative entries and its total mass equals 1:

∑_{i∈I} ρ_{0,i} = 1.


• We can use the distribution of a random variable to compute expectations and probabilities:

E[f(X)] = ∫_E f(x) dµ_X(x)

and

P[X ∈ G] = ∫_G dµ_X(x), G ∈ B(E).

• When E = R^d and we can write dµ_X(x) = ρ(x) dx, we refer to ρ(x) as the probability density function (pdf), or density with respect to Lebesgue measure, of X.
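The identity E[f(X)] = ∫_E f(x) dµ_X(x) can be checked concretely on a finite probability space. The sketch below (the die-valued space, the map X and the function f are illustrative choices, not from the notes) computes the expectation once over the sample space Ω and once over the law µ_X:

```python
from collections import defaultdict

# A finite probability space: a fair die, with uniform measure mu.
omega = [1, 2, 3, 4, 5, 6]
mu = {w: 1 / 6 for w in omega}

def X(w):
    return w % 3        # a random variable X : Omega -> {0, 1, 2}

def f(x):
    return x ** 2       # a measurable function f : E -> R

# Expectation computed directly over the sample space Omega.
e_omega = sum(f(X(w)) * mu[w] for w in omega)

# The law mu_X on E: the pushforward of mu under X.
mu_X = defaultdict(float)
for w in omega:
    mu_X[X(w)] += mu[w]

# Expectation computed over the state space using the law.
e_law = sum(f(x) * p for x, p in mu_X.items())

assert abs(e_omega - e_law) < 1e-12
```

Both sums give 5/3 here: the pushforward µ_X puts mass 1/3 on each of 0, 1, 2.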

• When E = R^d then by L^p(Ω;R^d), or sometimes L^p(Ω; µ) or even simply L^p(µ), we mean the Banach space of measurable functions on Ω with norm

‖X‖_{L^p} = (E|X|^p)^{1/p}.


Example 2 i) Consider the random variable X : Ω → R with pdf

γ_{σ,m}(x) := (2πσ)^{−1/2} exp( −(x−m)²/(2σ) ).

Such an X is termed a Gaussian or normal random variable. The mean is

EX = ∫_R x γ_{σ,m}(x) dx = m

and the variance is

E(X−m)² = ∫_R (x−m)² γ_{σ,m}(x) dx = σ.

Since the mean and variance completely specify a Gaussian random variable on R, the Gaussian is commonly denoted by N(m,σ). The standard normal random variable is N(0,1).
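As a quick numerical sanity check of these formulas (a sketch; the values of m and σ are illustrative, and note that in this parametrization σ denotes the variance, not the standard deviation):

```python
import math

def gaussian_pdf(x, m, sigma):
    """Gaussian density as written above, where sigma is the VARIANCE."""
    return (2 * math.pi * sigma) ** -0.5 * math.exp(-(x - m) ** 2 / (2 * sigma))

def trapezoid(g, a, b, n=20_000):
    """Simple composite trapezoidal quadrature (any quadrature rule would do)."""
    h = (b - a) / n
    s = 0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n))
    return s * h

m, sigma = 1.5, 0.7   # illustrative mean and variance
lo, hi = m - 30 * math.sqrt(sigma), m + 30 * math.sqrt(sigma)

mass = trapezoid(lambda x: gaussian_pdf(x, m, sigma), lo, hi)
mean = trapezoid(lambda x: x * gaussian_pdf(x, m, sigma), lo, hi)
var  = trapezoid(lambda x: (x - m) ** 2 * gaussian_pdf(x, m, sigma), lo, hi)

assert abs(mass - 1.0) < 1e-4   # total mass 1
assert abs(mean - m) < 1e-4     # EX = m
assert abs(var - sigma) < 1e-4  # E(X - m)^2 = sigma
```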


ii) Let m ∈ R^d and Σ ∈ R^{d×d} be symmetric and positive definite. The random variable X : Ω → R^d with pdf

γ_{Σ,m}(x) := ((2π)^d det Σ)^{−1/2} exp( −(1/2)⟨Σ^{−1}(x−m), (x−m)⟩ )

is termed a multivariate Gaussian or normal random variable. The mean is

E(X) = m (1)

and the covariance matrix is

E((X−m) ⊗ (X−m)) = Σ. (2)

Since the mean and covariance matrix completely specify a Gaussian random variable on R^d, the Gaussian is commonly denoted by N(m,Σ).
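A standard way to realize a draw from N(m,Σ) is X = m + LZ, where Σ = LLᵀ is a Cholesky factorization and Z is a vector of independent standard normals. The sketch below (an illustrative 2×2 case; the values of m and Σ are made up, not from the notes) checks (1) and (2) empirically:

```python
import math
import random

random.seed(0)

# Illustrative mean vector and symmetric positive definite covariance.
m = [1.0, -2.0]
Sigma = [[2.0, 0.6],
         [0.6, 1.0]]

# Cholesky factor L of Sigma (Sigma = L L^T), written out for the 2x2 case.
l11 = math.sqrt(Sigma[0][0])
l21 = Sigma[1][0] / l11
l22 = math.sqrt(Sigma[1][1] - l21 ** 2)

def sample():
    """Draw X ~ N(m, Sigma) as X = m + L Z with Z standard normal."""
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    return (m[0] + l11 * z1, m[1] + l21 * z1 + l22 * z2)

n = 200_000
xs = [sample() for _ in range(n)]
mean = [sum(x[i] for x in xs) / n for i in range(2)]
cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in xs) / n
        for j in range(2)] for i in range(2)]

# Empirical mean and covariance approach m and Sigma (Monte Carlo error ~ n^{-1/2}).
assert all(abs(mean[i] - m[i]) < 0.02 for i in range(2))
assert all(abs(cov[i][j] - Sigma[i][j]) < 0.03 for i in range(2) for j in range(2))
```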


Example 3 An exponential random variable T : Ω → R^+ with rate λ > 0 satisfies

P(T > t) = e^{−λt}, ∀t > 0.

We write T ∼ exp(λ). The related pdf is

f_T(t) = { λe^{−λt}, t > 0; 0, t < 0. } (3)

Notice that

ET = ∫_{−∞}^{∞} t f_T(t) dt = (1/λ) ∫_0^∞ (λt) e^{−λt} d(λt) = 1/λ.

If the times τ_n = t_{n+1} − t_n are i.i.d. random variables with τ_0 ∼ exp(λ) then, for t_0 = 0,

t_n = ∑_{k=0}^{n−1} τ_k


and it is possible to show that

P(0 ≤ t_k ≤ t < t_{k+1}) = e^{−λt} (λt)^k / k!. (4)


• Assume that E|X| < ∞ and let G be a sub–σ–algebra of F. The conditional expectation of X with respect to G is defined to be the function E[X|G] : Ω → E which is G–measurable and satisfies

∫_G E[X|G] dµ = ∫_G X dµ ∀G ∈ G.

• We can define E[f(X)|G] and the conditional probability P[X ∈ F |G] = E[IF (X)|G], where IF is the indicator function of F , in a similar manner.
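On a finite probability space, with G generated by a partition of Ω, the conditional expectation is just the block-wise average of X, and the defining property above can be verified directly. A sketch with made-up values:

```python
# Finite probability space with uniform measure (illustrative values).
omega = list(range(6))
mu = {w: 1 / 6 for w in omega}
X = {0: 1.0, 1: 3.0, 2: 2.0, 3: 6.0, 4: 0.0, 5: 3.0}

# The sub-sigma-algebra G is generated by this partition of Omega.
partition = [{0, 1}, {2, 3}, {4, 5}]

def cond_exp(X, mu, partition):
    """E[X|G]: constant on each block, equal to the mu-weighted block average."""
    ce = {}
    for block in partition:
        mass = sum(mu[w] for w in block)
        avg = sum(X[w] * mu[w] for w in block) / mass
        for w in block:
            ce[w] = avg
    return ce

ce = cond_exp(X, mu, partition)

# Defining property: the integrals of E[X|G] and X agree on every G in G,
# in particular on each generating block.
for block in partition:
    lhs = sum(ce[w] * mu[w] for w in block)
    rhs = sum(X[w] * mu[w] for w in block)
    assert abs(lhs - rhs) < 1e-12
```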

ELEMENTS OF THE THEORY OF STOCHASTIC PROCESSES

Definition of a Stochastic Process

• Let T be an ordered set. A stochastic process is a collection of random variables X = {Xt; t ∈ T} where, for each fixed t ∈ T , Xt is a random variable from (Ω,F) to (E,G).

• The measurable space (Ω,F) is called the sample space. The space (E,G) is called the state space.

• In this course we will take the set T to be [0, +∞).

• The state space E will usually be R^d equipped with the σ–algebra of Borel sets.

• A stochastic process X may be viewed as a function of both t ∈ T and ω ∈ Ω. We will sometimes write X(t), X(t, ω) or X_t(ω) instead of X_t. For a fixed sample point ω ∈ Ω, the function X_t(ω) : T → E is called a sample path (realization, trajectory) of the process X.


• The finite dimensional distributions (fdd) of a stochastic process are the E^k–valued random variables (X(t_1), X(t_2), . . . , X(t_k)) for arbitrary positive integer k and arbitrary times t_i ∈ T, i ∈ {1, . . . , k}.

• We say that two processes X_t and Y_t are equivalent if they have the same finite dimensional distributions.

• From experiments or numerical simulations we can only obtain information about the fdd of a process.

Stationary Processes

• A process is called (strictly) stationary if all fdd are invariant under time translations: for any integer k and times t_i ∈ T, the distribution of (X(t_1), X(t_2), . . . , X(t_k)) is equal to that of (X(s + t_1), X(s + t_2), . . . , X(s + t_k)) for any s such that s + t_i ∈ T for all i ∈ {1, . . . , k}.

• Let X_t be a stationary stochastic process with finite second moment (i.e. X_t ∈ L²). Stationarity implies that EX_t = µ and E((X_t − µ)(X_s − µ)) = C(t − s). The converse is not true.

• A stochastic process X_t ∈ L² is called second-order stationary (or stationary in the wide sense) if the first moment EX_t is a constant and the covariance depends only on the difference t − s:

EX_t = µ, E((X_t − µ)(X_s − µ)) = C(t − s).


• The function C(t) is called the correlation (or covariance) function of Xt.

• Let X_t ∈ L² be a mean zero second-order stationary process on R which is mean square continuous, i.e.

lim_{t→s} E|X_t − X_s|² = 0.

• Then the correlation function admits the representation

C(t) = ∫_{−∞}^{∞} e^{itx} f(x) dx, t ∈ R.

• The function f(x) is called the spectral density of the process X_t.

• In many cases, the experimentally measured quantity is the spectral density (or power spectrum) of the stochastic process.


• Given the correlation function of X_t, and assuming that C(t) ∈ L¹(R), we can calculate the spectral density through its Fourier transform:

f(x) = (1/2π) ∫_{−∞}^{∞} e^{−itx} C(t) dt.

• The correlation function of a second-order stationary process enables us to associate a time scale to X_t, the correlation time τ_cor:

τ_cor = (1/C(0)) ∫_0^∞ C(τ) dτ = ∫_0^∞ E(X_τ X_0)/E(X_0²) dτ.

• The slower the decay of the correlation function, the larger the correlation time is. We have to assume sufficiently fast decay of correlations so that the correlation time is finite.


Example 4 Consider a second-order stationary process with correlation function

C(t) = C(0) e^{−γ|t|}.

The spectral density of this process is

f(x) = (1/2π) C(0) ∫_{−∞}^{∞} e^{−itx} e^{−γ|t|} dt = (C(0)/π) · γ/(γ² + x²).

The correlation time is

τ_cor = ∫_0^∞ e^{−γt} dt = γ^{−1}.
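The Fourier-transform pair in Example 4 can be checked numerically: since C is even, the transform reduces to a cosine integral on [0, ∞). A sketch with illustrative values of γ and C(0):

```python
import math

gamma, C0 = 1.3, 2.0     # illustrative decay rate and C(0)

def C(t):
    """Correlation function of Example 4."""
    return C0 * math.exp(-gamma * abs(t))

def spectral_density(x, T=60.0, n=60_000):
    """Trapezoidal approximation of f(x) = (1/2pi) int e^{-itx} C(t) dt.
    Since C is even, this equals (1/pi) int_0^T C(t) cos(tx) dt for large T."""
    h = T / n
    s = 0.5 * (C(0) + C(T) * math.cos(T * x))
    s += sum(C(i * h) * math.cos(i * h * x) for i in range(1, n))
    return s * h / math.pi

# Compare against the closed form (C(0)/pi) * gamma / (gamma^2 + x^2).
for x in (0.0, 0.5, 2.0):
    exact = C0 * gamma / (math.pi * (gamma ** 2 + x ** 2))
    assert abs(spectral_density(x) - exact) < 1e-4
```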

Gaussian Processes

• The most important class of stochastic processes is that of Gaussian processes:

Definition 5 A Gaussian process is one for which E = Rd and all the finite dimensional distributions are Gaussian.

• A Gaussian process x(t) is characterized by its mean

m(t) := Ex(t)

and the covariance function

C(t, s) = E((x(t) − m(t)) ⊗ (x(s) − m(s))).

• Thus, the first two moments of a Gaussian process are sufficient for a complete characterization of the process.
