• date post

18-Jul-2016

### Transcript of Stochastic Processes Notes

• ECM3724 Stochastic Processes 1

ECM3724 Stochastic Processes

1 Overview of Probability

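We call $(X, \Omega, P)$ a probability space. Here $\Omega$ is the sample space, $X : \Omega \to \mathbb{R}$ is a random variable (RV) and $P$ is a probability (measure), that is, a function on subsets of $\Omega$. Elements $\omega \in \Omega$ are called outcomes. Subsets of $\Omega$ are called events.

Given $A \subseteq \mathbb{R}$, $\{X \in A\} = \{\omega \in \Omega : X(\omega) \in A\}$, and $P(X \in A)$ is the probability of this event. Given $x \in \mathbb{R}$, $\{X = x\} = \{\omega \in \Omega : X(\omega) = x\}$.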

Example

Suppose we toss a coin twice, then $\Omega = \{HH, HT, TH, TT\}$, $|\Omega| = 4$, and $X$ is the number of heads. Then $\Omega_X$ is the set of values that $X$ takes, that is $\Omega_X = \{0, 1, 2\}$. Now $\{X = 1\} = \{\omega \in \Omega : X(\omega) = 1\} = \{HT, TH\}$, so $P(X = 1) = P(HT) + P(TH)$.
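The coin example can be checked by enumerating the sample space directly. The sketch below assumes a fair coin (each of the four outcomes has probability 1/4), which the example itself does not state; the names `omega`, `prob` and `X` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses; assumption: the coin is fair, so every
# outcome has probability 1/4.
omega = ["".join(t) for t in product("HT", repeat=2)]
prob = {w: Fraction(1, len(omega)) for w in omega}


def X(w):
    """Random variable: the number of heads in the outcome w."""
    return w.count("H")


# The event {X = 1} is the set of outcomes that X maps to 1.
event = [w for w in omega if X(w) == 1]
p_event = sum(prob[w] for w in event)

print(event, p_event)
```

Working with `Fraction` keeps the probabilities exact, so the result can be compared with the hand calculation without rounding.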

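$\Omega$ (or $\Omega_X$) could be discrete, for example $\Omega_X = \{x_1, x_2, \ldots\}$. We require $\sum_{x \in \Omega_X} P(X = x) = 1$. $\Omega$ (or $\Omega_X$) could be continuous. If $\Omega_X = [0, 1]$, $P(X \in A) = \int_A f_X(x)\,dx$. Here $f_X(x)$ is a probability density function (pdf), with

$$f_X(x) \geq 0, \qquad \int_{\Omega_X} f_X(x)\,dx = 1.$$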

Expectation

If $\Omega_X = \{x_1, x_2, \ldots\}$ and $g : \Omega_X \to \mathbb{R}$ then $E(g(X)) = \sum_i g(x_i) P(X = x_i)$. If $g(X) = X$, $E(X) := \mu_X$, the mean of $X$. If $g(X) = (X - \mu_X)^2$ then $E(g(X)) := \mathrm{Var}(X) = \sigma_X^2 \geq 0$, the variance of $X$. Also $\mathrm{Var}(X) = E(X^2) - \mu_X^2$. In the continuous case, $E(g(X)) = \int_{\Omega_X} g(x) f_X(x)\,dx$.
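As a small illustration of the discrete formulas, the following sketch computes the mean and variance of the number of heads in two coin tosses, and checks that the two expressions for the variance agree. The pmf is an assumption (a fair coin), and `expectation` is a helper introduced only for this example.

```python
from fractions import Fraction

# pmf of X = number of heads in two tosses of a fair coin (assumed fair).
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def expectation(g, pmf):
    """E(g(X)) = sum over i of g(x_i) * P(X = x_i) for a discrete RV."""
    return sum(g(x) * p for x, p in pmf.items())


mean = expectation(lambda x: x, pmf)
# Variance from the definition E((X - mean)^2) ...
var_def = expectation(lambda x: (x - mean) ** 2, pmf)
# ... and from the shortcut E(X^2) - mean^2.
var_alt = expectation(lambda x: x ** 2, pmf) - mean ** 2

print(mean, var_def, var_alt)
```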

Common Distributions

If $X \sim \mathrm{Unif}[0, 1]$ then we say that $X$ is distributed as Uniform$[0, 1]$. If $A = [a, b] \subseteq \Omega_X$ then $P(X \in A) = \int_a^b 1\,dx = b - a$. If $A = \mathbb{Q} \cap [0, 1]$, $P(X \in A) = 0$. There exist subsets of $[0, 1]$ for which $P(X \in A)$ is undefined.

If $X \sim \mathrm{Ber}(p)$ then we say that $X$ is distributed as a Bernoulli trial with success probability $p$. Here $\Omega_X = \{0, 1\}$, $P(X = 0) = 1 - p$, $P(X = 1) = p$, $p \in (0, 1)$. We can extend this over multiple independent, identically distributed (IID) trials (recall that events $A$ and $B$ are independent if $P(A|B) = P(A)$, $P(B|A) = P(B)$ or, equivalently, $P(A \cap B) = P(A)P(B)$). If $X$ now counts the successes in $n$ such trials, then $X \sim \mathrm{Bin}(n, p)$, that is $X$ is distributed binomially with $n$ trials and success probability $p$. In this case $\Omega_X = \{0, 1, 2, \ldots, n\}$. For $0 \leq r \leq n$,

$$P(X = r) = \binom{n}{r} p^r (1 - p)^{n - r}.$$

The Binomial distribution has $E(X) = np$, $\mathrm{Var}(X) = np(1 - p)$.
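The Binomial formulas can be sanity-checked numerically. This is a minimal sketch using exact rational arithmetic; the values n = 10 and p = 3/10 are arbitrary choices for illustration, and `binomial_pmf` is a helper defined here rather than a library function.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """List of P(X = r) for r = 0..n when X ~ Bin(n, p)."""
    return [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]


# Arbitrary illustrative parameters.
n, p = 10, Fraction(3, 10)
pmf = binomial_pmf(n, p)

total = sum(pmf)                                   # should equal 1
mean = sum(r * q for r, q in enumerate(pmf))       # should equal n p
var = sum(r**2 * q for r, q in enumerate(pmf)) - mean**2  # n p (1 - p)

print(total, mean, var)
```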

Let $\Omega_X = \{0, 1, 2, \ldots\}$. We say that $X \sim \mathrm{Poisson}(\lambda)$ if $P(X = r) = \frac{\lambda^r e^{-\lambda}}{r!}$ for $r \in \mathbb{N} \cup \{0\}$. Recall that $\exp(x) = \sum_{r \geq 0} \frac{x^r}{r!}$, which means that $\sum_r P(X = r) = 1$. For the Poisson distribution the mean and variance are both $\lambda$. Other discrete distributions include the geometric and hypergeometric distributions.

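We say $X \sim \mathrm{Exp}(\lambda)$ if the pdf of $X$ is given by $f_X(x) = \lambda e^{-\lambda x}$ for $x \geq 0$. Now

$$P(X > x) = \int_x^\infty \lambda e^{-\lambda u}\,du = \left[-e^{-\lambda u}\right]_x^\infty = e^{-\lambda x}.$$

The exponential distribution is memoryless, that is the time to wait does not depend on the time already waited. More specifically, $P(X > t + s \mid X > s) = P(X > t) = e^{-\lambda t}$.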
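The memoryless property follows in one line from the tail probability $P(X > x) = e^{-\lambda x}$ and the definition of conditional probability. Since the event $\{X > t + s\}$ is contained in $\{X > s\}$ for $t \geq 0$,

$$P(X > t + s \mid X > s) = \frac{P(X > t + s,\ X > s)}{P(X > s)} = \frac{P(X > t + s)}{P(X > s)} = \frac{e^{-\lambda (t + s)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t).$$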

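Gaussian/Normal distribution

We say $X \sim N(\mu, \sigma^2)$ if the pdf is

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \quad \text{for } x \in \mathbb{R}.$$

Then $E(X) = \mu$, $\mathrm{Var}(X) = \sigma^2$.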

The Central Limit Theorem (CLT)

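Suppose $(X_i)_{i=1}^n$ are IID RVs with $E(X_i) = \mu$, $\mathrm{Var}(X_i) = \sigma^2$ and let $S_n = \sum_{i=1}^n X_i$. Then if

$$Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}},$$

we have $Z_n \xrightarrow{d} N(0, 1)$, that is the distribution of $Z_n$ converges to $N(0, 1)$. If $A = [a, b]$ then

$$P(Z_n \in A) \xrightarrow{n \to \infty} \int_a^b \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\,du.$$

This can be applied, for example, if we take $S_n$ to be the number of heads in $n$ tosses of a fair coin. Then $E(S_n) = n/2$ and $\mathrm{Var}(S_n) = \sum_{i=1}^n \mathrm{Var}(X_i) = n/4$. Hence, as $n \to \infty$,

$$P\left(\frac{S_n - \frac{n}{2}}{\frac{1}{2}\sqrt{n}} \in [a, b]\right) \to \int_a^b \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\,du.$$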
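To see the coin-tossing approximation at work, one can compare the exact Binomial probability with the Normal integral for a few values of n. The sketch below assumes a fair coin and takes [a, b] = [-1, 1] as an arbitrary interval; `std_normal_cdf` and `exact_prob` are helper names introduced here.

```python
from math import comb, erf, sqrt


def std_normal_cdf(x):
    """CDF of N(0, 1), written in terms of the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def exact_prob(n, a, b):
    """P(a <= (S_n - n/2) / (sqrt(n)/2) <= b) for S_n ~ Bin(n, 1/2)."""
    mean, sd = n / 2, sqrt(n) / 2
    # Count the equally likely toss sequences whose head count k is in range.
    count = sum(comb(n, k) for k in range(n + 1) if a <= (k - mean) / sd <= b)
    return count / 2**n


a, b = -1.0, 1.0  # arbitrary illustrative interval
limit = std_normal_cdf(b) - std_normal_cdf(a)

for n in (10, 100, 1000):
    print(n, exact_prob(n, a, b), limit)
```

Because $S_n$ only takes integer values, the exact probability jumps as n changes and need not approach the limit monotonically; the CLT only guarantees that the gap vanishes as n grows.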

Moment Generating Functions (MGF)

For a RV $X$, the MGF of $X$ is the function $M_X(t) = E(e^{tX})$, a function of $t \in \mathbb{R}$. In the discrete case, $E(e^{tX}) = \sum_i e^{t x_i} P(X = x_i)$, where $X \in \Omega_X = \{x_1, x_2, \ldots\}$. In the continuous case, $E(e^{tX}) = \int_{\Omega_X} e^{tx} f_X(x)\,dx$.

Properties

• $M_X(0) = 1$.

• $\left.\frac{d^r M_X}{dt^r}\right|_{t=0} = E(X^r)$.

• If $Z = X + Y$ and $X, Y$ are independent then $M_Z(t) = M_X(t) M_Y(t)$.

• If $X, Y$ are RVs and $M_X(t) = M_Y(t)$ then $X$ and $Y$ have the same probability distribution, provided $M_X(t)$ is continuous in a neighbourhood of $t = 0$.

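Exercise

Compute $M_X(t)$ for the Bernoulli distribution. Compute $M_Y(t)$ for $Y = X_1 + \ldots + X_n$, where the $X_i$ are IID Bernoulli RVs. What is the distribution of $Y$? What happens as $n \to \infty$ (and $p \to 0$) with $\lambda = np$ fixed?

We have $M_X(t) = p(e^t - 1) + 1$. Hence $M_Y(t) = (p(e^t - 1) + 1)^n$ by the properties of the MGF. If $Y \sim \mathrm{Bin}(n, p)$ then $P(Y = r) = \binom{n}{r} p^r (1 - p)^{n - r}$, so

$$E(e^{tY}) = \sum_{r=0}^{n} \binom{n}{r} p^r (1 - p)^{n - r} e^{tr} = (p(e^t - 1) + 1)^n$$

by the Binomial theorem. Hence the sum of IID Bernoulli trials has a Binomial distribution.

Now fix $\lambda > 0$ and let $p = \lambda/n$ with $n \to \infty$ (so $p \to 0$). (By contrast, in the special case where $n \to \infty$ with $p$ close to $\frac{1}{2}$, we can apply the CLT.) Using the fact that $\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x$, we get

$$M_Y(t) = \left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n \to \exp\left(\lambda(e^t - 1)\right).$$

If $Z \sim \mathrm{Poisson}(\lambda)$, $P(Z = r) = \frac{\lambda^r e^{-\lambda}}{r!}$ for $r \geq 0$. Then

$$M_Z(t) = \sum_{r=0}^{\infty} \frac{e^{tr} \lambda^r e^{-\lambda}}{r!} = \sum_{r=0}^{\infty} \frac{(\lambda e^t)^r e^{-\lambda}}{r!} = \exp\left(\lambda(e^t - 1)\right)$$

since $e^x = \sum_{r=0}^{\infty} \frac{x^r}{r!}$. This agrees with the limiting case of the Binomial distribution.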
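The convergence of the Binomial MGF to the Poisson MGF can also be observed numerically. The sketch below evaluates both at a single point; the values of `lam` (standing for the fixed product np) and `t` are arbitrary choices for illustration.

```python
from math import exp


def binomial_mgf(t, n, p):
    """MGF of Bin(n, p): (p (e^t - 1) + 1)^n."""
    return (p * (exp(t) - 1.0) + 1.0) ** n


def poisson_mgf(t, lam):
    """MGF of a Poisson RV with mean lam: exp(lam (e^t - 1))."""
    return exp(lam * (exp(t) - 1.0))


lam, t = 2.0, 0.5  # arbitrary illustrative values
target = poisson_mgf(t, lam)

# Keep n * p = lam fixed while n grows, and record the gap to the limit.
sizes = (10, 100, 1000, 10000)
errors = [abs(binomial_mgf(t, n, lam / n) - target) for n in sizes]

for n, err in zip(sizes, errors):
    print(n, err)
```

The gap shrinks roughly in proportion to 1/n, which matches the rate at which $(1 + x/n)^n$ approaches $e^x$.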

Probability generating functions (PGF)

These are useful in cases when $X$ takes integer values.

Definition

Suppose $X$ takes values in $\mathbb{N}$. The PGF for $X$ is the function $G_X(\theta)$ with $G_X(\theta) = E(\theta^X)$, that is $G_X(\theta) = \sum_{n=0}^{\infty} \theta^n P(X = n)$. If $\theta = e^t$, then we recover $M_X(t)$.

Properties

• $G_X(1) = 1$.

• $\frac{dG_X}{d\theta} = \sum_{n=1}^{\infty} n \theta^{n-1} P(X = n)$, so $\left.\frac{dG_X}{d\theta}\right|_{\theta=1} = E(X)$.

• $G_X''(1) = E(X(X - 1))$.

• $G_{X+Y}(\theta) = G_X(\theta) G_Y(\theta)$ if $X, Y$ are independent.

• $\mathrm{Var}(X) = G_X''(1) + G_X'(1) - [G_X'(1)]^2$.

• Given a series for $G_X(\theta)$, the coefficient of $\theta^n$ is precisely $P(X = n)$. Moreover, $G_X(0) = P(X = 0)$.

The variance formula can be compared to $M_X''(0) - M_X'(0)^2 = \mathrm{Var}(X)$.
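The variance formula for PGFs follows from the first two derivative properties. Writing $G_X'(1) = E(X)$ and $G_X''(1) = E(X(X - 1)) = E(X^2) - E(X)$, we get $E(X^2) = G_X''(1) + G_X'(1)$, and hence

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = G_X''(1) + G_X'(1) - [G_X'(1)]^2.$$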

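Example

Let $X = X_1 + \ldots + X_n$, where the $X_i \sim \mathrm{Bernoulli}(p)$. Now $G_{X_i}(\theta) = 1 - p + p\theta$. If the $X_i$ are independent then $G_X(\theta) = G_{X_1}(\theta) G_{X_2}(\theta) \cdots G_{X_n}(\theta) = (1 - p + p\theta)^n$. So $X \sim \mathrm{Bin}(n, p)$.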
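Since a PGF with finitely many terms is just a polynomial, the product rule for independent RVs can be checked by multiplying coefficient lists. In this sketch a polynomial is stored as a list whose entry k is the coefficient of the k-th power of the PGF argument; `poly_mul` is a helper written for the example, and n = 5, p = 1/3 are arbitrary.

```python
from fractions import Fraction
from math import comb


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


n, p = 5, Fraction(1, 3)  # arbitrary illustrative parameters

# PGF of one Bernoulli(p) trial: (1 - p) + p * theta.
bernoulli_pgf = [1 - p, p]

# PGF of the sum of n independent trials: the n-fold product.
pgf = [Fraction(1)]
for _ in range(n):
    pgf = poly_mul(pgf, bernoulli_pgf)

# Coefficient r of the product should be the Bin(n, p) probability of r.
binomial = [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]

print(pgf == binomial)
```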

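Example

Consider $G_X(\theta) = \frac{1}{2 - \theta}$. What distribution does $X$ have? Now

$$G_X(\theta) = E(\theta^X) = \sum_{n=0}^{\infty} \theta^n P(X = n) = \frac{1}{2 - \theta} = \frac{1}{2} \sum_{n=0}^{\infty} \left(\frac{\theta}{2}\right)^n,$$

which means that $P(X = n) = \frac{1}{2^{n+1}}$, that is $X$ has a geometric distribution.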

Conditional Expectation and Joint RVs

Consider $X$ and $Y$ discrete RVs taking values in $\Omega_X = \{x_1, x_2, \ldots\}$ and $\Omega_Y = \{y_1, y_2, \ldots\}$. The joint probability function is given by $f_{X,Y}(x_i, y_j) = P(X = x_i, Y = y_j)$. The marginal probability (distribution) functions are $f_X(x_i) = P(X = x_i) = \sum_j f_{X,Y}(x_i, y_j)$ and $f_Y(y_j) = P(Y = y_j) = \sum_i f_{X,Y}(x_i, y_j)$. The conditional probability of $X = x_i$ given $Y = y_j$ is

$$f_{X|Y}(x_i|y_j) = P(X = x_i \mid Y = y_j) = \frac{P(X = x_i, Y = y_j)}{P(Y = y_j)} = \frac{f_{X,Y}(x_i, y_j)}{f_Y(y_j)}.$$

If $X, Y$ are independent then $f_{X,Y}(x_i, y_j) = f_X(x_i) f_Y(y_j)$ for all $i, j$. Given $g : \Omega_X \times \Omega_Y \to \mathbb{R}$, $E(g(X, Y)) = \sum_{i,j} g(x_i, y_j) f_{X,Y}(x_i, y_j)$. If $X, Y$ are independent and $g(X, Y) = h_1(X) h_2(Y)$ then $E(g(X, Y)) = E(h_1(X)) E(h_2(Y))$.

The conditional expectation of $X$ given $Y$ is the quantity $E(X|Y)$. This is a function of $Y$: the average of $X$ given a value of $Y$. If $Y = y_j$, then

$$E(X \mid Y = y_j) = \sum_i x_i P(X = x_i \mid Y = y_j) = \sum_i x_i \frac{f_{X,Y}(x_i, y_j)}{f_Y(y_j)},$$

a function of $y_j$. $E(X|Y)$ is a RV which is governed by the probability distribution of $Y$, hence we can also take its expectation.

Tower rule

$E(E(X|Y)) = E(X)$.

We have a useful check: if $X$ and $Y$ are independent then $E(X|Y) = E(X)$. In general

$$E(E(X|Y)) = \sum_j \left(\sum_i x_i \frac{f_{X,Y}(x_i, y_j)}{f_Y(y_j)}\right) f_Y(y_j) = \sum_i x_i \sum_j f_{X,Y}(x_i, y_j) = \sum_i x_i f_X(x_i) = E(X).$$
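The tower rule can be verified on a small table. The joint pmf below is a made-up example (it is not taken from the notes), and the helper names are illustrative; the point is only that averaging the conditional expectations against the marginal of Y returns E(X).

```python
from fractions import Fraction

# Hypothetical joint pmf f_{X,Y}(x, y), keyed by (x, y); entries sum to 1.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(3, 8), (1, 1): Fraction(1, 4),
}


def marginal_y(y):
    """f_Y(y): sum the joint pmf over all x."""
    return sum(p for (_, y_), p in joint.items() if y_ == y)


def cond_exp_x_given(y):
    """E(X | Y = y): sum over x of x * f_{X,Y}(x, y) / f_Y(y)."""
    return sum(x * p for (x, y_), p in joint.items() if y_ == y) / marginal_y(y)


ys = sorted({y for _, y in joint})

# Left side: E(E(X|Y)), the conditional expectation averaged over Y.
lhs = sum(cond_exp_x_given(y) * marginal_y(y) for y in ys)
# Right side: E(X) computed directly from the joint pmf.
rhs = sum(x * p for (x, _), p in joint.items())

print(lhs, rhs)
```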

Compound processes

Suppose $(X_i)_{i=1}^{\infty}$ are IID RVs with PGF $G_X(\theta)$ (since the $X_i$ are IID, $X \sim X_i$). Suppose $N$ is a RV with PGF $G_N(\theta)$, independent of the $X_i$. Let $Z = X_1 + X_2 + \ldots + X_N$. $Z$ is a compound process, a random sum of random variables.

Proposition

For the compound process $Z$, the PGF is $G_Z(\theta) = G_N(G_X(\theta)) = G_N \circ G_X(\theta)$.

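Proof

By definition

$$G_Z(\theta) = E(\theta^Z) = \sum_{n=0}^{\infty} \theta^n P(Z = n)$$

$$= E(\theta^{X_1 + X_2 + \ldots + X_N}) = E\left(E(\theta^{X_1 + X_2 + \ldots + X_N} \mid N)\right) \quad \text{[Tower rule]}$$

$$= \sum_{n=0}^{\infty} E(\theta^{X_1 + X_2 + \ldots + X_n} \mid N = n) P(N = n) = \sum_{n=0}^{\infty} E(\theta^{X_1}) E(\theta^{X_2}) \cdots E(\theta^{X_n}) P(N = n) \quad \text{[Independence]}$$

$$= \sum_{n=0}^{\infty} (G_X(\theta))^n P(N = n) = G_N(G_X(\theta)) \quad \text{[Definition of PGF]}.$$

The coefficient of $\theta^n$ in $G_Z(\theta)$ gives $P(Z = n)$.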

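Example

Suppose we roll a die and then flip a number of coins equal to the number shown on the die. If $Z$ is the number of heads, what is $P(Z = k)$?

By the previous proposition we have $G_Z(\theta) = G_N(G_X(\theta))$ where $N \sim \mathrm{Unif}\{1, \ldots, 6\}$ (the values on the die) and $X \sim \mathrm{Bernoulli}(1/2)$ (the flip of the coin). Now

$$G_X(\theta) = E(\theta^X) = \sum_{n=0}^{1} \theta^n P(X = n) = \frac{1}{2}(1 + \theta)$$

and

$$G_N(\theta) = E(\theta^N) = \sum_{n=1}^{6} \theta^n P(N = n) = \frac{1}{6} \sum_{n=1}^{6} \theta^n.$$

Hence

$$G_Z(\theta) = G_N\left(\frac{1}{2}(1 + \theta)\right) = \frac{1}{6} \sum_{n=1}^{6} \frac{1}{2^n} (1 + \theta)^n.$$

It follows that $P(Z = k)$ is given by the coefficient of $\theta^k$ in the sum. By the binomial theorem, we have

$$P(Z = k) = \frac{1}{6} \sum_{n=1}^{6} \binom{n}{k} \left(\frac{1}{2}\right)^n,$$

recalling that $\binom{n}{k} = 0$ for $k > n$.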
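The closed form for the die-and-coins example is easy to tabulate. The sketch below evaluates it exactly for k = 0, ..., 6, assuming a fair six-sided die and fair coins as in the example; `p_heads` is a helper name introduced here.

```python
from fractions import Fraction
from math import comb


def p_heads(k, sides=6):
    """P(Z = k) = (1/sides) * sum over n of C(n, k) * (1/2)^n, n = 1..sides.

    math.comb returns 0 when k > n, matching the convention in the text.
    """
    return Fraction(1, sides) * sum(
        comb(n, k) * Fraction(1, 2**n) for n in range(1, sides + 1)
    )


# Z can take any value from 0 (no heads) to 6 (a six, then six heads).
dist = [p_heads(k) for k in range(7)]

for k, q in enumerate(dist):
    print(k, q)
```

The probabilities sum to 1, and the extreme case k = 6 requires both rolling a six and getting six heads, giving (1/6)(1/64).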

2 Branching Processes

Let $S_n$ be the number of individuals in a population at time $n$. Suppose $S_0 = 1$ (one individual at time 0). Individuals evolve at each timestep according to a common RV $X$ (each individual is replaced by a random number of individuals distributed as $X$), and evolve independently of others. We assume $X$ has PGF $G_X(\theta)$. Let $X_i$, $i \geq 1$, be IID copies of $X$. We want to work out the long term behaviour of $S_n$, $E(S_n)$ and $P(S_n = 0)$.

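We use generating function analysis. For $S_n$, denote the PGF by $G_n(\theta)$. Since $S_1 = X$, $G_1(\theta) = G_X(\theta)$. For $S_2$, $G_2(\theta) = E(\theta^{S_2}) = E(E(\theta^{S_2} \mid X)) = G_X \circ G_X(\theta)$ by the previous proposition. Similarly, $G_3(\theta) = E(E(\theta^{S_3} \mid S_2)) = G_X \circ G_X \circ G_X(\theta)$.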

Proposition

$G_n(\theta) = G_X \circ G_X \circ \ldots \circ G_X(\theta)$ ($n$-fold composition). Moreover $G_n(\theta) = G_X(G_{n-1}(\theta)) = G_{n-1}(G_X(\theta))$.

Proof

This follows easily by induction.

Remark: the coefficient of $\theta^k$ in $G_n(\theta)$ gives $P(S_n = k)$.
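When the offspring RV takes finitely many values, its PGF is a polynomial and the n-fold composition can be carried out on coefficient lists. The sketch below does this exactly; the offspring distribution (0, 1 or 2 offspring with probabilities 1/4, 1/2, 1/4) is a hypothetical choice, and `poly_mul`, `poly_compose` and `generation_pgf` are helpers written for this illustration.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_compose(outer, inner):
    """Coefficients of outer(inner(theta)), built up by Horner's scheme."""
    result = [Fraction(outer[-1])]
    for c in reversed(outer[:-1]):
        result = poly_mul(result, inner)
        result[0] += c
    return result


def generation_pgf(offspring, n):
    """PGF of S_n, using G_1 = G_X and G_k = G_X(G_{k-1})."""
    pgf = list(offspring)
    for _ in range(n - 1):
        pgf = poly_compose(offspring, pgf)
    return pgf


# Hypothetical offspring distribution: P(X=0)=1/4, P(X=1)=1/2, P(X=2)=1/4.
offspring = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

g2 = generation_pgf(offspring, 2)
print("P(S_2 = k) for k = 0..4:", g2)
print("P(S_2 = 0) =", g2[0])
```

Entry k of the returned list is the coefficient described in the remark, so `g2[0]` is the probability that the population has died out by time 2.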


Expected behaviour of $S_n$

We want to study $E(S_n) = \left.\frac{dG_n}{d\theta}\right|_{\theta=1} = G_n'(1)$. Let μ = E(X) = G