Stochastic Processes Notes
ECM3724 Stochastic Processes
1 Overview of Probability
We call $(\Omega, X, P)$ a probability space. Here $\Omega$ is the sample space, $X : \Omega \to \mathbb{R}$ is a random variable (RV) and $P$ is a probability (measure), a function on subsets of $\Omega$. Elements $\omega \in \Omega$ are called outcomes; subsets of $\Omega$ are called events.
Given $A \subseteq \mathbb{R}$, $\{X \in A\} = \{\omega \in \Omega : X(\omega) \in A\}$. Given $x \in \mathbb{R}$, $\{X = x\} = \{\omega \in \Omega : X(\omega) = x\}$.
Example
Suppose we toss a coin twice; then $\Omega = \{HH, HT, TH, TT\}$ and $|\Omega| = 4$. Let $X$ be the number of heads. Then $\mathcal{X}$ is the set of values that $X$ takes, that is $\mathcal{X} = \{0, 1, 2\}$. Now $\{X = 1\} = \{\omega : X(\omega) = 1\} = \{HT, TH\}$, so $P(X = 1) = P(HT) + P(TH)$.
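As a quick illustration (a Python sketch, not part of the original notes), the sample space and the event $\{X = 1\}$ can be enumerated directly:

```python
from itertools import product

# Sample space for two coin tosses: Omega = {HH, HT, TH, TT}
omega = [''.join(toss) for toss in product('HT', repeat=2)]

# The event {X = 1}, where X counts heads, is a subset of Omega
event = [w for w in omega if w.count('H') == 1]

print(event)                    # ['HT', 'TH']
print(len(event) / len(omega))  # 0.5, since each outcome has probability 1/4
```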
$\Omega$ (or $\mathcal{X}$) could be discrete, for example $\mathcal{X} = \{x_1, x_2, \dots\}$; we require $\sum_{x \in \mathcal{X}} P(X = x) = 1$. $\Omega$ (or $\mathcal{X}$) could instead be continuous. If $\mathcal{X} = [0, 1]$, then $P(X \in A) = \int_A f_X(x)\,dx$. Here $f_X(x)$ is a probability density function (pdf), with $f_X(x) \ge 0$ and $\int_{\mathcal{X}} f_X(x)\,dx = 1$.
Expectation
If $\mathcal{X} = \{x_1, x_2, \dots\}$ and $g : \mathcal{X} \to \mathbb{R}$ then $E(g(X)) = \sum_i g(x_i) P(X = x_i)$. If $g(X) = X$, then $E(X) =: \mu_X$, the mean of $X$. If $g(X) = (X - \mu_X)^2$ then $E(g(X)) =: \mathrm{Var}(X) = \sigma_X^2 \ge 0$, the variance of $X$. Also $\mathrm{Var}(X) = E(X^2) - \mu_X^2$. In the continuous case, $E(g(X)) = \int_{\mathcal{X}} g(x) f_X(x)\,dx$.
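A minimal numerical check of these definitions (an illustrative Python sketch; the distribution used is the two-coin-toss example from above):

```python
# Distribution of X = number of heads in two fair coin tosses
values = [0, 1, 2]
probs = [0.25, 0.5, 0.25]

mean = sum(x * p for x, p in zip(values, probs))                   # E(X) = mu_X
var_def = sum((x - mean) ** 2 * p for x, p in zip(values, probs))  # E((X - mu_X)^2)
e_x2 = sum(x ** 2 * p for x, p in zip(values, probs))              # E(X^2)

print(mean)                       # 1.0
print(var_def, e_x2 - mean ** 2)  # 0.5 0.5 -- the two variance formulas agree
```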
Common Distributions
If $X \sim \mathrm{Unif}[0, 1]$ then we say that $X$ is distributed as Uniform$[0, 1]$. If $A = [a, b] \subseteq \mathcal{X}$ then $P(X \in A) = \int_a^b 1\,dx = b - a$. If $A = \mathbb{Q} \cap [0, 1]$, then $P(X \in A) = 0$. There exist subsets of $[0, 1]$ for which $P(X \in A)$ is undefined.
If $X \sim \mathrm{Ber}(p)$ then we say that $X$ is distributed as a Bernoulli trial with success probability $p$: $\mathcal{X} = \{0, 1\}$, $P(X = 0) = 1 - p$, $P(X = 1) = p$, $p \in (0, 1)$. We can extend this over multiple independent, identically distributed (IID) trials (recall that events $A$ and $B$ are independent if $P(A|B) = P(A)$, $P(B|A) = P(B)$, or $P(A \cap B) = P(A)P(B)$). In this case $X \sim \mathrm{Bin}(n, p)$, that is, $X$ is distributed binomially with $n$ trials and success probability $p$. Here $\mathcal{X} = \{0, 1, 2, \dots, n\}$ and, for $0 \le r \le n$, $P(X = r) = \binom{n}{r} p^r (1 - p)^{n - r}$. The Binomial distribution has $E(X) = np$, $\mathrm{Var}(X) = np(1 - p)$.
Let $\mathcal{X} = \{0, 1, 2, \dots\}$. We say that $X \sim \mathrm{Poisson}(\lambda)$ if $P(X = r) = \frac{\lambda^r e^{-\lambda}}{r!}$ for $r \in \mathbb{N} \cup \{0\}$. Recall that $\exp(x) = \sum_{r \ge 0} \frac{x^r}{r!}$, which means that $\sum_r P(X = r) = 1$. For the Poisson distribution the mean and variance are both $\lambda$. Other discrete distributions include the geometric and hypergeometric distributions.
We say $X \sim \mathrm{Exp}(\lambda)$ if the pdf of $X$ is given by $f_X(x) = \lambda e^{-\lambda x}$ for $x \ge 0$. Now
$$P(X > x) = \int_x^\infty \lambda e^{-\lambda u}\,du = \left[-e^{-\lambda u}\right]_x^\infty = e^{-\lambda x}.$$
The exponential distribution is memoryless, that is, the time left to wait does not depend on the time already waited. More specifically, $P(X > t + s \mid X > s) = P(X > t) = e^{-\lambda t}$.
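The memoryless property is easy to check by simulation. A sketch (the values of $\lambda$, $s$, $t$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, s, t = 2.0, 1.0, 0.5        # illustrative parameter choices
x = rng.exponential(scale=1 / lam, size=1_000_000)

# Empirical P(X > t + s | X > s) versus the memoryless prediction P(X > t)
conditional = np.mean(x > t + s) / np.mean(x > s)
print(conditional, np.exp(-lam * t))   # both close to e^{-lambda t} ~ 0.3679
```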
Gaussian/Normal distribution
We say $X \sim N(\mu, \sigma^2)$ if the pdf is
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
for $x \in \mathbb{R}$. $E(X) = \mu$, $\mathrm{Var}(X) = \sigma^2$.
The Central Limit Theorem (CLT)
Suppose $(X_i)_{i=1}^n$ are IID RVs with $E(X_i) = \mu$, $\mathrm{Var}(X_i) = \sigma^2$, and let $S_n = \sum_{i=1}^n X_i$. Then
$$Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0, 1),$$
that is, the distribution of $Z_n$ converges to $N(0, 1)$. If $A = [a, b]$ then
$$P(Z_n \in A) \xrightarrow{n \to \infty} \int_a^b \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\,du.$$
This can be applied, for example, by taking $S_n$ to be the number of heads in $n$ fair coin tosses. Then $E(S_n) = n/2$ and $\mathrm{Var}(S_n) = \sum_{i=1}^n \mathrm{Var}(X_i) = n/4$. Hence
$$P\left(\frac{S_n - \frac{n}{2}}{\frac{1}{2}\sqrt{n}} \in [a, b]\right) \to \int_a^b \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\,du.$$
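A simulation sketch of this coin-tossing case (illustrative Python; $n$, the trial count, and the interval $[a, b]$ are arbitrary choices, and the Gaussian integral is evaluated via the error function rather than numerically):

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
n, trials, a, b = 1000, 100_000, -1.0, 1.0

# S_n = number of heads in n fair coin tosses, standardised as above
s_n = rng.binomial(n, 0.5, size=trials)
z_n = (s_n - n / 2) / (0.5 * sqrt(n))

empirical = np.mean((z_n >= a) & (z_n <= b))
gaussian = 0.5 * (erf(b / sqrt(2)) - erf(a / sqrt(2)))  # closed form of the N(0,1) integral
print(empirical, gaussian)   # both ~ 0.683
```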
Moment Generating Functions (MGF)
For a RV $X$, the MGF of $X$ is the function $M_X(t) = E(e^{tX})$, a function of $t \in \mathbb{R}$. In the discrete case, $E(e^{tX}) = \sum_i e^{t x_i} P(X = x_i)$, where $X \in \mathcal{X} = \{x_1, x_2, \dots\}$. In the continuous case, $E(e^{tX}) = \int_{\mathcal{X}} e^{tx} f_X(x)\,dx$.
Properties
- $M_X(0) = 1$.
- $\frac{d^r M_X}{dt^r}\big|_{t=0} = E(X^r)$.
- If $Z = X + Y$ and $X, Y$ are independent then $M_Z(t) = M_X(t) M_Y(t)$.
- If $X, Y$ are RVs and $M_X(t) = M_Y(t)$ then $X$ and $Y$ have the same probability distribution, provided $M_X(t)$ is continuous in a neighbourhood of $t = 0$.
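The moment property can be verified symbolically, as in this sketch using SymPy with the Bernoulli MGF (derived in the exercise below):

```python
import sympy as sp

t, p = sp.symbols('t p')

# MGF of a Bernoulli(p) RV: M_X(t) = (1 - p) + p e^t
M = (1 - p) + p * sp.exp(t)

print(M.subs(t, 0))              # 1, so M_X(0) = 1
print(sp.diff(M, t).subs(t, 0))  # p = E(X)
print(sp.simplify(sp.diff(M, t, 2).subs(t, 0) - sp.diff(M, t).subs(t, 0) ** 2))
# p - p**2 = p(1 - p) = Var(X), matching E(X^2) - E(X)^2
```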
Exercise
Compute $M_X(t)$ for the Bernoulli distribution. Compute $M_Y(t)$ for $Y = X_1 + \dots + X_n$, where the $X_i$ are IID Bernoulli RVs. What is the distribution of $Y$? What happens as $n \to \infty$ (and $p \to 0$ with $\lambda = np$ fixed)?
We have $M_X(t) = p(e^t - 1) + 1$, hence $M_Y(t) = (p(e^t - 1) + 1)^n$ by the properties of the MGF. If $Y \sim \mathrm{Bin}(n, p)$ then $P(Y = r) = \binom{n}{r} p^r (1 - p)^{n - r}$, so
$$E(e^{tY}) = \sum_{r=0}^n \binom{n}{r} p^r (1 - p)^{n - r} e^{tr} = (p(e^t - 1) + 1)^n$$
by the Binomial theorem. Hence the sum of IID Bernoulli trials has a Binomial distribution. Now fix $\lambda > 0$ and let $\lambda = np$ with $n \to \infty$ (so $p \to 0$). (In the special case where $n$ is large and $p$ is close to $\frac{1}{2}$ we can instead apply the CLT.) Using the fact that $\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x$,
$$M_Y(t) = \left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n \to \exp(\lambda(e^t - 1)).$$
If $Z \sim \mathrm{Poisson}(\lambda)$, then $P(Z = r) = \frac{\lambda^r e^{-\lambda}}{r!}$ for $r \ge 0$, and
$$M_Z(t) = \sum_{r=0}^\infty e^{tr} \frac{\lambda^r e^{-\lambda}}{r!} = e^{-\lambda} \sum_{r=0}^\infty \frac{(\lambda e^t)^r}{r!} = \exp(\lambda(e^t - 1))$$
since $e^x = \sum_{r=0}^\infty \frac{x^r}{r!}$. This agrees with the limiting case of the Binomial distribution.
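The Binomial-to-Poisson convergence can also be seen numerically. A sketch (the value of $\lambda$ and the sequence of $n$ are arbitrary choices):

```python
from math import comb, exp, factorial

lam = 3.0
poisson = [lam ** r * exp(-lam) / factorial(r) for r in range(8)]

# Bin(n, p) with lambda = n p fixed approaches Poisson(lambda) as n grows
for n in (10, 100, 10_000):
    p = lam / n
    binom = [comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(8)]
    print(n, max(abs(b - q) for b, q in zip(binom, poisson)))  # max pmf gap shrinks
```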
Probability Generating Functions (PGF)
These are useful in cases when $X$ takes integer values.
Definition
Suppose $X$ takes values in $\mathbb{N} \cup \{0\}$. The PGF of $X$ is the function $G_X(\theta) = E(\theta^X)$, that is, $G_X(\theta) = \sum_{n=0}^\infty \theta^n P(X = n)$. If $\theta = e^t$, then we recover $M_X(t)$.
Properties
- $G_X(1) = 1$.
- $\frac{dG_X}{d\theta} = \sum_{n=1}^\infty n \theta^{n-1} P(X = n)$, so $\frac{dG_X}{d\theta}\big|_{\theta=1} = E(X)$.
- $G_X''(1) = E(X(X - 1))$.
- $G_{X+Y}(\theta) = G_X(\theta) G_Y(\theta)$ if $X, Y$ are independent.
- $\mathrm{Var}(X) = G_X''(1) + G_X'(1) - [G_X'(1)]^2$.
- Given a series for $G_X(\theta)$, the coefficient of $\theta^n$ is precisely $P(X = n)$. Moreover, $G_X(0) = P(X = 0)$.
The variance property can be compared with $M_X''(0) - M_X'(0)^2 = \mathrm{Var}(X)$.
Example
Let $X = X_1 + \dots + X_n$, where the $X_i \sim \mathrm{Bernoulli}(p)$. Now $G_{X_i}(\theta) = 1 - p + p\theta$. If the $X_i$ are independent then $G_X(\theta) = G_{X_1}(\theta) G_{X_2}(\theta) \cdots G_{X_n}(\theta) = (1 - p + p\theta)^n$, so $X \sim \mathrm{Bin}(n, p)$.
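Since multiplying PGFs is the same as convolving their coefficient sequences, this example can be checked directly. A sketch ($n$ and $p$ are arbitrary choices):

```python
import numpy as np
from math import comb

n, p = 5, 0.3
g_bernoulli = np.array([1 - p, p])   # coefficients of G_{X_i}(theta) = (1 - p) + p*theta

# Product of the n individual PGFs, built as repeated polynomial convolution
g_x = np.array([1.0])
for _ in range(n):
    g_x = np.convolve(g_x, g_bernoulli)

print(g_x)  # coefficient of theta^r is P(X = r)
print([comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)])  # Bin(n, p) pmf
```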
Example
Consider $G_X(\theta) = \frac{1}{2 - \theta}$. What distribution does $X$ have? Now
$$G_X(\theta) = E(\theta^X) = \sum_{n=0}^\infty \theta^n P(X = n) = \frac{1}{2 - \theta} = \frac{1}{2} \sum_{n=0}^\infty \left(\frac{\theta}{2}\right)^n,$$
which means that $P(X = n) = \frac{1}{2^{n+1}}$, that is, $X$ has a geometric distribution.
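The coefficient extraction here can be reproduced symbolically, as in this SymPy sketch (the truncation order 6 is an arbitrary choice):

```python
import sympy as sp

theta = sp.symbols('theta')
G = 1 / (2 - theta)

# Power-series coefficients of G; the coefficient of theta^n should be 1/2^{n+1}
series = sp.series(G, theta, 0, 6).removeO()
coeffs = [series.coeff(theta, n) for n in range(6)]
print(coeffs)   # [1/2, 1/4, 1/8, 1/16, 1/32, 1/64]
```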
Conditional Expectation and Joint RVs
Consider discrete RVs $X$ and $Y$ taking values in $\mathcal{X} = \{x_1, x_2, \dots\}$ and $\mathcal{Y} = \{y_1, y_2, \dots\}$. The joint probability function is given by $f_{X,Y}(x_i, y_j) = P(X = x_i, Y = y_j)$. The marginal probability (distribution) functions are $f_X(x_i) = P(X = x_i) = \sum_j f_{X,Y}(x_i, y_j)$ and $f_Y(y_j) = P(Y = y_j) = \sum_i f_{X,Y}(x_i, y_j)$. The conditional probability of $X = x_i$ given $Y = y_j$ is
$$f_{X|Y}(x_i | y_j) = P(X = x_i | Y = y_j) = \frac{P(X = x_i, Y = y_j)}{P(Y = y_j)} = \frac{f_{X,Y}(x_i, y_j)}{f_Y(y_j)}.$$
If $X, Y$ are independent then $f_{X,Y}(x_i, y_j) = f_X(x_i) f_Y(y_j)$ for all $i, j$. Given $g : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, $E(g(X, Y)) = \sum_{i,j} g(x_i, y_j) f_{X,Y}(x_i, y_j)$. If $X, Y$ are independent and $g(X, Y) = h_1(X) h_2(Y)$ then $E(g(X, Y)) = E(h_1(X)) E(h_2(Y))$.
The conditional expectation of $X$ given $Y$ is the quantity $E(X|Y)$. This is a function of $Y$: the average over $X$ given a value of $Y$. If $Y = y_j$, then
$$E(X | Y = y_j) = \sum_i x_i P(X = x_i | Y = y_j) = \sum_i x_i \frac{f_{X,Y}(x_i, y_j)}{f_Y(y_j)},$$
a function of $y_j$. $E(X|Y)$ is a RV governed by the probability distribution of $Y$, hence we can also take its expectation.
Tower rule
$E(E(X|Y)) = E(X)$.
This gives a useful check: if $X$ and $Y$ are independent then $E(X|Y) = E(X)$. In general,
$$E(E(X|Y)) = \sum_j \left( \sum_i x_i \frac{f_{X,Y}(x_i, y_j)}{f_Y(y_j)} \right) f_Y(y_j) = \sum_i x_i \sum_j f_{X,Y}(x_i, y_j) = \sum_i x_i f_X(x_i) = E(X).$$
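A numerical sketch of the tower rule on a small joint table (the table entries and the values of $X$ are arbitrary choices):

```python
import numpy as np

# An illustrative joint probability table f_{X,Y}(x_i, y_j); rows index x, columns index y
f = np.array([[0.10, 0.20],
              [0.30, 0.40]])
x_vals = np.array([0.0, 1.0])

f_y = f.sum(axis=0)                                    # marginal f_Y(y_j)
e_x_given_y = (x_vals[:, None] * f).sum(axis=0) / f_y  # E(X | Y = y_j) for each j

# Tower rule: averaging E(X|Y) over the distribution of Y recovers E(X)
print((e_x_given_y * f_y).sum())    # 0.7
print((x_vals[:, None] * f).sum())  # 0.7 = E(X) computed directly
```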
Compound processes
Suppose $(X_i)_{i=1}^\infty$ are IID RVs with common PGF $G_X(\theta)$ (since the $X_i$ are IID, $G_{X_i} = G_X$). Suppose $N$ is a RV with PGF $G_N(\theta)$, independent of the $X_i$. Let $Z = X_1 + X_2 + \dots + X_N$. Then $Z$ is a compound process: a random sum of random variables.
Proposition
For the compound process $Z$, the PGF is $G_Z(\theta) = G_N(G_X(\theta)) = G_N \circ G_X(\theta)$.
Proof
By definition,
$$G_Z(\theta) = E(\theta^Z) = \sum_{n=0}^\infty \theta^n P(Z = n) = E(\theta^{X_1 + X_2 + \dots + X_N}) = E\left(E(\theta^{X_1 + X_2 + \dots + X_N} \mid N)\right) \quad \text{[Tower rule]}$$
$$= \sum_{n=0}^\infty E(\theta^{X_1 + X_2 + \dots + X_n} \mid N = n) P(N = n) = \sum_{n=0}^\infty E(\theta^{X_1}) E(\theta^{X_2}) \cdots E(\theta^{X_n}) P(N = n) \quad \text{[Independence]}$$
$$= \sum_{n=0}^\infty (G_X(\theta))^n P(N = n) = G_N(G_X(\theta)) \quad \text{[Definition of PGF]}.$$
The coefficient of $\theta^n$ in $G_Z(\theta)$ gives $P(Z = n)$.
Example
Suppose we roll a die and then flip a number of fair coins equal to the number shown on the die. If $Z$ is the number of heads, what is $P(Z = k)$?
By the previous proposition we have $G_Z(\theta) = G_N(G_X(\theta))$, where $N \sim \mathrm{Unif}\{1, \dots, 6\}$ (the value shown on the die) and $X \sim \mathrm{Bernoulli}(1/2)$ (the flip of a coin). Now
$$G_X(\theta) = E(\theta^X) = \sum_{n=0}^1 \theta^n P(X = n) = \frac{1}{2}(1 + \theta)$$
and
$$G_N(\theta) = E(\theta^N) = \sum_{n=1}^6 \theta^n P(N = n) = \frac{1}{6} \sum_{n=1}^6 \theta^n.$$
Hence
$$G_Z(\theta) = G_N\left(\frac{1}{2}(1 + \theta)\right) = \frac{1}{6} \sum_{n=1}^6 \frac{1}{2^n} (1 + \theta)^n.$$
It follows that $P(Z = k)$ is the coefficient of $\theta^k$ in this sum. By the binomial theorem,
$$P(Z = k) = \frac{1}{6} \sum_{n=1}^6 \binom{n}{k} \left(\frac{1}{2}\right)^n,$$
recalling that $\binom{n}{k} = 0$ for $k > n$.
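This formula can be checked by Monte Carlo simulation of the die-and-coins experiment (an illustrative Python sketch; the trial count is an arbitrary choice):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)
trials = 1_000_000

# Roll a fair die, then flip that many fair coins and count heads
n_die = rng.integers(1, 7, size=trials)
heads = rng.binomial(n_die, 0.5)

# Compare the empirical distribution of Z with the PGF-derived formula
for k in range(7):
    formula = sum(comb(n, k) * 0.5 ** n for n in range(1, 7)) / 6
    print(k, round(float(np.mean(heads == k)), 4), round(formula, 4))
```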
2 Branching Processes
Let $S_n$ be the number of individuals in a population at time $n$. Suppose $S_0 = 1$ (one individual at time 0). At each timestep, individuals reproduce according to a common RV $X$, independently of one another. We assume $X$ has PGF $G_X(\theta)$, and let $X_i$, $i \ge 1$, be IID copies of $X$. We want to work out the long-term behaviour of $S_n$, in particular $E(S_n)$ and $P(S_n = 0)$.
We use generating function analysis. Denote the PGF of $S_n$ by $G_n(\theta)$. Since $S_1 = X$, $G_1(\theta) = G_X(\theta)$. For $S_2$, $G_2(\theta) = E(\theta^{S_2}) = E(E(\theta^{S_2} \mid S_1)) = G_X \circ G_X(\theta)$ by the previous proposition. Similarly, $G_3(\theta) = E(E(\theta^{S_3} \mid S_2)) = G_X \circ G_X \circ G_X(\theta)$.
Proposition
$G_n(\theta) = G_X \circ G_X \circ \dots \circ G_X(\theta)$ ($n$-fold composition). Moreover $G_n(\theta) = G_X(G_{n-1}(\theta)) = G_{n-1}(G_X(\theta))$.
Proof
This follows easily by induction.
Remark: the coefficient of $\theta^k$ in $G_n(\theta)$ gives $P(S_n = k)$.
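In particular $P(S_n = 0) = G_n(0)$, which the $n$-fold composition makes easy to compute. A sketch comparing this with direct simulation (the offspring distribution, $n$, and the run count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative offspring distribution: P(X=0), P(X=1), P(X=2)
probs = np.array([0.25, 0.5, 0.25])

def G(theta):
    """Offspring PGF G_X(theta) = sum_k P(X = k) theta^k."""
    return sum(p * theta ** k for k, p in enumerate(probs))

# P(S_n = 0) = G_n(0): apply G_X to theta = 0 a total of n times
theta, n = 0.0, 10
for _ in range(n):
    theta = G(theta)

# Compare with direct simulation of the branching process from S_0 = 1
runs, extinct = 20_000, 0
for _ in range(runs):
    s = 1
    for _ in range(n):
        s = rng.choice(len(probs), size=s, p=probs).sum() if s > 0 else 0
    extinct += (s == 0)

print(theta, extinct / runs)   # two estimates of P(S_10 = 0); they should agree closely
```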
Expected behaviour of $S_n$
We want to study $E(S_n) = \frac{dG_n}{d\theta}\big|_{\theta=1} = G_n'(1)$. Let $\mu = E(X) = G_X'(1)$.