Stochastic Processes Notes

ECM3724 Stochastic Processes

1 Overview of Probability

We call $(X, \Omega, P)$ a probability space. Here $\Omega$ is the sample space, $X : \Omega \to \mathbb{R}$ is a random variable (RV), and $P$ is a probability (measure), a function on subsets of $\Omega$. Elements $\omega \in \Omega$ are called outcomes; subsets of $\Omega$ are called events.

Given $A \subseteq \mathbb{R}$, $P(X \in A) = P\{\omega \in \Omega : X(\omega) \in A\}$. Given $x \in \mathbb{R}$, $\{X = x\} = \{\omega \in \Omega : X(\omega) = x\}$.

Example
Suppose we toss a coin twice; then $\Omega = \{HH, HT, TH, TT\}$, $|\Omega| = 4$, and $X$ is the number of heads. Then $\mathcal{X}$ is the set of values that $X$ takes, that is $\mathcal{X} = \{0, 1, 2\}$. Now $\{X = 1\} = \{\omega : X(\omega) = 1\} = \{HT, TH\}$, so $P(X = 1) = P(HT) + P(TH)$.

$\Omega$ (or $\mathcal{X}$) could be discrete, for example $\mathcal{X} = \{x_1, x_2, \ldots\}$; we require $\sum_{x \in \mathcal{X}} P(X = x) = 1$. $\Omega$ (or $\mathcal{X}$) could also be continuous. If $\mathcal{X} = [0, 1]$, then $P(X \in A) = \int_A f_X(x)\,dx$. Here $f_X(x)$ is a probability density function (pdf), satisfying

$$f_X(x) \geq 0, \qquad \int_{\mathcal{X}} f_X(x)\,dx = 1.$$

Expectation
If $\mathcal{X} = \{x_1, x_2, \ldots\}$ and $g : \mathcal{X} \to \mathbb{R}$ then $E(g(X)) = \sum_i g(x_i) P(X = x_i)$. If $g(X) = X$, then $E(X) := \mu_X$, the mean of $X$. If $g(X) = (X - \mu_X)^2$ then $E(g(X)) := \mathrm{Var}(X) = \sigma_X^2 \geq 0$, the variance of $X$. Also $\mathrm{Var}(X) = E(X^2) - \mu_X^2$. In the continuous case, $E(g(X)) = \int_{\mathcal{X}} g(x) f_X(x)\,dx$.
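As a quick numerical illustration of these definitions (a minimal sketch; the fair six-sided die is our own choice of example), the following computes the mean and variance directly from the sums above and confirms that $\mathrm{Var}(X) = E(X^2) - \mu_X^2$:

```python
# Mean and variance of a discrete RV straight from the definitions,
# using a fair six-sided die: X in {1,...,6}, P(X = x) = 1/6
values = range(1, 7)
p = 1 / 6

mu = sum(x * p for x in values)                  # E(X)
var = sum((x - mu) ** 2 * p for x in values)     # E((X - mu)^2)
ex2 = sum(x ** 2 * p for x in values)            # E(X^2)

print(mu, var, ex2 - mu ** 2)  # 3.5  2.9166...  2.9166... (both variance formulas agree)
```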

Common Distributions
If $X \sim \mathrm{Unif}[0, 1]$ then we say that $X$ is distributed as Uniform$[0, 1]$. If $A = [a, b] \subseteq \mathcal{X}$ then $P(X \in A) = \int_a^b 1\,dx = b - a$. If $A = \mathbb{Q} \cap [0, 1]$, then $P(X \in A) = 0$. There exist subsets of $[0, 1]$ for which $P(X \in A)$ is undefined.

If $X \sim \mathrm{Ber}(p)$ then we say that $X$ is distributed as a Bernoulli trial with success probability $p$: $\mathcal{X} = \{0, 1\}$, $P(X = 0) = 1 - p$, $P(X = 1) = p$, $p \in (0, 1)$. We can extend this over multiple independent, identically distributed (IID) trials (recall that events $A$ and $B$ are independent if $P(A|B) = P(A)$, $P(B|A) = P(B)$, or $P(A \cap B) = P(A)P(B)$). In this case $X \sim \mathrm{Bin}(n, p)$, that is, $X$ is distributed binomially with $n$ trials and success probability $p$, and $\mathcal{X} = \{0, 1, 2, \ldots, n\}$. For $0 \leq r \leq n$,

$$P(X = r) = \binom{n}{r} p^r (1 - p)^{n - r}.$$

The Binomial distribution has $E(X) = np$, $\mathrm{Var}(X) = np(1 - p)$.

Let $\mathcal{X} = \{0, 1, 2, \ldots\}$. We say that $X \sim \mathrm{Poisson}(\lambda)$ if $P(X = r) = \frac{\lambda^r e^{-\lambda}}{r!}$ for $r \in \mathbb{N} \cup \{0\}$. Recall that $\exp(x) = \sum_{r \geq 0} \frac{x^r}{r!}$, which means that $\sum_r P(X = r) = 1$. For the Poisson distribution the mean and variance are both $\lambda$. Other discrete distributions include the geometric and hypergeometric distributions.

We say $X \sim \mathrm{Exp}(\lambda)$ if the pdf of $X$ is given by $f_X(x) = \lambda e^{-\lambda x}$ for $x \geq 0$. Now

$$P(X > x) = \int_x^\infty \lambda e^{-\lambda u}\,du = \left[-e^{-\lambda u}\right]_x^\infty = e^{-\lambda x}.$$

The exponential distribution is memoryless, that is, the time remaining to wait does not depend on the time already waited. More specifically, $P(X > t + s \mid X > s) = P(X > t) = e^{-\lambda t}$.
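The memoryless property is easy to check by simulation. Below is a minimal sketch (the values of $\lambda$, $s$ and $t$ are arbitrary choices) estimating $P(X > t + s \mid X > s)$ from exponential samples and comparing it with $e^{-\lambda t}$:

```python
import math
import random

random.seed(0)
lam, s, t = 1.5, 0.4, 0.8

# Draw exponential samples and estimate P(X > t+s | X > s)
samples = [random.expovariate(lam) for _ in range(500_000)]
survived_s = [x for x in samples if x > s]
cond = sum(x > t + s for x in survived_s) / len(survived_s)

print(f"estimated P(X > t+s | X > s) = {cond:.4f}")
print(f"exact     P(X > t)           = {math.exp(-lam * t):.4f}")
```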

Gaussian/Normal distribution
We say $X \sim N(\mu, \sigma^2)$ if the pdf is

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

for $x \in \mathbb{R}$. $E(X) = \mu$, $\mathrm{Var}(X) = \sigma^2$.

The Central Limit Theorem (CLT)
Suppose $(X_i)_{i=1}^n$ are IID RVs with $E(X_i) = \mu$, $\mathrm{Var}(X_i) = \sigma^2$, and let $S_n = \sum_{i=1}^n X_i$. Then

$$Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0, 1),$$

that is, the distribution of $Z_n$ converges to $N(0, 1)$. If $A = [a, b]$ then

$$P(Z_n \in A) \xrightarrow{n \to \infty} \int_a^b \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\,du.$$

This can be applied, for example, by taking $S_n$ to be the number of heads in $n$ fair coin tosses. $E(S_n) = n/2$ and $\mathrm{Var}(S_n) = \sum_{i=1}^n \mathrm{Var}(X_i) = n/4$. Hence

$$P\left(\frac{S_n - \frac{n}{2}}{\frac{1}{2}\sqrt{n}} \in [a, b]\right) \to \int_a^b \frac{\exp(-u^2/2)}{\sqrt{2\pi}}\,du.$$
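A minimal simulation sketch of this coin-toss case (using numpy; the interval $[-1, 1]$ is an arbitrary choice): standardise many realisations of $S_n$ and compare the empirical probability of landing in $[a, b]$ with the standard normal value.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 100_000

# S_n = number of heads in n fair coin tosses, repeated many times
S = rng.binomial(n=n, p=0.5, size=trials)

# Standardise: Z_n = (S_n - n/2) / (sqrt(n)/2)
Z = (S - n / 2) / (np.sqrt(n) / 2)

a, b = -1.0, 1.0
empirical = np.mean((Z >= a) & (Z <= b))
print(f"empirical P(Z_n in [{a}, {b}]) = {empirical:.4f}")
print("normal    P(Z   in [-1, 1])    = 0.6827 (to 4 d.p.)")
```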

Moment Generating Functions (MGF)
For a RV $X$, the MGF of $X$ is the function $M_X(t) = E(e^{tX})$, a function of $t \in \mathbb{R}$. In the discrete case, $E(e^{tX}) = \sum_i e^{t x_i} P(X = x_i)$, where $X \in \mathcal{X} = \{x_1, x_2, \ldots\}$. In the continuous case, $E(e^{tX}) = \int_{\mathcal{X}} e^{tx} f_X(x)\,dx$.

Properties

• $M_X(0) = 1$.
• $\frac{d^r M_X}{dt^r}\big|_{t=0} = E(X^r)$.
• If $Z = X + Y$ and $X, Y$ are independent, then $M_Z(t) = M_X(t) M_Y(t)$.
• If $X, Y$ are RVs and $M_X(t) = M_Y(t)$, then $X$ and $Y$ have the same probability distribution, provided $M_X(t)$ is continuous in a neighbourhood of $t = 0$.

Exercise
Compute $M_X(t)$ for the Bernoulli distribution. Compute $M_Y(t)$ for $Y = X_1 + \ldots + X_n$, where the $X_i$ are IID Bernoulli RVs. What is the distribution of $Y$? What happens as $n \to \infty$ (and $p \to 0$) with $\lambda = np$ fixed?

We have $M_X(t) = p(e^t - 1) + 1$. Hence $M_Y(t) = (p(e^t - 1) + 1)^n$ by the properties of the MGF. If $Y \sim \mathrm{Bin}(n, p)$ then $P(Y = r) = \binom{n}{r} p^r (1 - p)^{n - r}$, so

$$E(e^{tY}) = \sum_{r=0}^n \binom{n}{r} p^r (1 - p)^{n - r} e^{tr} = (p(e^t - 1) + 1)^n$$

by the Binomial theorem. Hence the sum of IID Bernoulli trials has a Binomial distribution. Now fix $\lambda > 0$ and let $\lambda = np$ with $n \to \infty$ (so $p \to 0$). (In the special case where $n \to \infty$ with $p$ close to $\frac{1}{2}$, we can instead apply the CLT.) Using the fact that $\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x$,

$$M_Y(t) = \left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n \to \exp\left(\lambda(e^t - 1)\right).$$

If $Y \sim \mathrm{Poisson}(\lambda)$, then $P(Y = r) = \frac{\lambda^r e^{-\lambda}}{r!}$ for $r \geq 0$, and

$$M_Y(t) = \sum_{r=0}^\infty e^{tr} \frac{\lambda^r e^{-\lambda}}{r!} = \sum_{r=0}^\infty \frac{(\lambda e^t)^r e^{-\lambda}}{r!} = \exp(\lambda(e^t - 1)),$$

since $e^x = \sum_{r=0}^\infty \frac{x^r}{r!}$. This agrees with the limiting case of the Binomial distribution.
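The MGF calculation says $\mathrm{Bin}(n, \lambda/n)$ should behave like $\mathrm{Poisson}(\lambda)$ for large $n$. A minimal numerical sketch of this convergence (the values of $\lambda$, $n$ and the range of $r$ are arbitrary choices):

```python
import math

lam, r_max = 2.0, 8

def binom_pmf(n, p, r):
    # P(Y = r) for Y ~ Bin(n, p)
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(lam, r):
    # P(Y = r) for Y ~ Poisson(lambda)
    return lam**r * math.exp(-lam) / math.factorial(r)

for n in (10, 100, 1000):
    p = lam / n
    gap = max(abs(binom_pmf(n, p, r) - poisson_pmf(lam, r))
              for r in range(r_max + 1))
    print(f"n = {n:4d}: max pmf difference over r <= {r_max} is {gap:.2e}")
```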

Probability generating functions (PGF)
These are useful in cases when $X$ takes integer values.

Definition
Suppose $X$ takes values in $\mathbb{N}$. The PGF for $X$ is the function $G_X(\theta)$ with $G_X(\theta) = E(\theta^X)$, that is, $G_X(\theta) = \sum_{n=0}^\infty \theta^n P(X = n)$. If $\theta = e^t$, then we recover $M_X(t)$.

Properties

• $G_X(1) = 1$.
• $\frac{dG_X}{d\theta} = \sum_{n=1}^\infty n \theta^{n-1} P(X = n)$, so $\frac{dG_X}{d\theta}\big|_{\theta=1} = E(X)$.
• $G_X''(1) = E(X(X - 1))$.
• $G_{X+Y}(\theta) = G_X(\theta) G_Y(\theta)$ if $X, Y$ are independent.
• $\mathrm{Var}(X) = G_X''(1) + G_X'(1) - [G_X'(1)]^2$.
• Given a series for $G_X(\theta)$, the coefficient of $\theta^n$ is precisely $P(X = n)$. Moreover, $G_X(0) = P(X = 0)$.

The variance formula can be compared to $M_X''(0) - M_X'(0)^2 = \mathrm{Var}(X)$.

Example
$X = X_1 + \ldots + X_n$, where the $X_i \sim \mathrm{Bernoulli}(p)$. Now $G_{X_i}(\theta) = 1 - p + p\theta$. If the $X_i$ are independent, then $G_X(\theta) = G_{X_1}(\theta) G_{X_2}(\theta) \cdots G_{X_n}(\theta) = (1 - p + p\theta)^n$. So $X \sim \mathrm{Bin}(n, p)$.

Example
Consider $G_X(\theta) = \frac{1}{2 - \theta}$. What distribution does $X$ have? Now

$$G_X(\theta) = E(\theta^X) = \sum_{n=0}^\infty \theta^n P(X = n) = \frac{1}{2 - \theta} = \frac{1}{2} \cdot \frac{1}{1 - \theta/2} = \frac{1}{2} \sum_{n=0}^\infty \left(\frac{\theta}{2}\right)^n,$$

which means that $P(X = n) = \frac{1}{2^{n+1}}$, that is, $X$ has a geometric distribution.
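Coefficient extraction of this kind can be automated with a computer algebra system. A minimal sketch using sympy (the PGF $1/(2 - \theta)$ and the truncation order are taken from the example above):

```python
import sympy as sp

theta = sp.symbols('theta')
G = 1 / (2 - theta)  # the PGF from the example

# Taylor-expand the PGF about theta = 0;
# the coefficient of theta**n is P(X = n)
poly = sp.series(G, theta, 0, 8).removeO()
for n in range(8):
    print(n, poly.coeff(theta, n))  # 1/2, 1/4, 1/8, ... i.e. P(X = n) = 1/2**(n+1)
```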

Conditional Expectation and Joint RVs
Consider $X$ and $Y$ discrete RVs taking values in $\mathcal{X} = \{x_1, x_2, \ldots\}$ and $\mathcal{Y} = \{y_1, y_2, \ldots\}$. The joint probability function is given by $f_{X,Y}(x_i, y_j) = P(X = x_i, Y = y_j)$. The marginal probability (distribution) functions are $f_X(x_i) = P(X = x_i) = \sum_j f_{X,Y}(x_i, y_j)$ and $f_Y(y_j) = P(Y = y_j) = \sum_i f_{X,Y}(x_i, y_j)$. The conditional probability of $X = x_i$ given $Y = y_j$ is

$$f_{X|Y}(x_i | y_j) = P(X = x_i | Y = y_j) = \frac{P(X = x_i, Y = y_j)}{P(Y = y_j)} = \frac{f_{X,Y}(x_i, y_j)}{f_Y(y_j)}.$$

If $X, Y$ are independent, then $f_{X,Y}(x_i, y_j) = f_X(x_i) f_Y(y_j)$ for all $i, j$. Given $g : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, $E(g(X, Y)) = \sum_{i,j} g(x_i, y_j) f_{X,Y}(x_i, y_j)$. If $X, Y$ are independent and $g(X, Y) = h_1(X) h_2(Y)$, then $E(g(X, Y)) = E(h_1(X)) E(h_2(Y))$.

The conditional expectation of $X$ given $Y$ is the quantity $E(X|Y)$. This is a function of $Y$: the average of $X$ given a value of $Y$. If $Y = y_j$, then

$$E(X | Y = y_j) = \sum_i x_i P(X = x_i | Y = y_j) = \sum_i x_i \frac{f_{X,Y}(x_i, y_j)}{f_Y(y_j)},$$

a function of $y_j$. $E(X|Y)$ is a RV which is governed by the probability distribution of $Y$, hence we can also take expectations of it.

    Tower ruleE(E(X|Y )) = E(X).

    We have a useful check, if X and Y are independent then E(X|Y ) = E(X). In general

    E(E(X|Y )) =j

    (i

    xifX,Y (xi, yj)

    fY (yj)

    )fY (yj) =

    i

    xij

    fX,Y (xi, yj) =i

    xifX(xi) = E(X).
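The tower rule is easy to verify exactly on a small joint distribution. A minimal sketch (the joint pmf values are an arbitrary illustration):

```python
# Exact check of E(E(X|Y)) = E(X) on a small joint pmf f_{X,Y}
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

# Marginal f_Y(y) = sum_x f_{X,Y}(x, y)
f_Y = {}
for (x, y), p in joint.items():
    f_Y[y] = f_Y.get(y, 0.0) + p

# E(X | Y = y) = sum_x x * f_{X,Y}(x, y) / f_Y(y)
cond_exp = {y: sum(x * p for (x, yy), p in joint.items() if yy == y) / f_Y[y]
            for y in f_Y}

lhs = sum(cond_exp[y] * f_Y[y] for y in f_Y)     # E(E(X|Y))
rhs = sum(x * p for (x, _), p in joint.items())  # E(X)
print(lhs, rhs)  # both 0.7
```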

Compound processes
Suppose $(X_i)_{i=1}^\infty$ are IID RVs with PGF $G_X(\theta)$ (since the $X_i$ are IID, $X$ denotes a RV with their common distribution). Suppose $N$ is a RV with PGF $G_N(\theta)$, independent of the $X_i$. Let $Z = X_1 + X_2 + \ldots + X_N$. $Z$ is a compound process: a random sum of random variables.

Proposition
For the compound process $Z$, the PGF is $G_Z(\theta) = G_N(G_X(\theta)) = G_N \circ G_X(\theta)$.

Proof
By definition,

$$G_Z(\theta) = E(\theta^Z) = \sum_{n=0}^\infty \theta^n P(Z = n) = E(\theta^{X_1 + X_2 + \ldots + X_N}) = E(E(\theta^{X_1 + X_2 + \ldots + X_N} \mid N)) \quad \text{[tower rule]}$$

$$= \sum_{n=0}^\infty E(\theta^{X_1 + X_2 + \ldots + X_n} \mid N = n) P(N = n) = \sum_{n=0}^\infty E(\theta^{X_1}) E(\theta^{X_2}) \cdots E(\theta^{X_n}) P(N = n) \quad \text{[independence]}$$

$$= \sum_{n=0}^\infty (G_X(\theta))^n P(N = n) = G_N(G_X(\theta)) \quad \text{[definition of PGF]}.$$

The coefficient of $\theta^n$ in $G_Z(\theta)$ gives $P(Z = n)$.

Example
Suppose we roll a die and then flip a number of coins equal to the number shown on the die. If $Z$ is the number of heads, what is $P(Z = k)$?

By the previous proposition we have $G_Z(\theta) = G_N(G_X(\theta))$, where $N \sim \mathrm{Unif}\{1, \ldots, 6\}$ (the value shown on the die) and $X \sim \mathrm{Bernoulli}(1/2)$ (the flip of a coin). Now

$$G_X(\theta) = E(\theta^X) = \sum_{n=0}^1 \theta^n P(X = n) = \frac{1}{2}(1 + \theta)$$

and

$$G_N(\theta) = E(\theta^N) = \sum_{n=1}^6 \theta^n P(N = n) = \frac{1}{6} \sum_{n=1}^6 \theta^n.$$

Hence

$$G_Z(\theta) = G_N\left(\frac{1}{2}(1 + \theta)\right) = \frac{1}{6} \sum_{n=1}^6 \frac{1}{2^n}(1 + \theta)^n.$$

It follows that $P(Z = k)$ is given by the coefficient of $\theta^k$ in the sum. By the binomial theorem, we have

$$P(Z = k) = \frac{1}{6} \sum_{n=1}^6 \binom{n}{k} \left(\frac{1}{2}\right)^n,$$

recalling that $\binom{n}{k} = 0$ for $k > n$.
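This formula can be checked by direct simulation of the compound process. A minimal sketch (the sample size and seed are arbitrary choices):

```python
import math
import random

random.seed(1)

def exact_p(k):
    # P(Z = k) = (1/6) * sum_{n=1}^{6} C(n, k) * (1/2)^n; math.comb(n, k) = 0 for k > n
    return sum(math.comb(n, k) * 0.5**n for n in range(1, 7)) / 6

trials = 200_000
counts = [0] * 7
for _ in range(trials):
    n = random.randint(1, 6)                              # roll the die
    heads = sum(random.random() < 0.5 for _ in range(n))  # flip n fair coins
    counts[heads] += 1

for k in range(7):
    print(f"k = {k}: empirical {counts[k] / trials:.4f}, exact {exact_p(k):.4f}")
```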

2 Branching Processes

Let $S_n$ be the number of individuals in a population at time $n$. Suppose $S_0 = 1$ (one individual at time 0). Individuals evolve at each timestep according to a common RV $X$, and evolve independently of one another. We assume $X$ has PGF $G_X(\theta)$. Let $X_i$, $i \geq 1$, be IID copies of $X$. We want to work out the long-term behaviour of $S_n$, $E(S_n)$ and $P(S_n = 0)$.

We use generating function analysis. Denote the PGF of $S_n$ by $G_n(\theta)$. Since $S_1 = X$, $G_1(\theta) = G_X(\theta)$. For $S_2$, $G_2(\theta) = E(\theta^{S_2}) = E(E(\theta^{S_2} \mid X)) = G_X \circ G_X(\theta)$ by the previous proposition. Similarly, $G_3(\theta) = E(E(\theta^{S_3} \mid S_2)) = G_X \circ G_X \circ G_X(\theta)$.

Proposition
$G_n(\theta) = G_X \circ G_X \circ \ldots \circ G_X(\theta)$ ($n$-fold composition). Moreover $G_n(\theta) = G_X(G_{n-1}(\theta)) = G_{n-1}(G_X(\theta))$.
Proof
This follows easily by induction.

Remark: the coefficient of $\theta^k$ in $G_n(\theta)$ gives $P(S_n = k)$.
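In particular, $P(S_n = 0) = G_n(0)$, which the proposition lets us compute by iterating $G_X$. A minimal sketch (the offspring distribution with 0, 1 or 2 children with probabilities 1/4, 1/2, 1/4 is our own illustrative assumption):

```python
def G_X(theta, p=(0.25, 0.5, 0.25)):
    # PGF of an assumed offspring distribution: 0, 1 or 2 children
    # with probabilities 1/4, 1/2, 1/4
    return sum(pk * theta**k for k, pk in enumerate(p))

# P(S_n = 0) = G_n(0), and G_n(0) = G_X(G_{n-1}(0)) by the proposition
q = 0.0
for n in range(1, 11):
    q = G_X(q)
    print(f"P(S_{n} = 0) = {q:.4f}")
```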

Expected behaviour of $S_n$
We want to study $E(S_n) = \frac{dG_n}{d\theta}\big|_{\theta=1} = G_n'(1)$. Let $\mu = E(X) = G_X'(1)$.