
Proof of the Central Limit Theorem

Suppose $X_1, \dots, X_n$ are i.i.d. random variables with mean 0, variance $\sigma_x^2$, and Moment Generating Function (MGF) $M_x(t)$. Note that this assumes an MGF exists, which is not true of all random variables.

Let $S_n = \sum_{i=1}^n X_i$ and $Z_n = S_n/\sqrt{n\sigma_x^2}$. Then
\[
M_{S_n}(t) = (M_x(t))^n
\quad\text{and}\quad
M_{Z_n}(t) = \left( M_x\!\left( \frac{t}{\sigma_x \sqrt{n}} \right) \right)^n .
\]
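The first identity is the usual factorization of the MGF of a sum of independent random variables, and the second is that factorization evaluated at the rescaled argument:
\[
M_{S_n}(t) = E\!\left[e^{t S_n}\right] = E\!\left[\prod_{i=1}^n e^{t X_i}\right] = \prod_{i=1}^n E\!\left[e^{t X_i}\right] = (M_x(t))^n,
\qquad
M_{Z_n}(t) = M_{S_n}\!\left(\frac{t}{\sigma_x \sqrt{n}}\right).
\]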

Using Taylor's theorem, we can write $M_x(s)$ as
\[
M_x(s) = M_x(0) + s M_x'(0) + \tfrac{1}{2} s^2 M_x''(0) + e_s,
\]
where $e_s/s^2 \to 0$ as $s \to 0$.

$M_x(0) = 1$, by definition, and with $E(X_i) = 0$ and $\mathrm{Var}(X_i) = \sigma_x^2$, we know $M_x'(0) = 0$ and $M_x''(0) = \sigma_x^2$. So
\[
M_x(s) = 1 + \frac{\sigma_x^2}{2} s^2 + e_s.
\]
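The derivative values used above follow from differentiating the MGF under the expectation:
\[
M_x'(0) = E\!\left[X_i e^{t X_i}\right]\Big|_{t=0} = E(X_i) = 0,
\qquad
M_x''(0) = E\!\left[X_i^2 e^{t X_i}\right]\Big|_{t=0} = E(X_i^2) = \sigma_x^2,
\]
since $\mathrm{Var}(X_i) = E(X_i^2) - (E(X_i))^2 = E(X_i^2)$ when $E(X_i) = 0$.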

Letting $s = t/(\sigma_x \sqrt{n})$, we have $s \to 0$ as $n \to \infty$, and
\[
M_{Z_n}(t) = \left( 1 + \frac{\sigma_x^2}{2} \left( \frac{t}{\sigma_x \sqrt{n}} \right)^2 + e_n \right)^n
= \left( 1 + \frac{t^2}{2n} + e_n \right)^n ,
\]
where $n \sigma_x^2 e_n / t^2 \to 0$ as $n \to \infty$.

If $a_n \to a$ as $n \to \infty$, it can be shown that
\[
\lim_{n \to \infty} \left( 1 + \frac{a_n}{n} \right)^n = e^a .
\]
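This limit can also be checked numerically; a minimal sketch in Python (the target value $a = 2$ and the sequence $a_n = a + 1/n$ are arbitrary illustrative choices):

import math

a = 2.0                          # target limit of the sequence a_n (arbitrary choice)
for n in [10, 100, 1_000, 10_000, 100_000]:
    a_n = a + 1.0 / n            # any sequence with a_n -> a behaves the same way
    print(n, (1.0 + a_n / n) ** n, math.exp(a))
# The middle column climbs toward e^a = 7.389... as n grows.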

Since $n \sigma_x^2 e_n / t^2 \to 0$ implies $n e_n \to 0$, we have $t^2/2 + n e_n \to t^2/2$, and it follows that
\[
\lim_{n \to \infty} M_{Z_n}(t) = \lim_{n \to \infty} \left( 1 + \frac{t^2/2 + n e_n}{n} \right)^n = e^{t^2/2} ,
\]
which is the MGF of a standard Normal. If the MGF exists, then it uniquely defines the distribution. Convergence in MGF implies that $Z_n$ converges in distribution to $N(0, 1)$.

The practical application of this theorem is that, for large $n$, if $Y_1, \dots, Y_n$ are independent with mean $\mu_y$ and variance $\sigma_y^2$, then
\[
\sum_{i=1}^n \left( \frac{Y_i - \mu_y}{\sigma_y \sqrt{n}} \right) \,\dot\sim\, N(0, 1),
\quad\text{or}\quad
\bar{Y} \,\dot\sim\, N(\mu_y, \sigma_y^2/n).
\]

How large "large" needs to be depends on the distribution of the $Y_i$'s. If they are Normal, then $n = 1$ is large enough. As the distribution becomes less Normal, larger values of $n$ are needed.
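A small simulation illustrates both the approximation and this dependence on $n$; a minimal sketch in Python (the Exponential(1) population and the sample sizes are arbitrary choices for illustration):

import numpy as np

rng = np.random.default_rng(0)
mu_y, sigma_y = 1.0, 1.0          # mean and sd of the Exponential(1) population

for n in [1, 5, 30, 200]:
    # 100,000 replications of the standardized sum for each sample size
    y = rng.exponential(scale=1.0, size=(100_000, n))
    z = (y.sum(axis=1) - n * mu_y) / (sigma_y * np.sqrt(n))
    skew = float(((z - z.mean()) ** 3).mean() / z.std() ** 3)
    print(f"n = {n:4d}: skewness of the standardized sum = {skew:+.3f}")

# A standard Normal has skewness 0; for this skewed population the skewness
# starts near 2 at n = 1 and shrinks as n grows, showing why a less-Normal
# population needs a larger n before the Normal approximation is adequate.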