Transcript of adakdh1/ada/2017/ada04c.pdf


Page 1

Review: Functions of Random Variables

$Y = y(X), \qquad \frac{dY}{dX} = y'(X)$

Conserve probability: $d(\mathrm{Prob}) = f(Y)\,dY = f(X)\,dX$, so

$f(Y) = f(X) \left| \frac{dX}{dY} \right| = \frac{f(X)}{|y'(X)|}$

Mean value (biased):

$\langle Y \rangle = y(\langle X \rangle) + \frac{1}{2}\, y''(\langle X \rangle)\, \sigma_X^2 + \ldots$

Standard deviation (stretched):

$\sigma_Y = \sigma_X \left| \frac{dy}{dX} \right|_{X = \langle X \rangle} + \ldots$

Example: $X = 100 \pm 10$, $Y = \sqrt{X}$. $\langle Y \rangle = {?}$, $\sigma_Y = {?}$

[Figure: a distribution f(X) mapped through the curve Y = y(X) into a distribution f(Y).]
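The two propagation formulas are easy to check by Monte Carlo. A minimal sketch for the slide's example $Y = \sqrt{X}$ with $X = 100 \pm 10$; the Gaussian model for X, the seed, and the sample count are illustrative choices, not from the slides:

```python
import math
import random
import statistics

random.seed(7)

# X = 100 +/- 10, modelled as Gaussian; push samples through Y = sqrt(X).
xs = [random.gauss(100.0, 10.0) for _ in range(200_000)]
ys = [math.sqrt(x) for x in xs]   # x < 0 would be a ~10-sigma event: none occur

mean_y = statistics.fmean(ys)
std_y = statistics.stdev(ys)

# Predictions from the slide:
#   <Y>     = sqrt(100) + (1/2) y''(100) sigma_X^2 = 10 - 0.0125 = 9.9875
#   sigma_Y = sigma_X |dy/dX| = 10 / (2 sqrt(100)) = 0.5
print(f"<Y>     = {mean_y:.4f}   (formula: 9.9875)")
print(f"sigma_Y = {std_y:.4f}   (formula: 0.5)")
```

The small negative bias correction to $\langle Y \rangle$ comes from the curvature $y''(X) < 0$ of the square root.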

The Central Limit Theorem
•  (closely related to the Law of Large Numbers)
•  Sum up a large number N of independent random variables Xi.
•  The result resembles a Gaussian:

$\sum_{i=1}^{N} X_i \to G(\mu, \sigma^2) \quad \text{as } N \to \infty$

•  The means and variances accumulate:

$\mu = \sum_{i=1}^{N} \langle X_i \rangle, \qquad \sigma^2 = \sum_{i=1}^{N} \sigma^2(X_i)$

•  But higher moments are forgotten.
•  The original distributions f(Xi) don’t matter -- all shape information is lost.
•  This is why Gaussians are special.
•  This is why measurements often give Gaussian error distributions.
•  (Fast computers let us do more exact Monte Carlo analysis.)

Example: Coin Toss

$C = +1$ if heads, $-1$ if tails; $\langle C \rangle = 0$, $\sigma_C^2 = 1$.

$S_N \equiv \sum_{i=1}^{N} C_i$

$\langle S_N \rangle = \sum_{i=1}^{N} \langle C_i \rangle = 0, \qquad \sigma^2(S_N) = \sum_{i=1}^{N} \sigma^2(C_i) = N\, \sigma_C^2 = N$

Outcomes for small N:
  S1 = +1 (H), −1 (T)
  S2 = +2 (HH), 0 (HT, TH), −2 (TT)
  S3 = +3 (HHH), +1 (HHT, HTH, THH), −1 (HTT, THT, TTH), −3 (TTT)

Coin Toss => Gaussian. Uniform => Gaussian. Biased Coin => Gaussian.
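The accumulation of means and variances can be checked by simulation. A minimal sketch for the coin-toss sum; the seed, N = 100, and 20,000 trials are illustrative choices, not from the slides:

```python
import random
import statistics

random.seed(42)

def coin_sum(n_flips):
    """One experiment: sum of n_flips fair coins scored +1 (H) / -1 (T)."""
    return sum(random.choice((+1, -1)) for _ in range(n_flips))

N = 100           # coins per sum; CLT predicts S_N ~ G(0, N)
trials = 20_000   # independent repetitions of the experiment

samples = [coin_sum(N) for _ in range(trials)]
mean = statistics.mean(samples)
var = statistics.variance(samples)

print(f"mean(S_N) = {mean:+.2f}   (CLT prediction: 0)")
print(f"var(S_N)  = {var:.1f}   (CLT prediction: N = {N})")
```

Replacing the coins with any other independent variables (uniforms, biased coins) changes only μ and σ², not the emerging Gaussian shape.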


Page 2

Poisson => Gaussian
•  Poisson distribution P(λ):
   –  ⟨X⟩ = λ, Var(X) = λ, x = 0, 1, 2, …
•  Add N independent xi values: $\sum x_i \sim P(N\lambda)$
•  The CLT ensures that for large λ, Poisson -> Gaussian:
   –  P(λ) => G(µ, σ²)
   –  with µ = λ, σ² = λ

[Figure: four panels comparing Poisson (points) and Gaussian (curve) probabilities versus number of photons detected, for λ = 1, 4, 16, and 100. The Gaussian matches the Poisson distribution ever more closely as λ increases.]
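The convergence shown in the figure can be quantified directly. A sketch comparing the Poisson pmf with the matching Gaussian density at integer points; the mismatch measure (largest discrepancy scaled by the peak height) is an illustrative choice:

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Largest |Poisson - Gaussian| over k, scaled by the peak height.
mismatch = {}
for lam in (1, 4, 16, 100):
    sigma = math.sqrt(lam)
    ks = range(int(lam + 6 * sigma) + 1)
    err = max(abs(poisson_pmf(k, lam) - gaussian_pdf(k, lam, sigma)) for k in ks)
    mismatch[lam] = err / gaussian_pdf(lam, lam, sigma)
    print(f"lambda = {lam:3d}: relative mismatch = {mismatch[lam]:.3f}")
```

The mismatch shrinks steadily with λ, as the CLT predicts for a sum of many unit-rate Poisson variables.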

Definition: What is a Statistic?
•  Anything you measure or compute from the data: any function of the data.
•  Because the data “jiggle”, every statistic also “jiggles”.
•  Example: the average of N data points is a statistic:

$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$

•  It has a definite value for a particular dataset.
•  It has a probability distribution describing how it “jiggles” with the ensemble of repeated datasets.
•  Note that $\bar{X} \neq X$. Why? If $\langle X_i \rangle = X$, then $\langle \bar{X} \rangle = X$.

Sample Mean: Average of N data points

The Sample Mean is a statistic:

$\bar{X} \equiv \frac{1}{N} \sum_{i=1}^{N} X_i$

It has a probability distribution, with a mean value:

$\langle \bar{X} \rangle = \left\langle \frac{1}{N} \sum_i X_i \right\rangle = \frac{1}{N} \sum_i \langle X_i \rangle$

and a variance:

$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left( \frac{1}{N} \sum_i X_i \right) = \frac{1}{N^2}\, \mathrm{Var}\left( \sum_i X_i \right) = \frac{1}{N^2} \sum_i \mathrm{Var}(X_i)$

assuming $\mathrm{Cov}[X_a, X_b] = \mathrm{Var}[X_a]\, \delta_{ab}$.
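The $1/N^2$ scaling is easy to confirm by simulation. A minimal sketch using Uniform(0,1) data; the choice of distribution, the seed, and the trial counts are illustrative, not from the slides:

```python
import random
import statistics

random.seed(1)

def var_of_mean(n, trials=20_000):
    """Empirical variance of the mean of n Uniform(0,1) draws."""
    means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(trials)]
    return statistics.variance(means)

SIGMA2 = 1.0 / 12.0   # variance of a single Uniform(0,1) value

results = {n: var_of_mean(n) for n in (1, 4, 16)}
for n, v in results.items():
    print(f"N = {n:2d}: Var(mean) = {v:.5f}   sigma^2/N = {SIGMA2 / n:.5f}")
```

Each factor of 4 in N cuts the variance of the sample mean by 4, matching $\mathrm{Var}(\bar{X}) = \sigma^2/N$.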

Sample Mean: Unbiased and lower Variance

If the Xi have the same mean, ⟨Xi⟩ = ⟨X⟩, then:

$\langle \bar{X} \rangle = \frac{1}{N} \sum_{i=1}^{N} \langle X_i \rangle = \frac{N \langle X \rangle}{N} = \langle X \rangle$

∴ $\bar{X}$ is an unbiased estimator of ⟨X⟩.

If the Xi all have the same variance, Var[Xi] = σ², and are uncorrelated, Cov[Xi, Xj] = σ² δij, then:

$\mathrm{Var}(\bar{X}) = \frac{1}{N^2} \sum_i \mathrm{Var}(X_i) = \frac{N \sigma^2}{N^2} = \frac{\sigma^2}{N}$

∴ $\sigma(\bar{X}) = \sigma / \sqrt{N}$, i.e. $\bar{X}$ "jiggles" much less than a single data value Xi does.

Many other Unbiased Statistics
•  Sample median (half the points above, half below)
•  (Xmax + Xmin) / 2
•  Any single point Xi chosen at random from the sequence
•  Weighted average:

$\frac{\sum_i w_i X_i}{\sum_i w_i}$

   (the Sample Mean $\bar{X}$ uses weights wi = 1)
•  Which unbiased statistic is best? (best = minimum variance)

Inverse-variance weights are best!
•  Variance of the weighted mean (assume $\mathrm{Cov}[X_i, X_j] = \sigma_i^2 \delta_{ij}$):

$\mathrm{Var}\left( \frac{\sum_i w_i X_i}{\sum_i w_i} \right) = \frac{\mathrm{Var}\left( \sum_i w_i X_i \right)}{\left( \sum_i w_i \right)^2} = \frac{\sum_i w_i^2\, \mathrm{Var}[X_i]}{\left( \sum_i w_i \right)^2} = \frac{\sum_i w_i^2 \sigma_i^2}{\left( \sum_i w_i \right)^2}$

•  What are the optimal weights?
•  The variance of the weighted average is minimised when:

$w_i = \frac{1}{\mathrm{Var}(X_i)} \equiv \frac{1}{\sigma_i^2}.$

•  Let’s verify this -- it’s important!


Page 3

Optimising the weights
•  To minimise the variance of the weighted average, set:

$0 = \frac{\partial}{\partial w_k} \frac{\sum_i w_i^2 \sigma_i^2}{\left( \sum_i w_i \right)^2} = \frac{2 w_k \sigma_k^2}{\left( \sum_i w_i \right)^2} - \frac{2 \sum_i w_i^2 \sigma_i^2}{\left( \sum_i w_i \right)^3} \frac{\partial \sum_i w_i}{\partial w_k} = \frac{2}{\left( \sum_i w_i \right)^2} \left( w_k \sigma_k^2 - \frac{\sum_i w_i^2 \sigma_i^2}{\sum_i w_i} \right)$

•  Thus $w_k \sigma_k^2$ must take the same value for every k, i.e. $w_k \propto 1/\sigma_k^2$; choosing the constant of proportionality to be 1 gives:

$w_k = \frac{1}{\sigma_k^2}.$

(Note: $\sum_i w_i^2 \sigma_i^2 = \sum_i w_i$ for $w_i = 1/\sigma_i^2$.)
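A quick numerical check that no nearby choice of weights does better. The error bars σ = 1, 2, 4 and the ±0.05 perturbation grid are hypothetical, illustrative choices:

```python
import itertools

def weighted_mean_variance(weights, sigmas):
    """Var[(sum w_i X_i)/(sum w_i)] for independent X_i with Var[X_i] = sigmas[i]^2."""
    num = sum(w * w * s * s for w, s in zip(weights, sigmas))
    return num / sum(weights) ** 2

sigmas = [1.0, 2.0, 4.0]                  # hypothetical error bars
best = [1.0 / s**2 for s in sigmas]       # inverse-variance weights
v_best = weighted_mean_variance(best, sigmas)

# Nudge every weight up/down in all combinations: none beats 1/sigma^2.
v_perturbed = min(
    weighted_mean_variance([w + d for w, d in zip(best, delta)], sigmas)
    for delta in itertools.product((-0.05, 0.0, 0.05), repeat=3)
    if any(delta)
)

print(f"Var at w_i = 1/sigma_i^2 : {v_best:.4f}")   # equals 1 / sum(1/sigma_i^2)
print(f"best Var among perturbed : {v_perturbed:.4f}")
```

Only rescaling all the weights by the same factor leaves the variance unchanged, which is why the normalisation of $w_i = 1/\sigma_i^2$ is free.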

The Optimal Average
•  Good principles for constructing statistics:
   –  Unbiased -> no systematic error
   –  Minimum variance -> smallest possible statistical error
•  N data points: $X_i = \langle X \rangle \pm \sigma_i$, with $\langle X_i \rangle = \langle X \rangle$ and $\mathrm{Cov}[X_i, X_j] = \sigma_i^2 \delta_{ij}$.
•  Optimal (inverse-variance weighted) average:

$\hat{X} \equiv \frac{\sum_i X_i / \sigma_i^2}{\sum_i 1 / \sigma_i^2}$

•  Is unbiased, since:

$\langle \hat{X} \rangle = \langle X \rangle$

•  And minimum variance:

$\sigma^2(\hat{X}) = \frac{1}{\sum_i 1 / \sigma_i^2}$

Memorise!

Compare: Equal vs Optimal Weights
•  Both are unbiased: $\langle \hat{X} \rangle = \langle \bar{X} \rangle = \langle X_i \rangle = \langle X \rangle$
•  Bad data spoils the Sample Mean (information lost).
•  The Optimal Average ALWAYS improves with more data.
•  Consider N = 2:

$\bar{X} = \frac{X_1 + X_2}{2}, \qquad \hat{X} = \frac{X_1/\sigma_1^2 + X_2/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2}$

$\mathrm{Var}[\bar{X}] = \frac{\sigma_1^2 + \sigma_2^2}{4}, \qquad \mathrm{Var}[\hat{X}] = \frac{1}{1/\sigma_1^2 + 1/\sigma_2^2}$

Averaging with Equal Error Bars

Two data points with equal error bars, 0 ± 1 and 1 ± 1:

$\bar{X} = \frac{0 + 1}{2} = \frac{1}{2}, \qquad \sigma^2(\bar{X}) = \frac{1^2 + 1^2}{4} = \frac{1}{2}.$

$\hat{X} = \frac{0/1^2 + 1/1^2}{1/1^2 + 1/1^2} = \frac{1}{2}, \qquad \sigma^2(\hat{X}) = \frac{1}{1/1^2 + 1/1^2} = \frac{1}{2}.$

In this case $\hat{X} = \bar{X}$ since the σi are all the same: either way, 0 ± 1 and 1 ± 1 combine to 0.5 ± 0.71.

Averaging with Unequal Error Bars

Two data points with unequal error bars, 0 ± 1 and 1 ± 4:

$\bar{X} = \frac{0 + 1}{2} = \frac{1}{2}, \qquad \sigma^2(\bar{X}) = \frac{1^2 + 4^2}{4} = \frac{17}{4},$

i.e. the Sample Mean gives 0.5 ± 2.06. Information is lost, since $\sigma(\bar{X}) > \sigma(X_1)$.

$\hat{X} = \frac{0/1^2 + 1/4^2}{1/1^2 + 1/4^2} = \frac{1}{17}, \qquad \sigma^2(\hat{X}) = \frac{1}{1/1^2 + 1/4^2} = \frac{1}{17/16} = \frac{16}{17},$

i.e. the Optimal Average gives 0.06 ± 0.97. Now $\sigma(\hat{X}) < \sigma(X_1)$.

Optimal weights retain all the information. The Optimal Average always improves with new data.
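Both worked examples can be reproduced in a few lines. A sketch; the helper name `combine` is illustrative, not from the slides:

```python
def combine(data):
    """Equal-weight and inverse-variance averages of (value, sigma) pairs."""
    n = len(data)
    xbar = sum(x for x, _ in data) / n
    sig_xbar = (sum(s * s for _, s in data) / n**2) ** 0.5
    wsum = sum(1.0 / s**2 for _, s in data)
    xhat = sum(x / s**2 for x, s in data) / wsum
    sig_xhat = (1.0 / wsum) ** 0.5
    return xbar, sig_xbar, xhat, sig_xhat

for pair in ([(0, 1), (1, 1)], [(0, 1), (1, 4)]):
    xbar, sb, xhat, sh = combine(pair)
    print(f"data {pair}: Xbar = {xbar:.2f} +/- {sb:.2f}, Xhat = {xhat:.2f} +/- {sh:.2f}")

# data [(0, 1), (1, 1)]: Xbar = 0.50 +/- 0.71, Xhat = 0.50 +/- 0.71
# data [(0, 1), (1, 4)]: Xbar = 0.50 +/- 2.06, Xhat = 0.06 +/- 0.97
```

With equal error bars the two averages coincide; with the 1 ± 4 point added, only the inverse-variance average beats the error bar of the best single point.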

Compare: Equal vs Optimal Weights

Equal weights:

$\bar{X} \equiv \frac{1}{N} \sum_i X_i, \qquad \sigma^2(\bar{X}) = \frac{1}{N^2} \sum_i \sigma_i^2$

Poor data degrades the result. Better to ignore “bad” data: information is lost.

Optimal weights:

$\hat{X} \equiv \frac{\sum_i X_i / \sigma_i^2}{\sum_i 1 / \sigma_i^2}, \qquad \sigma^2(\hat{X}) = \frac{1}{\sum_i 1 / \sigma_i^2}$

New data always improves the result. Use ALL the data, but with appropriate 1/Variance weights. Must have good error bars.