adakdh1/ada/2017/ada04c.pdf (lecture slide transcript)
Review: Functions of Random Variables
$$Y = y(X), \qquad \frac{dY}{dX} = y'(X)$$

Conserve probability: $d(\mathrm{Prob}) = f(Y)\,dY = f(X)\,dX$, so

$$f(Y) = f(X)\,\frac{dX}{dY} = \frac{f(X)}{y'(X)}$$

Mean value (biased):
$$\langle Y\rangle = y(\langle X\rangle) + \tfrac{1}{2}\,y''(\langle X\rangle)\,\sigma_X^2 + \dots$$

Standard deviation (stretched):
$$\sigma_Y = \sigma_X \left.\frac{dy}{dX}\right|_{X=\langle X\rangle} + \dots$$

Example: $X = 100 \pm 10$, $Y = \sqrt{X}$. What are $\langle Y\rangle$ and $\sigma_Y$?

[Figure: $f(X)$ mapped through the curve $Y = y(X)$ to give $f(Y)$; $\langle X\rangle$ marked on the $X$ axis.]
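Working through the slide's example (this calculation is mine, using the expansion above): with $y(X) = \sqrt{X}$,

$$y'(X) = \tfrac{1}{2}X^{-1/2}, \qquad y''(X) = -\tfrac{1}{4}X^{-3/2},$$

so at $\langle X\rangle = 100$, $\sigma_X = 10$:

$$\langle Y\rangle \approx 10 + \tfrac{1}{2}\left(-\tfrac{1}{4}\cdot 100^{-3/2}\right)(10)^2 = 10 - 0.0125 = 9.9875,$$
$$\sigma_Y \approx \sigma_X\,y'(100) = 10 \times \tfrac{1}{20} = 0.5.$$

So $Y \approx 9.99 \pm 0.5$, biased slightly below $\sqrt{\langle X\rangle} = 10$.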
The Central Limit Theorem
• (related to, but distinct from, the Law of Large Numbers)
• Sum up a large number N of independent random variables X_i.
• The result resembles a Gaussian.
• The means and variances accumulate.
• But higher moments are forgotten.
• The original distributions f(X_i) don't matter -- all shape information is lost.
• This is why Gaussians are special.
• This is why measurements often give Gaussian error distributions.
• (Fast computers let us do more exact Monte Carlo analysis.)
$$\sum_{i=1}^{N} X_i \;\to\; G(\mu,\sigma^2) \quad \text{as } N\to\infty,$$
with
$$\mu = \sum_{i=1}^{N}\langle X_i\rangle, \qquad \sigma^2 = \sum_{i=1}^{N}\sigma^2(X_i).$$
Example: Coin Toss
$$C = \begin{cases} +1 & \text{if heads} \\ -1 & \text{if tails} \end{cases} \qquad \langle C\rangle = 0, \quad \sigma_C^2 = 1$$

$$S_N \equiv \sum_{i=1}^{N} C_i$$

$$\langle S_N\rangle = \sum_{i=1}^{N}\langle C_i\rangle = 0, \qquad \sigma^2(S_N) = \sum_{i=1}^{N}\sigma^2(C_i) = N\,\sigma_C^2 = N$$

Outcomes for small N:
N = 1:  +1 (H); −1 (T)
N = 2:  +2 (HH); 0 (HT, TH); −2 (TT)
N = 3:  +3 (HHH); +1 (HHT, HTH, THH); −1 (HTT, THT, TTH); −3 (TTT)
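A quick Monte Carlo check of the coin-toss sums (a sketch of mine, not from the slides): summing N = 100 tosses many times, the sample mean of S_N should sit near 0 and its variance near N.

```python
import random

def coin_sum(n, rng):
    """Sum of n independent +/-1 coin tosses."""
    return sum(rng.choice((+1, -1)) for _ in range(n))

def simulate(n=100, trials=20000, seed=1):
    """Estimate the mean and variance of S_N over many repeated datasets."""
    rng = random.Random(seed)
    sums = [coin_sum(n, rng) for _ in range(trials)]
    mean = sum(sums) / trials
    var = sum((s - mean) ** 2 for s in sums) / trials
    return mean, var

# <S_N> should be near 0 and Var(S_N) near N = 100.
mean, var = simulate()
```

A histogram of `sums` would reproduce the bell shape sketched on the slide.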
[Figure panels: Coin Toss ⇒ Gaussian; Uniform ⇒ Gaussian; Biased Coin ⇒ Gaussian.]
Poisson => Gaussian
• Poisson distribution P(λ):
  – ⟨X⟩ = λ, Var(X) = λ, x = 0, 1, 2, …
• Add N independent x_i values: Σ x_i ∼ P(Nλ)
• The CLT ensures that for large λ, Poisson → Gaussian:
  – P(λ) ⇒ G(μ, σ²) with μ = λ, σ² = λ
[Figure: four panels comparing the Poisson pmf with the matching Gaussian for λ = 1, 4, 16, 100. x-axis: No. of photons detected; y-axis: Probability. The agreement improves as λ grows.]
Definition: What is a Statistic?
• Anything you measure or compute from the data.
• Any function of the data.
• Because the data "jiggle", every statistic also "jiggles".
• Example: the average of N data points is a statistic:
$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$$
• It has a definite value for a particular dataset.
• It has a probability distribution describing how it "jiggles" with the ensemble of repeated datasets.
• Note that $\bar{X} \neq \langle X\rangle$. Why?
• If $\langle X_i\rangle = \langle X\rangle$, then $\langle\bar{X}\rangle = \langle X\rangle$.
Sample Mean: Average of N data points
The Sample Mean is a statistic:
$$\bar{X} \equiv \frac{1}{N}\sum_{i=1}^{N} X_i$$
It has a probability distribution, with a mean value:
$$\langle\bar{X}\rangle = \left\langle\frac{1}{N}\sum_i X_i\right\rangle = \frac{1}{N}\left\langle\sum_i X_i\right\rangle = \frac{1}{N}\sum_i\langle X_i\rangle$$
and a variance:
$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\!\left(\frac{1}{N}\sum_i X_i\right) = \frac{1}{N^2}\,\mathrm{Var}\!\left(\sum_i X_i\right) = \frac{1}{N^2}\sum_i\mathrm{Var}(X_i)$$
assuming $\mathrm{Cov}[X_a, X_b] = \mathrm{Var}[X_a]\,\delta_{ab}$ (i.e. uncorrelated data).
Sample Mean: Unbiased and Lower Variance
If the $X_i$ all have the same mean, $\langle X_i\rangle = \langle X\rangle$, then:
$$\langle\bar{X}\rangle = \frac{1}{N}\sum_{i=1}^{N}\langle X_i\rangle = \frac{N\langle X\rangle}{N} = \langle X\rangle$$
$\therefore$ $\bar{X}$ is an unbiased estimator of $\langle X\rangle$.
If the $X_i$ all have the same variance, $\mathrm{Var}[X_i] = \sigma^2$, and are uncorrelated, $\mathrm{Cov}[X_i, X_j] = \sigma^2\delta_{ij}$, then:
$$\mathrm{Var}(\bar{X}) = \frac{1}{N^2}\sum_i\mathrm{Var}(X_i) = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N}$$
$$\therefore\ \sigma(\bar{X}) = \frac{\sigma}{\sqrt{N}},$$
i.e. $\bar{X}$ "jiggles" much less than a single data value $X_i$ does.
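The σ/√N scaling is easy to verify by simulation (a sketch of mine; Gaussian data is an arbitrary choice, any distribution with variance σ² would behave the same):

```python
import random
import statistics

def spread_of_means(n, sigma=2.0, datasets=20000, seed=7):
    """Draw many datasets of size n, average each, and return the
    standard deviation of the resulting sample means."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
             for _ in range(datasets)]
    return statistics.pstdev(means)

# For sigma = 2 and n = 25, expect roughly sigma / sqrt(n) = 0.4.
spread = spread_of_means(n=25)
```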
Many Other Unbiased Statistics
• Sample median (half the points above, half below)
• $(X_\mathrm{max} + X_\mathrm{min})/2$
• Any single point $X_i$ chosen at random from the sequence
• Weighted average:
$$\frac{\sum_i w_i X_i}{\sum_i w_i}$$
  (the sample mean $\bar{X}$ uses equal weights $w_i = 1$)
• Which unbiased statistic is best? (best = minimum variance)
Inverse-Variance Weights are Best!
• Variance of the weighted mean (assume $\mathrm{Cov}[X_i, X_j] = \sigma_i^2\,\delta_{ij}$):
$$\mathrm{Var}\!\left(\frac{\sum_i w_i X_i}{\sum_i w_i}\right) = \frac{\mathrm{Var}\!\left(\sum_i w_i X_i\right)}{\left(\sum_i w_i\right)^2} = \frac{\sum_i w_i^2\,\mathrm{Var}[X_i]}{\left(\sum_i w_i\right)^2} = \frac{\sum_i w_i^2\sigma_i^2}{\left(\sum_i w_i\right)^2}$$
• What are the optimal weights? The variance of the weighted average is minimised when:
$$w_i = \frac{1}{\mathrm{Var}(X_i)} \equiv \frac{1}{\sigma_i^2}.$$
• Let's verify this -- it's important!
Optimising the Weights
To minimise the variance of the weighted average, set:
$$0 = \frac{\partial}{\partial w_k}\left[\frac{\sum_i w_i^2\sigma_i^2}{\left(\sum_i w_i\right)^2}\right] = \frac{2\,w_k\sigma_k^2}{\left(\sum_i w_i\right)^2} - \frac{2\sum_i w_i^2\sigma_i^2}{\left(\sum_i w_i\right)^3}\,\frac{\partial\sum_i w_i}{\partial w_k} = \frac{2}{\left(\sum_i w_i\right)^2}\left[w_k\sigma_k^2 - \frac{\sum_i w_i^2\sigma_i^2}{\sum_i w_i}\right]$$
The bracket vanishes when $w_k\sigma_k^2$ is the same constant for every $k$, i.e. $w_k \propto 1/\sigma_k^2$; since the weighted average is unchanged by rescaling all the weights, we can take
$$w_k = \frac{1}{\sigma_k^2}.$$
(Note: $\sum_i w_i^2\sigma_i^2 = \sum_i w_i$ for $w_i = 1/\sigma_i^2$, so the bracket is indeed zero.)
The Optimal Average
• Good principles for constructing statistics:
  – Unbiased → no systematic error
  – Minimum variance → smallest possible statistical error
• N datapoints: $X_i = \langle X\rangle \pm \sigma_i$, with $\langle X_i\rangle = \langle X\rangle$ and $\mathrm{Cov}[X_i, X_j] = \sigma_i^2\,\delta_{ij}$.
• Optimal (inverse-variance weighted) average:
$$\hat{X} \equiv \frac{\sum_i X_i/\sigma_i^2}{\sum_i 1/\sigma_i^2}$$
• It is unbiased, since $\langle\hat{X}\rangle = \langle X\rangle$.
• And it has minimum variance:
$$\sigma^2(\hat{X}) = \frac{1}{\sum_i 1/\sigma_i^2}$$
Memorise!
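The optimal average is a one-liner in practice. A minimal sketch (the function name is mine) implementing the two formulas above:

```python
import math

def optimal_average(xs, sigmas):
    """Inverse-variance weighted mean of measurements xs with error bars
    sigmas. Returns (x_hat, sigma_hat)."""
    weights = [1.0 / s ** 2 for s in sigmas]
    x_hat = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    sigma_hat = math.sqrt(1.0 / sum(weights))
    return x_hat, sigma_hat
```

With equal error bars it reduces to the ordinary sample mean, as the equal-weights slide below illustrates.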
Compare: Equal vs Optimal Weights
• Both are unbiased: $\langle\hat{X}\rangle = \langle\bar{X}\rangle = \langle X_i\rangle = \langle X\rangle$.
• Bad data spoils the Sample Mean (information lost).
• The Optimal Average ALWAYS improves with more data.
• Consider N = 2:
$$\bar{X} = \frac{X_1 + X_2}{2}, \qquad \hat{X} = \frac{X_1/\sigma_1^2 + X_2/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2}$$
$$\mathrm{Var}[\bar{X}] = \frac{\sigma_1^2 + \sigma_2^2}{4}, \qquad \mathrm{Var}[\hat{X}] = \frac{1}{1/\sigma_1^2 + 1/\sigma_2^2}$$
Averaging with Equal Error Bars
Two data points with equal error bars: $0 \pm 1$ and $1 \pm 1$.
$$\bar{X} = \frac{0+1}{2} = \frac{1}{2}, \qquad \sigma^2(\bar{X}) = \frac{1^2 + 1^2}{4} = \frac{1}{2}.$$
$$\hat{X} = \frac{0/1^2 + 1/1^2}{1/1^2 + 1/1^2} = \frac{1}{2}, \qquad \sigma^2(\hat{X}) = \frac{1}{1/1^2 + 1/1^2} = \frac{1}{2}.$$
In this case $\hat{X} = \bar{X}$ since the $\sigma_i$ are all the same: both give $0.5 \pm 0.71$.
Averaging with Unequal Error Bars
Two data points with unequal error bars: $0 \pm 1$ and $1 \pm 4$.
$$\bar{X} = \frac{0+1}{2} = \frac{1}{2}, \qquad \sigma^2(\bar{X}) = \frac{1^2 + 4^2}{4} = \frac{17}{4}.$$
Equal weights give $0.5 \pm 2.06$: information is lost, since $\sigma(\bar{X}) > \sigma(X_1)$.
$$\hat{X} = \frac{0/1^2 + 1/4^2}{1/1^2 + 1/4^2} = \frac{1}{17}, \qquad \sigma^2(\hat{X}) = \frac{1}{1/1^2 + 1/4^2} = \frac{1}{17/16} = \frac{16}{17}.$$
Optimal weights give $0.06 \pm 0.97$, and now $\sigma(\hat{X}) < \sigma(X_1)$.
Optimal weights retain all the information; the Optimal Average always improves with new data.
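The two worked examples can be checked in a few lines (a sketch of mine; the helper name is not from the slides):

```python
import math

def averages(xs, sigmas):
    """Return ((xbar, sigma_bar), (xhat, sigma_hat)): the equal-weight
    sample mean and the inverse-variance weighted optimal average."""
    n = len(xs)
    xbar = sum(xs) / n
    sbar = math.sqrt(sum(s ** 2 for s in sigmas)) / n
    w = [1.0 / s ** 2 for s in sigmas]
    xhat = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
    shat = math.sqrt(1.0 / sum(w))
    return (xbar, sbar), (xhat, shat)

equal = averages([0.0, 1.0], [1.0, 1.0])    # both ~ 0.5 +/- 0.71
unequal = averages([0.0, 1.0], [1.0, 4.0])  # ~ 0.5 +/- 2.06 vs ~ 0.06 +/- 0.97
```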
Compare: Equal vs Optimal Weights
Equal weights:
$$\bar{X} \equiv \frac{1}{N}\sum_i X_i, \qquad \sigma^2(\bar{X}) = \frac{1}{N^2}\sum_i\sigma_i^2$$
• Poor data degrades the result.
• Better to ignore "bad" data: information lost.
Optimal weights:
$$\hat{X} \equiv \frac{\sum_i X_i/\sigma_i^2}{\sum_i 1/\sigma_i^2}, \qquad \sigma^2(\hat{X}) = \frac{1}{\sum_i 1/\sigma_i^2}$$
• New data always improves the result.
• Use ALL the data, but with appropriate 1/Variance weights.
• Must have good error bars.