Estimation Theory - STAT 432 | UIUC
Transcript of Estimation Theory - STAT 432 | UIUC
Estimation TheorySTAT 432 | UIUC | Fall 2019 | Dalpiaz
Questions?Comments? Concerns?
Administration
• CBTF Syllabus Exam
• PrairieLearn Quiz
• Quiz Policy
Last Week This Week
Do You Remember STAT 400?DISTRIBUTION = S t PROBS
STATISTIC = f(DATA)I
p(x | n , p) = (nx)px(1 − p)n −x, x = 0,1,…, n , n ∈ ℕ, 0 < p < 1
X ∼ bin(n , p)
FAMILY OF DISTRIBUTIONS
-
parameters. ..
A←✓ SAMPLE PARAMETER
SPACE SPACE
pmfOUTPUTS A PROBABILITY-
X ∼ N (μ, σ2)
f(x |μ, σ2) = 1σ 2π
⋅ exp [ −12 ( x − μ
σ )2
], − ∞ < x < ∞, − ∞ < μ < ∞, σ > 0
DISTRIBUTIONS
-
#SAM
}§ae€ PARAM SPACE
pdf. OUTPUTS DENSITIES
' NII PROBS
X ∼ N (μ = 5, σ2 = 4)P[X = 4] = ?
P[X < 4] = ?
= O
=pt¥¥= phorm ( 4 , mean =
5,so -- 2)
X ∼ N (μ = 5, σ2 = 4)
P[2 < X < 4] = ? P[X > 3] = ?diff (prior- (42,4) , mean = 5 , so -- Z)) = I -
pnorn(3,mean=5,sd=Z)i¥*¥¥E¥
X ∼ N (μ = 5, σ2 = 4)
P[X > c] = 0.75, c = ?"
""""""" " " "
PROBABILITY IN R-
d * ( x , . . . ) = pdf/pmf
p * ( q , - . . )= P[Xe×)= cdf
q * ( p , . . . ) = c: P[x. c) =p
r * ( n , - . . ) → GENERATES RANDOM OBS
* = NAMED DISTRIBUTIONS
ExpectationsX - N/a ,
r: ) Y - N (ux.fi ) X. Y mo
E [X - Y) = u× -my←
STILL TRUE w/o end
✓Are [2+3×-44] = 9 of' +16 r} c- NEED WD
Estimation
What is Statistics?
What is Statistics?• Technically: The study of statistics.
• Practically: The science of collecting, organizing, analyzing, interpreting, and presenting data.
What is a statistic?
f-(DAIA)
What is a statistic?• Technically: A function of (sample) data.
Terminology
• Population: The entire group of interest.
• Parameter: A (usually unknown) numeric value associated with the population.
• Sample: A subset of individuals taken from the population.
• Statistic: A numeric value computed using the sample data.
The Big Picture
Sampo, no /HOPEFUL'T RANDO-u,
popo-E""
-O
, 02SAMPLEO x
..
- tho)
t⑤= f ( x . ,Xr. . .. . .
X)
Random Sample
• A random sample is a sample where each individual in the population is equally likely to be included.
Why do we care?WE LIKE TO MAKE AN HD ASSUMPTION
What is an estimator?• A statistic that attempts to provide a good guess
for an unknown population parameter.
Parameters Estimatorscylon!
AFw
EG)(IG , ,x .. - xn) -- II. *
VARG] ( SD censuses g2= i€( Xi -t )"
-
X, ,Xr
,- . . . ,
Xn "toBETA ( N , B) MCE
,MOM
P[ X - 4] ECDF , MCE
Statistics are Random Variables
• An Estimator / A statistic: A function that tells us what calculation we will perform after taking a sample from the population.
• An Estimate / The value of a statistic: The numeric result of performing the calculation on a particular sample of data.
X vs x# µPOTENTIAL VALUE
OF RANDOM VARIABLE .
RANDOMVARIABLE
Why so much focus on the mean?WANT TO
MINIMIZE
↳ E - a)) = Efx' - 2aXtaJ=E[x) - 2aE[x] + a'
WEE .¥Eaction
÷ Efx-at] =- LEG) c- Za -- O
⇒ la=ETx]/-
PREDICT THE MEAN to minimizeSQUARED
ERROR LOSS.
UNBIASED ←¥¥l#O = ECO]
BIASED #§µE[on]
Definitions
bias [ θ] ≜ ) [ θ] − θ
var [ θ] ≜ ) [( θ − ) [ θ])2
]
MSE [ θ] ≜ ) [( θ − θ)2] = var [ θ] + (Bias [ θ])
2
X1, X2, X3 ∼ N(μ, σ2)
X = 1n
n
∑i= 1
Xi
μ = 14 X1 + 1
5 X2 + 16 X3
MSEEX] us MSE
I-
EEE]=u Bias = Eft] -a= er - er
= O
VAR=
MSE = 043-
E = It Itf = SIT in
Bias =
u - Eon ---In
a-
Van =+ ¥ . =¥÷m/nsEG)
-⇐ a)'
+ II. or- SMALL WHEN U CLOSE TO 0
Probability Models
Professor Professorson received the following number of emails on each of the previous 40 days:
5 6 2 5 3 3 4 1 4 4 3 4 6 2 3 6 7 1 3 3
5 1 8 6 1 3 2 5 3 5 4 4 2 4 0 5 0 2 5 3
• What is the probability that Professor Professorson receives three emails on a particular day?
• What is the probability that Professor Professorson receives nine emails on a particular day?
- --
- - -
- - -
9/40
0/40
Idea: Use a Poisson distribution to model the number of emails received per day.
But which Poisson distribution?
jpfx.SI#E / o.am,
ypix =D
The previous “analysis” is flawed.
• The previous 40 days is not a random sample.
• Why do we need a random sample?
• We were just guessing and checking.
• How do we know if Poisson was a reasonable distribution?
• How do we pick a good lambda given some data?
• How do we know if our method for picking is good?
Maximum Likelihood
* a"
'
Assume X.,Xi
, - . .Xu - f(x IO) IF "D
①ne=trgm%¥g⇒IIf)-
Fitting a Probability Distribution
data = c(4, 10, 9, 3, 2, 4, 3, 7, 3, 5)5=5
WANT TO ESTIMATE THINGS LIKE p [× = 4)Empirical CDF
F- Cx)=F[x. D= #*n urDiscrete P' [ X -- x) = # f[x -- 43=1/5
→ P' (x - o) -- o
→ IS DATA NUMERIC ? CATEGORICAL ?LOOK AT THE DATA iii. '
'Fi.
i i- i i
ASSUME PROBABILITY DISTRIBUTIONCONTINUOUS DISCRETE
RANGE ? ?-
O , 1,2 , . . .?
ESTIMATE PARAMETERS → MLE from-
CHECK RESULTS → Q - Q plots-
USE RESULTS -3 ESTIMATEQUANTITIES LIKE P[X > 3)
-
X. , Xr . . . . X . - Pols (X)
I I 5=5
IF ⑤ is nice ForO
h (a) =P [X= 2) = THEN hCG) is nee Fonko)
icx=g=5÷
Questions?Comments? Concerns?