Statistical Inference Wen, shu-hui [email protected].
2
Outline
Estimation
  Point estimation
  Interval estimation
Hypothesis testing
  Setting for H0, H1
  Test statistic
  P-value approach
  The meaning of significance
3
Section A
Estimation
Chapter 9
4
Estimation for μ
Point estimation: using an estimator to estimate a parameter, e.g. (p217), the mean SCL μ would be estimated by $\bar{X}$ = 217 mg/100ml. Can we say μ = 217?
Interval estimation: provides a formal expression of the uncertainty of the point estimate, e.g. μ might fall in (190, 220) or (0, ∞). What is the confidence coefficient?
5
Example (Binomial distribution)
A random sample X1, …, Xn of independent Ber(p) variables; the estimator for p is the sample proportion $\hat{p} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
A study on the prevalence of malignant melanoma, denoted by p. Suppose 5000 women are selected, and 28 are found to have the disease. How to estimate p?
Obviously, p might not be exactly the sample proportion.
Q: Another way to estimate p?
Could I say “p falls in (0,1)”?
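As a rough illustration of the melanoma example, the sketch below computes the sample proportion from the slide's numbers (28 cases among 5000 women). As one possible answer to "another way to estimate p", it also computes a normal-approximation 95% interval, $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$. That interval formula is an assumption added here for illustration; it is not stated on the slide.

```python
import math

# Data from the slide: 28 cases among 5000 women.
n = 5000
cases = 28

# Point estimate: the sample proportion.
p_hat = cases / n

# Assumed normal-approximation interval (not taken from the slide).
z = 1.96  # upper 2.5th percentile of N(0, 1)
se = math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - z * se, p_hat + z * se

print(f"p_hat = {p_hat:.4f}")
print(f"approx. 95% interval = ({lower:.4f}, {upper:.4f})")
```

The interval is far narrower than (0, 1), which is the point of the slide's last question: an interval is only useful if it is both likely to cover p and reasonably short.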
6
95% Confidence Interval of μ
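P(L < μ < U) = 0.95, how to determine L, U? By CLT,
$$\bar{X}_n \sim N\left(\mu, \tfrac{1}{n}\sigma^2\right)$$
$$P\left(a < \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} < b\right) = 0.95, \quad \text{let } -a = b = z_{0.025}$$
$$P\left(\bar{X} - z_{0.025}\,\sigma/\sqrt{n} < \mu < \bar{X} + z_{0.025}\,\sigma/\sqrt{n}\right) = 0.95$$
z0.025 is the upper 2.5th percentile of a standard normal distribution
95% C.I. for μ is
$$\left(\bar{X} - z_{0.025}\,\sigma(\bar{X}),\ \bar{X} + z_{0.025}\,\sigma(\bar{X})\right), \quad \sigma(\bar{X}) = \sigma/\sqrt{n}$$
Refer to page 215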
7
Meaning of C.I.
Interval estimates about the population parameter
Random intervals which can be different based on observed data
A 95% CI means that, before data collection, there is a 95% chance that the interval will cover the true population parameter (p217)
Over a collection of 100 95% C.I.s constructed from repeated samples of size n, about 95 of them are expected to contain μ
page217
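The repeated-sampling interpretation can be checked with a small simulation. The sketch below is only an illustration: the population values (μ = 100, σ = 15, n = 25) and the number of repetitions are hypothetical, chosen here and not taken from the slides. It draws many samples, builds the known-σ 95% interval from each, and counts how often the interval contains μ.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population values, chosen only for illustration.
mu, sigma, n = 100.0, 15.0, 25
z = 1.96        # upper 2.5th percentile of N(0, 1)
repeats = 2000  # number of repeated samples

half_width = z * sigma / math.sqrt(n)
covered = 0
for _ in range(repeats):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    # Does this sample's interval contain the true mean?
    if x_bar - half_width < mu < x_bar + half_width:
        covered += 1

coverage = covered / repeats
print(f"fraction of intervals covering mu: {coverage:.3f}")
```

The fraction should come out close to 0.95, but not exactly 0.95 in any one run, which is why "about 95 out of 100" is the safer wording.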
8
$$\left(\bar{X} - 1.96\,\frac{46}{\sqrt{12}},\ \bar{X} + 1.96\,\frac{46}{\sqrt{12}}\right)$$
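A minimal numeric sketch of this interval, using σ = 46 and n = 12 from the slide. The sample mean of 217 mg/100ml is the value quoted on the earlier point-estimation slide; pairing it with this interval is an assumption made here.

```python
import math

sigma = 46.0   # population standard deviation, from the slide
n = 12         # sample size, from the slide
x_bar = 217.0  # assumed: the sample mean quoted on the point-estimation slide

z = 1.96  # upper 2.5th percentile of N(0, 1)
half_width = z * sigma / math.sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)

print(f"95% CI for mu: ({ci[0]:.1f}, {ci[1]:.1f})")
```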
9
Small sample case
Up to now, it’s assumed that σ² is known. Unfortunately, it’s rarely known in practice.
s² is substituted for σ², then what is the distribution of
$$t = \frac{\bar{X}-\mu}{s/\sqrt{n}}$$
Gosset (1908) derived that it’s a t-distribution with df = n-1, whose shape depends on the sample size n. It’s also called Student’s t.
10
T distribution vs. Normal distribution
[Figure: density curves. Solid line: N(0,1); dashed lines: t-distribution with df = 1, 5, 30]
11
95% CI of μ: unknown variance
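P(L < μ < U) = 0.95, how to determine L, U? By CLT,
$$\bar{X}_n \sim N\left(\mu, \tfrac{1}{n}\sigma^2\right)$$
$$P\left(a < \frac{\bar{X}-\mu}{s/\sqrt{n}} < b\right) = 0.95, \quad \text{let } -a = b = t_{(df,\,0.025)}$$
$$P\left(\bar{X} - t_{(df,\,0.025)}\,s/\sqrt{n} < \mu < \bar{X} + t_{(df,\,0.025)}\,s/\sqrt{n}\right) = 0.95$$
t(df, 0.025) is the upper 2.5th percentile of a t distribution with df = n-1
95% C.I. for μ is
$$\bar{X} \pm t_{(n-1,\,0.025)}\,s(\bar{X}), \quad s(\bar{X}) = s/\sqrt{n}$$
Look up the t-distribution table: A-10, Table A.4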
12
95% CI for population mean
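If the variance is known or the sample size n is large: $\bar{X} \pm z_{0.025}\,\sigma/\sqrt{n}$
If the variance is unknown and the sample size n is small: $\bar{X} \pm t_{(n-1,\,0.025)}\,s/\sqrt{n}$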
13
Example
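Suppose a sample of 16 observations is drawn from a population, giving a sample mean of 52 and a sample standard deviation of 4. (1) Find the 95% confidence interval for the population mean. (2) Explain the meaning of the 95% confidence interval.
(sol.) Assume the population distribution is N(μ, σ²). How to estimate μ with unknown σ?
95% C.I. for μ is
$$\bar{X} \pm t_{(15,\,0.025)}\,s(\bar{X}), \quad \text{i.e. } 52 \pm 2.131 \times \frac{4}{\sqrt{16}}$$
The 95% C.I. is (49.869, 54.131)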
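The arithmetic of this example can be reproduced with a few lines. This is a sketch that takes the critical value 2.131 for t(15, 0.025) as given from the t table on the slide, since the Python standard library has no t-distribution quantile function.

```python
import math

# Values from the example: n = 16, sample mean 52, sample SD 4.
n, x_bar, s = 16, 52.0, 4.0
t_crit = 2.131  # t(15, 0.025), read from the t table as on the slide

half_width = t_crit * s / math.sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)

print(f"95% CI for mu: ({ci[0]:.3f}, {ci[1]:.3f})")
```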
14
Computation of CI
A general formula for a 95% C.I.: Estimator ± z0.025 × standard error
Where:
Estimator: mean, difference in means, proportion, etc.
z0.025: the upper 2.5th percentile of N(0, 1)
If small sample, then use t(df, 0.025)
Standard error: the standard deviation of the sampling distribution of the estimator
15
Factors affecting the length of CI
The length of the CI for μ is $2\,z_{\alpha/2}\,\sigma/\sqrt{n}$
As the confidence coefficient increases, the length increases.
As the sample size n increases, the length decreases.
The larger the variation, the longer the length.
page217
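The three factors can be seen numerically from the length formula 2 z(α/2) σ/√n. In the sketch below the function name `ci_length` and the baseline values (σ = 46, n = 12) are illustrative choices, not part of the slides.

```python
import math
from statistics import NormalDist

def ci_length(conf_level, sigma, n):
    """Length 2 * z_{alpha/2} * sigma / sqrt(n) of a z-based CI for the mean."""
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 percentile
    return 2 * z * sigma / math.sqrt(n)

# Hypothetical baseline: sigma = 46, n = 12, 95% confidence.
base = ci_length(0.95, 46, 12)
higher_conf = ci_length(0.99, 46, 12)   # larger confidence coefficient
larger_n = ci_length(0.95, 46, 48)      # four times the sample size
larger_sigma = ci_length(0.95, 92, 12)  # twice the standard deviation

print(f"baseline: {base:.1f}")
print(f"99% confidence: {higher_conf:.1f}")
print(f"n = 48: {larger_n:.1f}")
print(f"sigma = 92: {larger_sigma:.1f}")
```

Quadrupling n only halves the length, because n enters through its square root.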
16
Section B
Hypothesis testing
Chapter 10
17
Hypothesis
Investigators often have ideas about what the parameters might be and wish to test whether the data support these ideas.
The average cholesterol level is greater than 175 mg/dl or not?
The mean birthweight (low SES group) is less than 120 oz (national average) or not?
The mean change of pain level is different from 0?
18
Example
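The average cholesterol level (BMI>30) is greater than 175 mg/dl or not?
A random sample of 36 people has a mean blood cholesterol of 176
A random sample of 36 people has a mean blood cholesterol of 174
A random sample of 36 people has a mean blood cholesterol of 180
A random sample of 36 people has a mean blood cholesterol of 170
A random sample of 100 people has a mean blood cholesterol of 178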
19
Rationale
Two competing hypotheses, null and alternative; usually we assume the null is true and see whether the data support this assumption.
Assume the data are drawn from a Normal population
Construct the test statistic T, and define the rejection region {T > c}
If the assumption holds, then the probability of observing extreme events should be small.
How small is small? Say 0.05.
Bearable error rate (Type I and Type II error)
20
The Hypotheses, H0 & H1
A Hypothesis is a claim (assertion) about the population parameter
States the claim to be tested e.g. The average cholesterol level is greater than 175 mg/dl or not?
H0: μ≤ 175 vs. H1: μ> 175
H1 contradicts H0
The null hypothesis is always about a parameter (H0: μ ≤ 175), not about a statistic (H0: $\bar{X}$ ≤ 175)
H0 always contains the “=” sign
The null hypothesis may or may not be rejected
21
Hypothesis testing
Identify the population
Test for the population mean (H0: μ ≤ 175)
Take a sample
The sample mean is 200
Is a sample mean of 200 likely if μ ≤ 175?
No, not likely!
REJECT H0
22
Reason for rejecting H0
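Sampling distribution of $\bar{X}$ if H0 is true, centered at μ = 175
[Figure: sampling distribution of $\bar{X}$ with the observed sample mean 200 far in the right tail]
It is unlikely that we would get a sample mean of this value (200)
Hence, we reject the null hypothesis that μ ≤ 175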
23
Testing Procedures
1. State the null and alternative hypotheses
Null hypothesis (H0): the one to be questioned, or the current situation
Alternative hypothesis (H1): the one of particular interest to investigators, which they need to prove
The average cholesterol level is greater than 175 mg/dl or not?
H0: The average cholesterol level is less than or equal to 175 mg/dl
or H0: μ ≤ μ0 (= 175)
H1: The average cholesterol level is greater than 175 mg/dl
or H1: μ > μ0 (= 175)
24
Testing Procedures (cont.)
2. Choose an appropriate test statistic and its null distribution, such as the Z-statistic, the t-statistic or the paired t-test
Under H0 (p235):
$$Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \sim N(0,1) \quad \text{or} \quad t = \frac{\bar{X}-\mu_0}{s/\sqrt{n}} \sim t(n-1) \ \text{(small sample or unknown variance)}$$
3. Select the nominal significance level
Let Pr(type I error) = α = 5%
25
Example
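The average cholesterol level (BMI>30) is greater than 175 mg/dl or not?
A random sample of 36 people has a mean blood cholesterol of 176
A random sample of 36 people has a mean blood cholesterol of 174
A random sample of 36 people has a mean blood cholesterol of 180
A random sample of 36 people has a mean blood cholesterol of 170
A random sample of 100 people has a mean blood cholesterol of 178
Assume the population standard deviation is 12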
Calculate test statistic=?
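One way to carry out this calculation is sketched below. It assumes the Z-statistic with known σ = 12 and μ0 = 175, i.e. Z = (sample mean - 175) / (12/√n), applied to each of the five samples listed on the slide.

```python
import math

mu0 = 175.0   # hypothesized mean under H0
sigma = 12.0  # population standard deviation given on the slide

# (sample size, sample mean) for the five samples on the slide.
samples = [(36, 176.0), (36, 174.0), (36, 180.0), (36, 170.0), (100, 178.0)]

z_values = [(x_bar - mu0) / (sigma / math.sqrt(n)) for n, x_bar in samples]

for (n, x_bar), z in zip(samples, z_values):
    print(f"n = {n:3d}, sample mean = {x_bar:.0f}, Z = {z:+.2f}")
```

Note that a mean of 178 from 100 people gives the same Z as a mean of 180 from 36 people: a larger sample makes a smaller difference equally convincing.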
26
Error rates
Decision | H0 TRUE | H0 FALSE
Rej. H0 | Type I error | correct
Not Rej. H0 | correct | Type II error
• A Type I error is committed if H0 is rejected when it’s true, i.e. P(Type I error) = α
• A Type II error is committed if H0 is not rejected when H1 is true, i.e. P(Type II error) = β
• Power = 1 - P(not rej. H0 | H1) = 1 - β
p240
27
Error rate(cont.)
The average cholesterol level is greater than 175 mg/dl or not?
H0: μ ≤ μ0 (= 175)
H1: μ > μ0 (= 175)
What do Type I error, Type II error & Power mean?
28
Testing Procedures (cont.)
4. Determine the critical value, rejection region and decision rule:
For large samples, a two-sided test and α = 0.05, the critical value is z0.025 = 1.96 and the rejection region will be {|Z| > 1.96}
Decision rule: reject H0 if the resulting test statistic is in the rejection region.
29
Significance level and the rejection region
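Left-tailed test: H0: μ ≥ 175 vs. H1: μ < 175; rejection region of area α in the left tail
Right-tailed test: H0: μ ≤ 175 vs. H1: μ > 175; rejection region of area α in the right tail
Two-tailed test: H0: μ = 175 vs. H1: μ ≠ 175; rejection regions of area α/2 in each tail
[Figure: null distributions centered at 0, with the rejection regions and critical value(s) marked]
$$Z = \frac{\bar{X}-175}{\sigma/\sqrt{n}} \sim N(0,1) \quad \text{or} \quad t = \frac{\bar{X}-175}{s/\sqrt{n}} \sim t(n-1)$$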
30
Example
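The average cholesterol level (BMI>30) is greater than 175 mg/dl or not?
A random sample of 36 people has a mean blood cholesterol of 176
A random sample of 36 people has a mean blood cholesterol of 174
A random sample of 36 people has a mean blood cholesterol of 180
A random sample of 36 people has a mean blood cholesterol of 170
A random sample of 100 people has a mean blood cholesterol of 178
Assume the population standard deviation is 12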
What is the result?
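A possible way to reach the result is sketched below. It assumes a right-tailed Z-test of H0: μ ≤ 175 vs. H1: μ > 175 at α = 0.05 with known σ = 12, so H0 is rejected when Z exceeds the upper 5th percentile of N(0, 1), about 1.645. The choice of α and of a one-sided test is an assumption based on how the question is phrased.

```python
import math
from statistics import NormalDist

mu0, sigma, alpha = 175.0, 12.0, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for a right-tailed test

# (sample size, sample mean) for the five samples on the slide.
samples = [(36, 176.0), (36, 174.0), (36, 180.0), (36, 170.0), (100, 178.0)]

decisions = []
for n, x_bar in samples:
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    reject = z > z_crit  # decision rule: reject H0 if Z is in the rejection region
    decisions.append(reject)
    verdict = "reject H0" if reject else "fail to reject H0"
    print(f"n = {n:3d}, mean = {x_bar:.0f}, Z = {z:+.2f}: {verdict}")
```

Sample means below 175 can never lead to rejection here, because they point away from the alternative μ > 175.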
31
Testing Procedures (cont.)
5. Testing result
Reject the null hypothesis H0: sampling error is an unlikely explanation of the discrepancy between the null hypothesis and the observed values, and the alternative hypothesis is supported at the 5% significance level (a 5% risk of Type I error).
Fail to reject H0: sampling error is a likely explanation, and the data fail to provide enough evidence to doubt the validity of the null hypothesis.
Do NOT claim that H0 is accepted.
32
Note: Test Statistics
A test statistic is a measure that quantifies whether the discrepancy between the observed descriptive statistic and the hypothetical value assumed under H0 exceeds the sampling error under H0
Usually, it is based on the estimator of the parameter:
Sample mean for μ
Sample proportion for p
Sample variance for σ²
33
Take home exercise
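Suppose a population is known to follow the normal distribution N(μ, 16), and we wish to test whether the population mean is less than 30. A sample of 36 observations is drawn from the population.
1. Write down the null hypothesis H0 and the alternative hypothesis H1 of the test.
2. If the decision maker takes the rejection region to be $\bar{X} < 29$, what is the significance level?
3. Suppose the 36 observations give $\bar{X} = 28.5$. What are the conclusions of the test at significance levels 0.05 and 0.01, respectively?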
34
Excel:function
35
Excel:function
36
P-value approach
37
Making decisions
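[Figure: three candidate distributions (green, red, black) with the observed value t0 marked]
The observed test statistic is t0 = -2.25. Under which population is t0 most likely to be observed?
Green, red or black?
Given the null is true, observing values more extreme than t0 is not rare.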
38
P-value approach
P-value = P(T > T0 | H0), also called the observed level of significance
The probability of obtaining T as extreme as or more extreme than (≤ or ≥) T0, given H0 is true
It measures the degree of support for H0: the larger the p-value, the harder it is to reject H0; a small p-value only suggests that the data do not support H0
How small must the p-value be to conclude significance?
Strength of evidence
p > 0.05: non-significant
0.01 < p < 0.05: enough evidence for significance
0.001 < p < 0.01: strong evidence for significance
p234
39
P-value
For a one-sided test,
P-value = P(T > t0) or P(T < t0), in the direction given by H1
For a two-sided test,
P-value = 2P(T > |t0|)
[Figure: null distribution of T, either N(0,1) or t(df=n-1), with the observed value t0 marked]
Use the alternative hypothesis to find the direction of the rejection region.
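The role of the alternative hypothesis can be made concrete with a small helper. This is a sketch under the assumption that the null distribution is N(0, 1); the function name `p_value` and its `alternative` argument are hypothetical, and the t(df = n-1) case is left out because the Python standard library has no t distribution. The value t0 = -2.25 is the observed statistic quoted on the earlier "Making decisions" slide.

```python
from statistics import NormalDist

def p_value(t0, alternative="two-sided"):
    """P-value of an observed statistic t0 under an N(0, 1) null distribution."""
    cdf = NormalDist().cdf
    if alternative == "greater":    # H1 points to large values of T
        return 1 - cdf(t0)
    if alternative == "less":       # H1 points to small values of T
        return cdf(t0)
    if alternative == "two-sided":  # both tails count as extreme
        return 2 * (1 - cdf(abs(t0)))
    raise ValueError(f"unknown alternative: {alternative}")

t0 = -2.25
p_less = p_value(t0, "less")
p_greater = p_value(t0, "greater")
p_two = p_value(t0, "two-sided")

print(f"less: {p_less:.4f}, greater: {p_greater:.4f}, two-sided: {p_two:.4f}")
```

The same t0 gives a small p-value when H1 points to the left, a very large one when H1 points to the right, and twice the smaller tail area for a two-sided test.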
40
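Example: Two-tailed test
H0: μ = 368
H1: μ ≠ 368
α = 0.05
n = 25, mean = 372.5
Critical value: ±1.96
Test statistic:
$$Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} = \frac{372.5-368}{15/\sqrt{25}} = 1.50$$
Decision: Do not reject H0 at α = .05.
Conclusion: Insufficient evidence that the true mean is not 368.
[Figure: standard normal curve with rejection regions of area .025 beyond -1.96 and 1.96; the observed Z = 1.50 lies between them]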
41
P-value solution
p-value = 2 × 0.0668 = 0.1336
(p-value = 0.1336) ≥ (α = 0.05): Do not reject.
The test statistic 1.5 is in the non-rejection region.
[Figure: standard normal curve with rejection regions beyond ±1.96 (α = 0.05) and the observed Z = 1.5]
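The numbers in this two-tailed example can be reproduced as follows. The sketch reads the slide's values as μ0 = 368, σ = 15, n = 25 and a sample mean of 372.5, and assumes an N(0, 1) null distribution.

```python
import math
from statistics import NormalDist

mu0, sigma, n, x_bar, alpha = 368.0, 15.0, 25, 372.5, 0.05
std_normal = NormalDist()

# Test statistic and two-sided p-value.
z = (x_bar - mu0) / (sigma / math.sqrt(n))
p = 2 * (1 - std_normal.cdf(abs(z)))

# Critical-value approach for comparison.
z_crit = std_normal.inv_cdf(1 - alpha / 2)
reject = p < alpha

print(f"Z = {z:.2f}, critical value = ±{z_crit:.2f}")
print(f"p-value = {p:.4f}, reject H0: {reject}")
```

Both approaches agree: |Z| is below the critical value exactly when the p-value is above α.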
42
C.I. vs. Hypothesis testing
H0:μ=μ0 vs. H1:μ≠μ0
If the 95% C.I. for μ contains μ0 , then 2-sided test is NOT significant at level 0.05.
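e.g. The rejection region:
$$\left\{\, |t| = \left|\frac{\bar{X}-\mu_0}{s/\sqrt{n}}\right| > t_{(n-1,\,0.025)} \right\}$$
The 95% C.I. for μ is
$$\bar{X} \pm t_{(n-1,\,0.025)}\,s/\sqrt{n}$$
$$\text{rej. region} \iff \mu_0 \notin \left(\bar{X} - t_{(n-1,\,0.025)}\,s/\sqrt{n},\ \bar{X} + t_{(n-1,\,0.025)}\,s/\sqrt{n}\right)$$
[Figure: the interval (L, U) on the μ axis with the point μ = μ0 marked]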
43
C.I. vs. Hypothesis testing (2)
Test for H0:μ=0 vs. H1:μ≠0
Suppose the 95% CI is (-3, 6). What is the conclusion?
What if we test H0: μ=10 vs. H1: μ≠10?
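The duality between the 95% C.I. and the two-sided test at level 0.05 reduces to checking whether the hypothesized value lies inside the interval. A minimal sketch, with a hypothetical helper name `rejects_at_matching_level`:

```python
def rejects_at_matching_level(ci, mu0):
    """True if the two-sided test of H0: mu = mu0 rejects at the level matching the CI."""
    lower, upper = ci
    return not (lower < mu0 < upper)

ci_95 = (-3.0, 6.0)  # the 95% CI given on the slide

reject_mu_0 = rejects_at_matching_level(ci_95, 0.0)
reject_mu_10 = rejects_at_matching_level(ci_95, 10.0)

print(f"H0: mu = 0  -> reject at 0.05: {reject_mu_0}")
print(f"H0: mu = 10 -> reject at 0.05: {reject_mu_10}")
```

Since 0 lies inside (-3, 6), the test of μ = 0 is not significant at level 0.05, while 10 lies outside the interval, so μ = 10 is rejected.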
44
Summary
Any testing procedure has the same basic steps: H0 vs. H1, test statistic, level α, rejection region or p-value
Two-sample t-test, paired t-test, ANOVA test, the Chi-squared test, test for regression coefficient, etc.
Keep in mind that statistical significance means the result is unlikely to have happened by chance; it does not always mean ‘important’.
45
The Power of a test
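• Power = 1 - P(not rej. H0 | H1) = 1 - β
• e.g. H0: μ ≤ 180, H1: μ > 180
One-sample Z-test, given α = 0.05, n = 25, σ = 46
$$\text{power} = 1-\beta(\mu) = P\left(\left.\frac{\bar{X}-180}{\sigma/\sqrt{n}} > z_{0.05} \,\right|\, \mu > 180\right)$$
$$1-\beta(200) = P\left(\left.\frac{\bar{X}-180}{46/\sqrt{25}} > 1.645 \,\right|\, \mu = 200\right), \quad \bar{X} \sim N\left(\mu, \tfrac{\sigma^2}{n}\right)$$
$$1-\beta(220) = P\left(\left.\frac{\bar{X}-180}{46/\sqrt{25}} > 1.645 \,\right|\, \mu = 220\right), \quad \bar{X} \sim N\left(\mu, \tfrac{\sigma^2}{n}\right)$$
Refer to p245, figure 10.3: power curve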
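The two power values can be evaluated numerically. The sketch below uses the slide's setting (μ0 = 180, σ = 46, n = 25, α = 0.05) and the fact that, when the true mean is μ1, the Z-statistic is normal with mean (μ1 - μ0)/(σ/√n) and variance 1. The function name `power_one_sided_z` is hypothetical.

```python
import math
from statistics import NormalDist

def power_one_sided_z(mu1, mu0=180.0, sigma=46.0, n=25, alpha=0.05):
    """Power of the right-tailed one-sample Z-test when the true mean is mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # about 1.645
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))
    # Reject when Z > z_alpha; under mu1, Z ~ N(shift, 1).
    return 1 - std_normal.cdf(z_alpha - shift)

power_200 = power_one_sided_z(200.0)
power_220 = power_one_sided_z(220.0)

print(f"power at mu = 200: {power_200:.3f}")
print(f"power at mu = 220: {power_220:.3f}")
```

Evaluating the function over a grid of μ1 values traces out a power curve of the kind the slide refers to; at μ1 = 180 it returns α itself.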
46
Factors affecting the power
Smaller α, smaller power (i.e. larger β)
Trade-off between Type I error and power
If σ increases, then the power decreases
If the difference |μ-μ0| increases, then the power increases
Larger sample size, larger power
How many subjects are enough?
47
Sample size determination
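Perform the test for
H0: μ = μ0 vs. H1: μ = μ1
$$n = \frac{\sigma^2\,(z_{1-\alpha} + z_{1-\beta})^2}{(\mu_0-\mu_1)^2}$$
H0: μ = μ0 vs. H1: μ ≠ μ0
$$n = \frac{\sigma^2\,(z_{1-\alpha/2} + z_{1-\beta})^2}{(\mu_0-\mu_1)^2}$$
Here $z_{q}$ denotes the $q$-quantile of N(0, 1), so $z_{1-\alpha}$ is the upper $\alpha$ percentile.
Larger effect size, δ = μ0 - μ1, smaller n
Larger power, larger n
Larger σ, larger n
Smaller α, larger n
Let’s try the Minitab implementation!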
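Independently of Minitab, the sample size formula n = σ²(z + z(1-β))² / (μ0 - μ1)² can be sketched in a few lines, with z taken at 1-α for the one-sided case and at 1-α/2 for the two-sided case. The function name `sample_size`, the target power of 0.80, and the inputs (μ0 = 180, μ1 = 200, σ = 46, borrowed from the power example) are illustrative assumptions. The result is rounded up to the next whole subject.

```python
import math
from statistics import NormalDist

def sample_size(mu0, mu1, sigma, alpha=0.05, power=0.80, two_sided=False):
    """Smallest n reaching the target power for a one-sample Z-test."""
    quantile = NormalDist().inv_cdf
    z_alpha = quantile(1 - alpha / 2) if two_sided else quantile(1 - alpha)
    z_beta = quantile(power)  # z_{1-beta}, since power = 1 - beta
    n = sigma**2 * (z_alpha + z_beta) ** 2 / (mu0 - mu1) ** 2
    return math.ceil(n)

# Assumed inputs: mu0 = 180, mu1 = 200, sigma = 46, alpha = 0.05, power = 0.80.
n_one_sided = sample_size(180, 200, 46)
n_two_sided = sample_size(180, 200, 46, two_sided=True)

print(f"one-sided test: n = {n_one_sided}")
print(f"two-sided test: n = {n_two_sided}")
```

The two-sided test needs more subjects than the one-sided test for the same α and power, because α is split between the two tails.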
48
Midterm2
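Exam time: 12/3, 9:00~11:30 am
Coverage: Ch8-10 & the term definitions that have already been presented in class (from A to Z)
Question types: true/false, term definitions and calculation problems (questions given in Chinese/English)
Test formulas will be provided (t-test, z-test, etc.); calculators or electronic translators may be used
Please turn off your mobile phone
PS: Seats will be assigned on the day, so please be seated early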