Statistical Inference Wen, shu-hui [email protected].

Statistical Inference

Wen, shu-hui

[email protected]

2

Outline

Estimation
  Point estimation
  Interval estimation

Hypothesis testing
  Setting for H0, H1
  Test statistic
  P-value approach
  The meaning of significance

3

Section A

Estimation

Chapter 9

4

Estimation for μ

Point estimation: using an estimator to estimate a parameter.
e.g. (p217), the mean SCL μ would be $\bar{X} = 217$ mg/100ml? Can we say μ = 217?

Interval estimation: provide a formal expression of the uncertainty for the point estimate.
e.g. μ might fall in (190, 220) or (0, ∞). What is the confidence coefficient?

5

Example (Binomial distribution)

For a random sample X1, …, Xn of independent Ber(p) variables, the estimator for p is the sample proportion

$\hat{p} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

A study on the prevalence of malignant melanoma, denoted by p. Suppose 5000 women are selected, and 28 are found to have the disease. How to estimate p?

Obviously, p might not be exactly equal to the sample proportion.

Q: Another way to estimate p?

Could I say “p falls in (0,1)”?
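As a hedged illustration of the melanoma example, the sketch below computes the sample proportion and, as one possible answer to "another way to estimate p", a large-sample 95% interval. The normal-approximation (Wald) interval and the helper name `proportion_ci` are assumptions made for illustration; the slides do not specify which interval method is intended.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, conf=0.95):
    """Point estimate and large-sample (Wald) interval for a binomial proportion.

    Assumption: n is large enough for the normal approximation to be reasonable.
    """
    p_hat = successes / n
    # Upper (1 - conf)/2 percentile of N(0, 1), e.g. about 1.96 for conf = 0.95
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - z * se, p_hat + z * se


# Data from the slide: 28 cases among 5000 women
p_hat, lower, upper = proportion_ci(28, 5000)
print(f"p_hat = {p_hat:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The interval is far narrower than the trivially true statement "p falls in (0,1)", which is why a confidence coefficient has to accompany any interval estimate.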

6

95% Confidence Interval of μ

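P(L < μ < U) = 0.95, how to determine L, U? By the CLT,

$\bar{X}_n \sim N\left(\mu, \frac{1}{n}\sigma^2\right)$

$P\left(a < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < b\right) = 0.95$, let $a = -b$,

$P\left(\bar{X} - z_{0.025}\,\sigma/\sqrt{n} < \mu < \bar{X} + z_{0.025}\,\sigma/\sqrt{n}\right) = 0.95$

$z_{0.025}$ is the upper 2.5th percentile of a standard normal distribution

95% C.I. for μ is

$\left(\bar{X} - z_{0.025}\,\sigma(\bar{X}),\ \bar{X} + z_{0.025}\,\sigma(\bar{X})\right)$, where $\sigma(\bar{X}) = \sigma/\sqrt{n}$

Refer to page 215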

7

Meaning of C.I.

Interval estimates about the population parameter

Random intervals which can be different based on observed data

A 95% CI implies a 95% chance, before data collection, that the interval will cover the true population parameter (p217)

Over a collection of 100 95% C.I.s that could be constructed from repeated samples of size n, about 95 are expected to contain μ

page217
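The repeated-sampling interpretation can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with a hypothetical true mean of 211 and σ = 46, samples have size n = 12, and the function name `coverage` is illustrative only.

```python
import math
import random
from statistics import NormalDist


def coverage(mu, sigma, n, reps=2000, conf=0.95, seed=0):
    """Fraction of known-sigma z-intervals that cover the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # The interval is random; mu is fixed
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps


# Hypothetical population values, chosen only for illustration
rate = coverage(mu=211, sigma=46, n=12)
print(f"Proportion of intervals covering mu: {rate:.3f}")
```

The printed proportion should be close to 0.95 but not exactly 0.95, which is the sense in which "about 95 out of 100" intervals contain μ.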

8

$\left(\bar{X} - 1.96\,\frac{46}{\sqrt{12}},\ \bar{X} + 1.96\,\frac{46}{\sqrt{12}}\right)$
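A minimal sketch of evaluating this interval numerically. It assumes the sample mean is the 217 mg/100ml quoted in the point-estimation slide; pairing that value with σ = 46 and n = 12 is an assumption, and `z_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf=0.95):
    """Confidence interval for mu when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# sigma = 46 and n = 12 come from the slide; xbar = 217 is assumed
lower, upper = z_interval(xbar=217, sigma=46, n=12)
print(f"95% CI for mu: ({lower:.1f}, {upper:.1f})")
```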

9

Small sample case

Up to now, it has been assumed that σ² is known. Unfortunately, it is rarely known in practice.

If s² is substituted for σ², then what is the distribution of

$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$ ?

Gosset (1908) derived that it is a t-distribution with df = n−1, whose shape depends on the sample size n. It is also called Student’s t.

10

t distribution vs. Normal distribution

[Figure: density curves. Solid line: N(0,1). Dashed lines: t-distribution with df = 1, 5, 30]

11

95% CI of μ: unknown variance

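P(L < μ < U) = 0.95, how to determine L, U? By the CLT,

$\bar{X}_n \sim N\left(\mu, \frac{1}{n}\sigma^2\right)$

$P\left(a < \frac{\bar{X} - \mu}{s/\sqrt{n}} < b\right) = 0.95$, let $a = -b$,

$P\left(\bar{X} - t_{(df,\,0.025)}\, s/\sqrt{n} < \mu < \bar{X} + t_{(df,\,0.025)}\, s/\sqrt{n}\right) = 0.95$

$t_{(df,\,0.025)}$ is the upper 2.5th percentile of a t distribution with df = n−1

95% C.I. for μ is

$\bar{X} \pm t_{(n-1,\,0.025)}\, s(\bar{X})$, where $s(\bar{X}) = s/\sqrt{n}$

Look up the t-distribution table (A-10, Table A.4)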

12

95% CI for population mean

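If the variance is known or the sample size n is large:

$\bar{X} \pm z_{0.025}\, \sigma/\sqrt{n}$

If the variance is unknown and the sample size n is small:

$\bar{X} \pm t_{(n-1,\,0.025)}\, s/\sqrt{n}$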

13

Example

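Suppose a sample of 16 observations is drawn from a population, giving a sample mean of 52 and a sample standard deviation of 4. (1) Find the 95% confidence interval for the population mean. (2) Explain the meaning of the 95% confidence interval.

(sol.) Assume the population distribution is N(μ, σ²); how to estimate μ with unknown σ?

95% C.I. for μ is

$\bar{X} \pm t_{(15,\,0.025)}\, s(\bar{X})$, i.e. $52 \pm 2.131 \times \frac{4}{\sqrt{16}}$

The 95% C.I. is (49.869, 54.131)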
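The arithmetic of this solution can be reproduced with a few lines. This is only a sketch: the critical value 2.131 is hard-coded from the t table quoted on the slide (df = 15, upper 2.5%), and `t_interval` is an illustrative helper name.

```python
import math

# Table value t(15, 0.025) quoted on the slide; hard-coded rather than computed
T_CRIT_DF15 = 2.131


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for mu when sigma is unknown and s is used instead."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


lower, upper = t_interval(xbar=52, s=4, n=16, t_crit=T_CRIT_DF15)
print(f"95% CI for mu: ({lower:.3f}, {upper:.3f})")
```

Using the normal value 1.96 in place of 2.131 would give a slightly shorter interval, which understates the uncertainty for a sample of only 16.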

14

Computation of CI

A general formula for a 95% C.I.: Estimator ± z0.025 × standard error

Where:
Estimator: mean, difference in means, proportion, etc.
z0.025: the upper 2.5th percentile of N(0, 1)
If the sample is small, use t(df, 0.025) instead
Standard error: the standard deviation of the estimator's sampling distribution

15

Factors affecting the length of CI

The length of the CI for μ is $2\, z_{(\alpha/2)}\, \sigma/\sqrt{n}$

As the confidence coefficient increases, the length increases.
As the sample size n increases, the length decreases.
The larger the variation, the longer the length.

page217
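The three statements about interval length can be verified numerically. The sketch below assumes the known-σ formula $2 z_{(\alpha/2)} \sigma/\sqrt{n}$; the input values and the name `ci_length` are illustrative.

```python
import math
from statistics import NormalDist


def ci_length(sigma, n, conf=0.95):
    """Length of the known-sigma confidence interval for mu."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return 2 * z * sigma / math.sqrt(n)


base = ci_length(sigma=46, n=12)
print(f"baseline (95%, n=12, sigma=46): {base:.2f}")
print(f"higher confidence (99%):        {ci_length(46, 12, conf=0.99):.2f}")
print(f"larger sample (n=48):           {ci_length(46, 48):.2f}")
print(f"larger variation (sigma=92):    {ci_length(92, 12):.2f}")
```

Because the length scales with $1/\sqrt{n}$, quadrupling the sample size only halves the interval.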

16

Section B

Hypothesis testing

Chapter 10

17

Hypothesis

Investigators often have ideas about what the parameters might be and wish to test whether the data support these ideas.
The average cholesterol level is greater than 175 mg/dl or not?
The mean birthweight (low SES group) is less than 120 oz (national average) or not?
The mean change of pain level is different from 0?

18

Example

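The average cholesterol level (BMI > 30) is greater than 175 mg/dl or not?

A random sample of 36 people has a mean blood cholesterol of 176
A random sample of 36 people has a mean blood cholesterol of 174
A random sample of 36 people has a mean blood cholesterol of 180
A random sample of 36 people has a mean blood cholesterol of 170
A random sample of 100 people has a mean blood cholesterol of 178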

19

Rationale

Two competing hypotheses, null and alternative; usually we assume the null is true and see whether the data support this assumption.
Assume the data are drawn from a Normal population.
Construct the test statistic T, and define the rejection region { T > c }.
If the assumption holds, then the probability of observing extreme events should be small.
How small is small? Say 0.05.
Bearable error rate (type I and type II error)

20

The Hypotheses, H0 & H1

A Hypothesis is a claim (assertion) about the population parameter

States the claim to be tested e.g. The average cholesterol level is greater than 175 mg/dl or not?

H0: μ≤ 175 vs. H1: μ> 175

H1 contradicts H0

The null hypothesis is always about a parameter (H0: μ ≤ 175), not about a statistic (H0: $\bar{X}$ ≤ 175)

H0 always contains the “=” sign

The null hypothesis may or may not be rejected

21

Hypothesis testing

Identify the population
Test for the population mean (H0: μ ≤ 175)
Take a sample
Sample mean is 200
Is a sample mean of 200 likely if μ ≤ 175?
No, not likely!
REJECT H0

22

Reason for rejecting H0

[Figure: sampling distribution of $\bar{X}$ if H0 is true, centered at μ = 175, with the observed sample mean 200 marked]

It is unlikely that we would get a sample mean of this value if H0 is true

Hence, we reject the null hypothesis that μ ≤ 175

23

Testing Procedures

1. State the null and alternative hypotheses
Null hypothesis (H0): the one to be questioned, or the current situation
Alternative hypothesis (H1): the one of particular interest to investigators, which they need to prove

The average cholesterol level is greater than 175 mg/dl or not?

H0: The average cholesterol level is less than or equal to 175 mg/dl

or H0: μ ≤ μ0 (=175)

H1: The average cholesterol level is greater than 175 mg/dl

or H1: μ > μ0 (=175)

24

Testing Procedures (cont.)

2. Choose an appropriate test statistic and its null distribution, such as the Z-statistic, the t-statistic, or the paired t-test

Under H0 (p235):

$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$, or $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t(n-1)$ (small sample or unknown variance)

3. Select the nominal significance level
Let Pr(type I error) = α = 5%

25

Example



Assume the population standard deviation is 12

Calculate test statistic=?
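A hedged sketch of the calculation. It assumes this example refers to the five cholesterol samples listed earlier (sample means 176, 174, 180, 170 with n = 36, and 178 with n = 100), tested against μ0 = 175 with σ = 12. That pairing is an assumption, since the slide's own data did not survive extraction, and `z_statistic` is an illustrative name.

```python
import math


def z_statistic(xbar, mu0, sigma, n):
    """One-sample Z statistic: standardized distance of xbar from mu0."""
    return (xbar - mu0) / (sigma / math.sqrt(n))


# (n, sample mean) pairs assumed to come from the earlier cholesterol slide
samples = [(36, 176), (36, 174), (36, 180), (36, 170), (100, 178)]
z_values = [z_statistic(xbar, mu0=175, sigma=12, n=n) for n, xbar in samples]

for (n, xbar), z in zip(samples, z_values):
    print(f"n = {n:3d}, xbar = {xbar}: Z = {z:+.2f}")
```

Note that a sample mean of 178 with n = 100 gives the same Z as a mean of 180 with n = 36, because the standard error shrinks as n grows.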

26

Error rates

                          H0 TRUE          H0 FALSE
Decision   Rej. H0        Type I error     correct
           Not Rej. H0    correct          Type II error

• A Type I error is committed if H0 is rejected when it is true, i.e. P(Type I error) = α

• A Type II error is committed if H0 is not rejected when H1 is true, i.e. P(Type II error) = β

• Power = 1 − P(not rej. H0 | H1) = 1 − β

p240

27

Error rate(cont.)

The average cholesterol level is greater than 175 mg/dl or not? H0: μ≤ μ0 (=175)

H1: μ>μ0 (=175)

What do Type I error, Type II error & Power mean?

28

Testing Procedures (cont.)

4. Determine the critical value, rejection region and decision rule:
For large samples, a two-sided test and α = 0.05, the critical value is z0.025 = 1.96 and the rejection region will be { |Z| > 1.96 }

Decision rule: reject H0 if the resulting test statistic is in the rejection region.

29

Significance level and the rejection region

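H0: μ ≥ 175 vs. H1: μ < 175: rejection region in the lower tail (area α)
H0: μ ≤ 175 vs. H1: μ > 175: rejection region in the upper tail (area α)
H0: μ = 175 vs. H1: μ ≠ 175: rejection regions in both tails (area α/2 each)

[Figure: rejection regions and critical value(s) for the three cases, each null distribution centered at 0]

$Z = \frac{\bar{X} - 175}{\sigma/\sqrt{n}} \sim N(0,1)$ or $t = \frac{\bar{X} - 175}{s/\sqrt{n}} \sim t(n-1)$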

30

Example



Assume the population standard deviation is 12

What is the result?
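The decision rule can be sketched as a small function that picks the rejection region from the direction of H1. It assumes a Z test with known σ = 12 and μ0 = 175, and uses two of the sample means from the earlier cholesterol slide as inputs; the function name `reject_h0` and its `alternative` labels are hypothetical.

```python
import math
from statistics import NormalDist


def reject_h0(z, alpha=0.05, alternative="greater"):
    """Return True if the Z statistic falls in the rejection region."""
    nd = NormalDist()
    if alternative == "greater":      # H1: mu > mu0, upper tail
        return z > nd.inv_cdf(1 - alpha)
    if alternative == "less":         # H1: mu < mu0, lower tail
        return z < nd.inv_cdf(alpha)
    if alternative == "two-sided":    # H1: mu != mu0, both tails
        return abs(z) > nd.inv_cdf(1 - alpha / 2)
    raise ValueError(f"unknown alternative: {alternative}")


# H0: mu <= 175 vs. H1: mu > 175, sigma = 12, n = 36
z_180 = (180 - 175) / (12 / math.sqrt(36))  # sample mean 180
z_176 = (176 - 175) / (12 / math.sqrt(36))  # sample mean 176
print("xbar = 180:", "reject H0" if reject_h0(z_180) else "fail to reject H0")
print("xbar = 176:", "reject H0" if reject_h0(z_176) else "fail to reject H0")
```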

31

Testing Procedures (cont.)

5. Testing result

Reject the null hypothesis H0
Sampling error is an unlikely explanation of the discrepancy between the null hypothesis and the observed values; the alternative hypothesis is supported, with a 5% risk of a type I error.

Fail to reject H0
Sampling error is a likely explanation, and the data fail to provide enough evidence to doubt the validity of the null hypothesis.

Do NOT claim that H0 is accepted.

32

Note: Test Statistics

A test statistic is a measure that quantifies whether the discrepancy between the observed descriptive statistic and the hypothetical value assumed under H0 exceeds the sampling error under H0

Usually based on the estimator of the parameter:
Sample mean for μ
Sample proportion for p
Sample variance for σ²

33

Take home exercise

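Suppose a population is known to follow the normal distribution N(μ, 16), and we wish to test whether the population mean is less than 30. A sample of 36 observations is drawn from the population.

1. Write down the null hypothesis H0 and the alternative hypothesis H1 for the test.

2. If the decision maker takes the rejection region to be $\bar{X} < 29$, what is the significance level?

3. Suppose the 36 observations give $\bar{X} = 28.5$. What are the conclusions of the test at significance levels 0.05 and 0.01, respectively?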

34

Excel:function

35

Excel:function

36

P-value approach

37

Making decisions

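[Figure: several candidate distributions (green, red, black) with the observed value t0 marked]

The observed test statistic is t0 = −2.25. Under which population are we most likely to see t0?

Green, red or black?

Given the null is true, observing values more extreme than t0 is not rare.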

38

P-value approach

P-value = P(T > T0 | H0), also called the observed level of significance
The probability of obtaining T as extreme as or more extreme than (≤ or ≥) T0, given H0 is true
It measures the degree of support for H0: the larger the p-value, the harder it is to reject H0; a small p-value only suggests that the data do not support H0

How small must the p-value be to conclude significance?

Strength of evidence
p > 0.05: non-significant
0.01 < p < 0.05: enough evidence for significance
0.001 < p < 0.01: strong evidence for significance

p234

39

P-value

For a one-sided test,

P-value = P(T > t0) (upper-tailed) or P(T < t0) (lower-tailed)

For a two-sided test,

P-value = 2P(T > |t0|)

[Figure: null distribution of T, either N(0,1) or t(df = n−1), with the observed value t0 marked]

Use the alternative hypothesis to find the direction of the rejection region.

40

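Example: Two-tailed test

H0: μ = 368
H1: μ ≠ 368

α = 0.05

n = 25, mean = 372.5

Critical value: ±1.96

Test statistic:

$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{372.5 - 368}{15/\sqrt{25}} = 1.50$

[Figure: standard normal curve with rejection regions Z < −1.96 and Z > 1.96, each of area .025; the observed value 1.50 lies between them]

Decision:

Do not reject at α = .05.

Conclusion:

Insufficient evidence that the true mean is not 368.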

41

P-value solution

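p-value = 2 × 0.0668 = 0.1336

(p-value = 0.1336) ≥ (α = 0.05): do not reject.

The test statistic 1.5 is in the non-rejection region (|Z| < 1.96)

[Figure: standard normal curve with rejection regions beyond ±1.96 (α = 0.05) and the observed value Z = 1.5 marked]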
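The numbers in this two-tailed example can be reproduced as follows. This is a minimal sketch using the values from the slides (mean 372.5, μ0 = 368, σ = 15, n = 25); the variable names are illustrative.

```python
import math
from statistics import NormalDist

xbar, mu0, sigma, n, alpha = 372.5, 368, 15, 25, 0.05

# Test statistic under H0: mu = 368
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Two-sided p-value: twice the upper-tail area beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Comparing the p-value with α and comparing Z with the critical value ±1.96 always lead to the same decision; the p-value additionally reports how close the result was to the boundary.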

42

C.I. vs. Hypothesis testing

H0:μ=μ0 vs. H1:μ≠μ0

If the 95% C.I. for μ contains μ0 , then 2-sided test is NOT significant at level 0.05.

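e.g. The rejection region:

$\left\{\, |t| = \left|\frac{\bar{X} - \mu_0}{s/\sqrt{n}}\right| > t_{(n-1,\,0.025)} \right\}$

The 95% C.I. for μ is

$\bar{X} \pm t_{(n-1,\,0.025)}\, s/\sqrt{n}$

$t \notin \text{rej. region} \iff \mu_0 \in \left(\bar{X} - t_{(n-1,\,0.025)}\, s/\sqrt{n},\ \bar{X} + t_{(n-1,\,0.025)}\, s/\sqrt{n}\right)$

[Figure: the interval (L, U) on the number line, with μ = μ0 lying inside it]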

43

C.I. vs. Hypothesis testing (2)

Test for H0:μ=0 vs. H1:μ≠0

Suppose the 95% CI is (-3, 6) What is the conclusion?

What if test for H0:μ=10 vs. H1:μ≠10
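The duality between the 95% C.I. and the two-sided test at level 0.05 reduces to a membership check. The sketch below applies it to the interval (−3, 6) from this slide; the function name `rejects_at_matching_level` is hypothetical.

```python
def rejects_at_matching_level(ci, mu0):
    """Two-sided test decision implied by a confidence interval.

    A 95% CI corresponds to a two-sided test at level 0.05:
    H0: mu = mu0 is rejected exactly when mu0 lies outside the interval.
    """
    lower, upper = ci
    return not (lower < mu0 < upper)


ci_95 = (-3, 6)
for mu0 in (0, 10):
    decision = "reject H0" if rejects_at_matching_level(ci_95, mu0) else "do not reject H0"
    print(f"H0: mu = {mu0}: {decision}")
```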

44

Summary

Any testing procedure has the same basic steps: H0 vs. H1, test statistic, level α, rejection region or p-value

This applies to the two-sample t-test, paired t-test, ANOVA test, the Chi-squared test, the test for a regression coefficient, etc.

Keep in mind that statistical significance means the result is unlikely to be due to chance alone; it does not always mean ‘important’.

45

The Power of a test

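• Power = 1 − P(not rej. H0 | H1) = 1 − β

• e.g. H0: μ ≤ 180, H1: μ > 180

One-sample Z-test, given α = 0.05, n = 25, σ = 46

$\text{power} = 1 - \beta(\mu) = P\left(\frac{\bar{X} - 180}{\sigma/\sqrt{n}} > z_{0.05} \,\middle|\, \mu > 180\right)$

$1 - \beta(200) = P\left(\frac{\bar{X} - 180}{46/\sqrt{25}} > 1.645 \,\middle|\, \mu = 200\right), \quad \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

$1 - \beta(220) = P\left(\frac{\bar{X} - 180}{46/\sqrt{25}} > 1.645 \,\middle|\, \mu = 220\right), \quad \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

Refer to p245, figure 10.3: power curve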
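The power at a given true mean follows from shifting the standardized statistic: under μ, $(\bar{X} - 180)/(\sigma/\sqrt{n})$ is normal with mean $(\mu - 180)/(\sigma/\sqrt{n})$ and variance 1. The sketch below evaluates this for the slide's setting (α = 0.05, n = 25, σ = 46); `power_upper_tailed_z` is an illustrative name.

```python
import math
from statistics import NormalDist


def power_upper_tailed_z(mu, mu0, sigma, n, alpha=0.05):
    """Power of the one-sample upper-tailed Z test at true mean mu."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha)               # about 1.645 for alpha = 0.05
    shift = (mu - mu0) / (sigma / math.sqrt(n))  # mean of Z under the true mu
    return 1 - nd.cdf(z_crit - shift)


for mu in (180, 200, 220):
    power = power_upper_tailed_z(mu, mu0=180, sigma=46, n=25)
    print(f"mu = {mu}: power = {power:.3f}")
```

At μ = 180, the boundary of H0, the "power" equals α; it rises toward 1 as the true mean moves away from 180, which traces out the power curve.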

46

Factors affecting the power

Smaller α, smaller power (i.e. larger β)
Trade-off between Type I error and power
If σ increases, then the power decreases
If the difference |μ−μ0| increases, then the power increases
Larger sample size, larger power

How many subjects are enough?

47

Sample size determination

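Perform the test for

H0: μ = μ0 vs. H1: μ = μ1

$n = \frac{\sigma^2\,(z_{1-\alpha} + z_{1-\beta})^2}{(\mu_0 - \mu_1)^2}$

H0: μ = μ0 vs. H1: μ ≠ μ0

$n = \frac{\sigma^2\,(z_{1-\alpha/2} + z_{1-\beta})^2}{(\mu_0 - \mu_1)^2}$

(Here $z_q$ denotes the q-th quantile of N(0,1), so $z_{1-\alpha}$ is the upper α percentile.)

Larger effect size, δ = μ0 − μ1, smaller n
Larger power, larger n
Larger σ, larger n
Smaller α, larger n

Let’s try the Minitab implementation!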
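As a language-neutral alternative to the Minitab demonstration, here is a hedged Python sketch of the two sample-size formulas. The inputs (μ0 = 180, μ1 = 200, σ = 46, power 0.80) are hypothetical values chosen to match the earlier power example, and `sample_size` is an illustrative name; the result is rounded up to the next whole subject.

```python
import math
from statistics import NormalDist


def sample_size(mu0, mu1, sigma, alpha=0.05, power=0.80, two_sided=False):
    """Subjects needed for a one-sample Z test to reach the target power at mu1."""
    nd = NormalDist()
    # z_{1-alpha} for a one-sided test, z_{1-alpha/2} for a two-sided test
    z_alpha = nd.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = nd.inv_cdf(power)  # z_{1-beta}, since power = 1 - beta
    n = sigma ** 2 * (z_alpha + z_beta) ** 2 / (mu0 - mu1) ** 2
    return math.ceil(n)


n_one_sided = sample_size(180, 200, 46)
n_two_sided = sample_size(180, 200, 46, two_sided=True)
print(f"one-sided: n = {n_one_sided}, two-sided: n = {n_two_sided}")
```

The two-sided design needs more subjects because its critical value $z_{1-\alpha/2}$ is larger than $z_{1-\alpha}$ for the same α.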

48

Midterm2

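Exam time: 12/3, 9:00–11:30 am
Coverage: Ch8-10 & the term definitions that have already been presented in class (from A to Z)

Question types: true/false, term definitions, and calculation problems (questions in Chinese/English)
Test formulas (t-test, z-test, etc.) will be provided; calculators or electronic translators may be used

Please turn off mobile phones
PS: Seats will be assigned on the day; please be seated early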
