### Transcript of Lecture 5: Hypothesis Testing eckel/biostat2/slides/ note about approaches to two-sided hypothesis...

• Lecture 5: Hypothesis Testing

Sandy Eckel

seckel@jhsph.edu

28 April 2008

1 / 29

• Recap: Statistical Inference

Estimation

Point estimation

Confidence intervals

Hypothesis Testing

Application to means of distributions for continuous variables, extension to proportions

Relation between confidence intervals and hypothesis testing

P-values, Type I error ($\alpha$), Type II error ($\beta$), Power ($1 - \beta$)

• Basic steps of Hypothesis Testing

Define the null hypothesis, H0

Define the alternative hypothesis, Ha, where Ha is usually of the form "not H0"

Define the type I error (probability of falsely rejecting the null), $\alpha$, usually 0.05

Calculate the test statistic

Calculate the p-value (probability of getting a result as or more extreme than observed if the null is true)

If the p-value is $\le \alpha$, reject H0

Otherwise, fail to reject H0

• Hypothesis test for a single mean I

Birthweight example

Assume a population of normally distributed birth weights with a known standard deviation, $\sigma = 1000$ grams

Birth weights are obtained on a sample of 10 infants; the sample mean is calculated as 2500 grams

Question: Is the mean birth weight in this population different from 3000 grams?

Set up a two-sided test of

$$H_0: \mu = 3000 \quad \text{vs.} \quad H_a: \mu \neq 3000$$

Let $\alpha = 0.05$ denote a 5% significance level

• Hypothesis test for a single mean II

Calculate the test statistic:

$$z_{obs} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{2500 - 3000}{1000/\sqrt{10}} = -1.58$$

What does this mean? Our observed mean is 1.58 standard errors below the hypothesized mean

The test statistic is the standardized value of our data assuming the null hypothesis is true

Question: If the true mean is 3000 grams, is our observed sample mean of 2500 common, or is this value unlikely to occur?

• Hypothesis test for a single mean III

Calculate the p-value to answer our question:

$$\text{p-value} = P(Z \le -|z_{obs}|) + P(Z \ge |z_{obs}|) = 2 \times 0.057 = 0.114$$

If the true mean is 3000 grams, our data or data more extreme than ours would occur in about 11 out of 100 studies (of the same size, n = 10)

In other words, in about 11 out of 100 studies with sample size n = 10, we would observe a sample mean as or more extreme than 2500 just by chance if the true mean is 3000 grams

What does this say about our hypothesis?

General guideline: if p-value $\le \alpha$, then reject H0

Conclusion: we fail to reject the null hypothesis since we chose $\alpha = 0.05$ and our p-value is 0.114
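The birthweight calculation can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `two_sided_z_test` is made up for illustration and is not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test for one mean, assuming the population SD is known."""
    # Standardize the sample mean under the null hypothesis
    z_obs = (xbar - mu0) / (sigma / sqrt(n))
    # Area in both tails at least as extreme as the observed statistic
    p_value = 2 * NormalDist().cdf(-abs(z_obs))
    return z_obs, p_value, p_value <= alpha


# Birthweight example: n = 10, sample mean 2500 g, sigma = 1000 g, H0: mu = 3000
z_obs, p_value, reject = two_sided_z_test(xbar=2500, mu0=3000, sigma=1000, n=10)
print(f"z_obs = {z_obs:.2f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

The p-value is computed as twice the lower-tail area at $-|z_{obs}|$, which equals the sum of the two tail probabilities because the standard normal distribution is symmetric.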

• A note about approaches to two-sided hypothesis testing

p-value: Calculate the test statistic (TS), get a p-value from the TS, and then reject the null hypothesis if p-value $\le \alpha$ or fail to reject the null if p-value $> \alpha$.

Critical region: Alternate, equivalent approach: calculate a critical value (CV) for the specified $\alpha$, compute the TS, and reject the null if $|TS| > |CV|$, saying that the p-value is $< \alpha$, or fail to reject the null if $|TS| < |CV|$, saying p-value $> \alpha$. You never calculate the actual p-value.

Confidence interval (CI): Another equivalent approach: create a $100(1 - \alpha)\%$ CI for the population parameter. If the CI does not contain the null hypothesis value, you reject the null hypothesis, saying that the p-value is $< \alpha$. If the CI contains the null hypothesis value, you fail to reject the null, saying p-value $> \alpha$. You don't calculate the actual p-value.

• Hypothesis test for a single mean: critical value

Birthweight example, cont...

Could also use the critical value approach

Based on our significance level ($\alpha = 0.05$) and assuming H0 is true, how far does our sample mean have to be from $H_0: \mu = 3000$ in order to reject?

Critical value = $z_c$ where $2 \times P(Z > |z_c|) = 0.05$

In our example, $z_c = 1.96$ and test statistic $z_{obs} = -1.58$

The rejection region is any value of our test statistic that is $\le -1.96$ or $\ge 1.96$

$|z_{obs}| < |z_c|$ since $|-1.58| < |1.96|$, so we fail to reject the null with p-value $> 0.05$

Decision is the same whether using the p-value or critical value
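As a rough check of the critical value approach, the sketch below finds $z_c$ from the inverse normal CDF and compares it with the observed statistic. It uses only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05

# z_c satisfies 2 * P(Z > z_c) = alpha, i.e. P(Z <= z_c) = 1 - alpha / 2
z_c = NormalDist().inv_cdf(1 - alpha / 2)

# Observed test statistic from the birthweight example
z_obs = (2500 - 3000) / (1000 / sqrt(10))

# Reject H0 only if the statistic falls in either tail of the rejection region
reject = abs(z_obs) >= z_c
print(f"z_c = {z_c:.2f}, |z_obs| = {abs(z_obs):.2f}, reject H0: {reject}")
```

No p-value is computed here; the decision comes entirely from comparing $|z_{obs}|$ with $z_c$.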

• Hypothesis test for a single mean: confidence interval

Birthweight example, cont...

An alternative approach for two-sided hypothesis testing is to calculate a $100(1 - \alpha)\%$ confidence interval for the mean

We are 95% confident that the interval (1880, 3120) contains the true population mean

$$\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \;\rightarrow\; 2500 \pm 1.96 \times \frac{1000}{\sqrt{10}}$$

The hypothesized true mean 3000 is a plausible value of the true mean given our data since it is in the CI

We cannot say that the true mean is different from 3000

We fail to reject the null hypothesis with p-value > 0.05

Same conclusion as with p-value and critical value approach!
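The confidence interval approach can be sketched the same way. This is a minimal illustration assuming a known population SD; the helper name `z_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """100(1 - alpha)% CI for a mean when the population SD is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


lower, upper = z_confidence_interval(xbar=2500, sigma=1000, n=10)

# Two-sided test at level alpha: reject H0 only if mu0 lies outside the CI
mu0 = 3000
reject = not (lower <= mu0 <= upper)
print(f"95% CI = ({lower:.0f}, {upper:.0f}), reject H0: {reject}")
```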

• General rule on the $100(1 - \alpha)\%$ confidence interval approach to two-sided hypothesis testing

If the null hypothesis value is not contained in the confidence interval, you reject the null hypothesis with p-value $\le \alpha$

If the null hypothesis value is contained in the confidence interval, you fail to reject the null hypothesis with p-value $> \alpha$

Note: The confidence interval approach doesn't work with one-sided tests, but the critical value and p-value approaches do

• P-values

Definition: The p-value for a hypothesis test is the probability of obtaining a value of the test statistic as or more extreme than the observed test statistic when the null hypothesis is true

The rejection region is determined by $\alpha$, the desired level of significance, that is, the probability of committing a type I error (falsely rejecting the null)

Reporting the p-value associated with a test gives an indication of how common or rare the computed value of the test statistic is, given that H0 is true

We often use $z_{obs}$ to denote the computed value of the test statistic

• Choosing the correct test statistic

Depends on population SD ($\sigma$) assumption and sample size

The test statistic depends on your assumptions on $\sigma$

When $\sigma$ is known, we have a standard normal test statistic

When $\sigma$ is unknown and

our sample size is relatively small, the test statistic has a t-distribution

our sample size is large, we have a standard normal test statistic (CLT)

The only difference in the procedure is that the calculation of the p-value or rejection region uses a t-distribution instead of a normal distribution

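• Summary table: Hypothesis tests for one mean, $H_0: \mu = \mu_0$, $H_a: \mu \neq \mu_0$

| Population distribution | Sample size | Population variance | Test statistic |
|---|---|---|---|
| Normal | Any | $\sigma^2$ known | $z_{obs} = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ |
| Normal | Any | $\sigma^2$ unknown | $t_{obs} = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$ (uses $s^2$, df $= n - 1$) |
| Not normal / unknown | Large | $\sigma^2$ known | $z_{obs} = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ |
| Not normal / unknown | Large | $\sigma^2$ unknown | $z_{obs} = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$ (uses $s^2$) |
| Not normal / unknown | Small | Any | Non-parametric methods |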

• Summary table: Hypothesis tests for one proportion, $H_0: p = p_0$, $H_a: p \neq p_0$

| Population distribution | Sample size | Test statistic |
|---|---|---|
| Binomial | Large | $z_{obs} = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$ |
| Binomial | Small | Exact methods |
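The large-sample test for one proportion follows the same steps as the z-test for a mean. The sketch below is illustrative only: the inputs (a sample proportion of 0.60 from n = 100, tested against $p_0 = 0.50$) are hypothetical numbers chosen for this example, not data from the lecture, and the function name is made up.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(p_hat, p0, n):
    """Large-sample two-sided z-test for a single proportion."""
    # Standard error is computed under the null value p0, not p_hat
    se0 = sqrt(p0 * (1 - p0) / n)
    z_obs = (p_hat - p0) / se0
    p_value = 2 * NormalDist().cdf(-abs(z_obs))
    return z_obs, p_value


# Hypothetical inputs for illustration only
z_obs, p_value = one_proportion_z_test(p_hat=0.60, p0=0.50, n=100)
print(f"z_obs = {z_obs:.2f}, p-value = {p_value:.4f}")
```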

• Moving from one to two means

So far, we've been looking at only a single mean. What happens when we want to compare the means in two groups?

We can compare two means by looking at the difference in the means

Consider the question: is $\mu_1 = \mu_2$?

This is equivalent to the question: is $\mu_1 - \mu_2 = 0$?

The work done for testing hypotheses about single means extends to comparing two means

Assumptions about the two population standard deviations determine the formula you'll use

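• Summary: Hypothesis tests for a difference of two means, $H_0: \mu_1 - \mu_2 = 0$, $H_a: \mu_1 - \mu_2 \neq 0$

| Population distribution | Sample size | Population variances | Test statistic |
|---|---|---|---|
| Normal | Any | Known | $z_{obs} = \dfrac{(\bar{X}_1 - \bar{X}_2) - 0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$ |
| Normal | Any | Unknown, assume $\sigma_1^2 = \sigma_2^2$ | $t_{obs} = \dfrac{(\bar{X}_1 - \bar{X}_2) - 0}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$, df $= n_1 + n_2 - 2$ |
| Normal | Any | Unknown, assume $\sigma_1^2 \neq \sigma_2^2$ | $t_{obs} = \dfrac{(\bar{X}_1 - \bar{X}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, df $= \nu$ |

where the pooled variance is

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

and the degrees of freedom for unequal variances are

$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$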

• Example: Hypothesis test for difference of two means (two independent samples) I

The EPREDA Trial: randomized, placebo-controlled trial to determine whether dipyridamole improves the efficacy of aspirin in preventing fetal growth retardation

Pregnant women randomized to placebo (n = 73) or to treatment (n = 156)

Mean birth weight was statistically significantly different in the two groups, with the mean weight in the treatment group being higher than the mean birth weight in the placebo group

Treatment group: 2751 (SD 670) grams

Placebo group: 2526 (SD 848) grams

We now have the knowledge to reproduce this result

• Example: Hypothesis test for difference of two means (two independent samples) II

Test the hypothesis:

$$H_0: \mu_{placebo} = \mu_{treated} \quad \text{vs.} \quad H_a: \mu_{placebo} \neq \mu_{treated}$$

at the 5% significance level ($\alpha = 0.05$)

The data are:

| Treatment | n | Mean (g) | SD (g) |
|---|---|---|---|
| Placebo | 73 | 2526 | 848 |
| Treated | 156 | 2751 | 670 |

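• Example: Hypothesis test for difference of two means (two independent samples) III

Calculate the test statistic assuming the variances are unequal (subscripts p and t denote the placebo and treated groups, so $s_p$ here is the placebo sample SD, not a pooled SD):

$$t_{obs} = \frac{(\bar{X}_p - \bar{X}_t) - 0}{\sqrt{\dfrac{s_p^2}{n_p} + \dfrac{s_t^2}{n_t}}} = \frac{2526 - 2751}{\sqrt{\dfrac{848^2}{73} + \dfrac{670^2}{156}}} = -1.99$$

The observed difference in mean birth weight comparing the placebo to treated groups is approximately 2 standard errors below the hypothesized difference of 0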

The degrees of freedom are:

$$\nu = \frac{\left(\dfrac{848^2}{73} + \dfrac{670^2}{156}\right)^2}{\dfrac{(848^2/73)^2}{73 - 1} + \dfrac{(670^2/156)^2}{156 - 1}}$$
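The unequal-variance test statistic and its degrees of freedom for the EPREDA summary data can be computed as sketched below. This only reproduces the arithmetic from the group means, SDs, and sample sizes given above; the function name `unequal_variance_t` is made up for illustration, and no p-value is computed because the Python standard library has no t-distribution.

```python
from math import sqrt


def unequal_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic and approximate df for two independent samples,
    assuming unequal population variances."""
    # Estimated variance of each sample mean
    v1 = sd1**2 / n1
    v2 = sd2**2 / n2
    # Difference in means divided by its standard error (null difference is 0)
    t_obs = (mean1 - mean2) / sqrt(v1 + v2)
    # Degrees of freedom from the formula for unequal variances
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_obs, df


# Placebo: n = 73, mean 2526, SD 848; Treated: n = 156, mean 2751, SD 670
t_obs, df = unequal_variance_t(2526, 848, 73, 2751, 670, 156)
print(f"t_obs = {t_obs:.2f}, df = {df:.1f}")
```

The resulting degrees of freedom are generally not an integer.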