Lecture 5: Hypothesis Testing
Sandy Eckel
seckel@jhsph.edu
28 April 2008
Recap: Statistical Inference
Estimation
Point estimation
Confidence intervals
Hypothesis Testing
Application to means of distributions for continuous variables, extension to proportions
Relation between confidence intervals and hypothesis testing
P-values, Type I error ($\alpha$), Type II error ($\beta$), Power ($1 - \beta$)
Basic steps of Hypothesis Testing
Define the null hypothesis, $H_0$
Define the alternative hypothesis, $H_a$, where $H_a$ is usually of the form "not $H_0$"
Define the type I error (probability of falsely rejecting the null), $\alpha$, usually 0.05
Calculate the test statistic
Calculate the p-value (probability of getting a result as or more extreme than observed if the null is true)
If the p-value is $\le \alpha$, reject $H_0$
Otherwise, fail to reject $H_0$
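The last two steps can be sketched in Python for the case where the test statistic is standard normal under the null. This is only an illustration: the function name `two_sided_z_test` and its arguments are assumptions made for this sketch, not part of the lecture.

```python
from statistics import NormalDist


def two_sided_z_test(test_statistic, alpha=0.05):
    """Return (p_value, decision) for a two-sided test.

    Assumes the test statistic follows a standard normal
    distribution when the null hypothesis is true.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Probability of a statistic as or more extreme than observed, both tails.
    p_value = 2 * std_normal.cdf(-abs(test_statistic))
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return p_value, decision


p_value, decision = two_sided_z_test(2.5)
print(f"p-value = {p_value:.4f}: {decision}")
```

The decision rule compares the p-value to alpha directly, so changing `alpha` changes only the decision, never the p-value.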
Hypothesis test for a single mean I
Birthweight example
Assume a population of normally distributed birth weights with a known standard deviation, $\sigma = 1000$ grams
Birth weights are obtained on a sample of 10 infants; the sample mean is calculated as 2500 grams
Question: Is the mean birth weight in this population different from 3000 grams?
Set up a two-sided test of
$H_0: \mu = 3000$
vs. $H_a: \mu \neq 3000$
Let $\alpha = 0.05$ denote a 5% significance level
Hypothesis test for a single mean II
Calculate the test statistic:
$$z_{obs} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{2500 - 3000}{1000/\sqrt{10}} = -1.58$$
What does this mean? Our observed mean is 1.58 standard errors below the hypothesized mean
The test statistic is the standardized value of our data assuming the null hypothesis is true
Question: If the true mean is 3000 grams, is our observed sample mean of 2500 common or is this value unlikely to occur?
Hypothesis test for a single mean III
Calculate the p-value to answer our question:
$$\text{p-value} = P(Z \le -|z_{obs}|) + P(Z \ge |z_{obs}|) = 2 \times 0.057 = 0.114$$
If the true mean is 3000 grams, our data or data more extreme than ours would occur in about 11 out of 100 studies (of the same size, n = 10)
In other words, if the true mean is 3000 grams, then in about 11 out of 100 studies with sample size n = 10 we would observe, just by chance, a sample mean at least as far from 3000 as 2500 is
What does this say about our hypothesis?
General guideline: if p-value $\le \alpha$, then reject $H_0$
Conclusion: we fail to reject the null hypothesis since we chose $\alpha = 0.05$ and our p-value is 0.114
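The birthweight calculation can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative choices, and the numbers are the ones given in the example.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 2500.0  # grams, observed on n = 10 infants
null_mean = 3000.0    # grams, value of the mean under H0
sigma = 1000.0        # grams, known population standard deviation
n = 10
alpha = 0.05

standard_error = sigma / sqrt(n)
z_obs = (sample_mean - null_mean) / standard_error

# Two-sided p-value: area in both tails beyond |z_obs|.
p_value = 2 * NormalDist().cdf(-abs(z_obs))
reject_null = p_value <= alpha

print(f"z_obs = {z_obs:.2f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```

Rounded to the precision used on the slides, this gives a test statistic of -1.58 and a p-value of 0.114, so the null hypothesis is not rejected.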
A note about approaches to two-sided hypothesis testing
p-value: Calculate the test statistic (TS), get a p-value from the TS, and then reject the null hypothesis if p-value $\le \alpha$ or fail to reject the null if p-value $> \alpha$.
Critical region: Alternate, equivalent approach: calculate a critical value (CV) for the specified $\alpha$, compute the TS, and reject the null if $|TS| > |CV|$, saying that the p-value is $< \alpha$; fail to reject the null if $|TS| < |CV|$, saying p-value $> \alpha$. You never calculate the actual p-value.
Confidence interval (CI): Another equivalent approach: create a $100(1-\alpha)\%$ CI for the population parameter. If the CI does not contain the null hypothesis value, you reject the null hypothesis, saying that the p-value is $< \alpha$. If the CI contains the null hypothesis value, you fail to reject the null, saying p-value $> \alpha$. You don't calculate the actual p-value.
Hypothesis test for a single mean: critical value
Birthweight example, cont...
Could also use the critical value approach
Based on our significance level ($\alpha = 0.05$) and assuming $H_0$ is true, how far does our sample mean have to be from $H_0: \mu = 3000$ in order to reject?
Critical value = $z_c$ where $2 \times P(Z > |z_c|) = 0.05$
In our example, $z_c = 1.96$ and test statistic $z_{obs} = -1.58$
The rejection region is any value of our test statistic that is $\le -1.96$ or $\ge 1.96$
$|z_{obs}| < |z_c|$ since $|-1.58| < |1.96|$, so we fail to reject the null with p-value > 0.05
Decision is the same whether using the p-value or critical value
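The critical value itself can be obtained from the inverse of the standard normal CDF. The sketch below assumes the same birthweight numbers as above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05

# Upper alpha/2 quantile of the standard normal: 2 * P(Z > z_crit) = alpha.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Test statistic from the birthweight example.
z_obs = (2500 - 3000) / (1000 / sqrt(10))

# Two-sided rejection region: z_obs <= -z_crit or z_obs >= z_crit.
in_rejection_region = abs(z_obs) >= z_crit

print(f"z_crit = {z_crit:.2f}, |z_obs| = {abs(z_obs):.2f}, reject H0: {in_rejection_region}")
```

No p-value is computed here; the decision comes only from comparing the test statistic to the critical value.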
Hypothesis test for a single mean: confidence interval
Birthweight example, cont...
An alternative approach for two-sided hypothesis testing is to calculate a $100(1-\alpha)\%$ confidence interval for the mean
We are 95% confident that the interval (1880, 3120) contains the true population mean
$$\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \;\rightarrow\; 2500 \pm 1.96 \times \frac{1000}{\sqrt{10}}$$
The hypothesized mean of 3000 is a plausible value of the true mean given our data since it is in the CI
We cannot say that the true mean is different from 3000
We fail to reject the null hypothesis with p-value > 0.05
Same conclusion as with p-value and critical value approach!
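The interval for the birthweight example can be reproduced as follows. This is a small sketch under the example's assumptions (normal population, known standard deviation); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, sigma, n = 2500.0, 1000.0, 10
null_mean = 3000.0
alpha = 0.05

z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
margin = z_crit * sigma / sqrt(n)             # half-width of the interval

ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

# Fail to reject H0 when the null value lies inside the interval.
null_inside_ci = ci_lower <= null_mean <= ci_upper

print(f"95% CI: ({ci_lower:.0f}, {ci_upper:.0f}); contains 3000: {null_inside_ci}")
```

Rounded to whole grams, the limits match the interval (1880, 3120) quoted above.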
General rule on the $100(1-\alpha)\%$ confidence interval approach to two-sided hypothesis testing
If the null hypothesis value is not contained in the confidence interval, you reject the null hypothesis with p-value $\le \alpha$
If the null hypothesis value is contained in the confidence interval, you fail to reject the null hypothesis with p-value $> \alpha$
Note: The confidence interval approach doesn't work with one-sided tests, but the critical value and p-value approaches do
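One way to convince yourself of this rule is to check numerically that the confidence interval decision and the p-value decision agree. The sketch below does this for a grid of hypothetical sample means, reusing the birthweight setup (known standard deviation of 1000 grams, n = 10); the function names and the grid are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

SIGMA, N, NULL_MEAN, ALPHA = 1000.0, 10, 3000.0, 0.05
STD_NORMAL = NormalDist()
SE = SIGMA / sqrt(N)
Z_CRIT = STD_NORMAL.inv_cdf(1 - ALPHA / 2)


def reject_by_p_value(sample_mean):
    """Two-sided z-test decision based on the p-value."""
    z_obs = (sample_mean - NULL_MEAN) / SE
    return 2 * STD_NORMAL.cdf(-abs(z_obs)) <= ALPHA


def reject_by_confidence_interval(sample_mean):
    """Decision based on whether the 100(1 - alpha)% CI excludes the null value."""
    lower = sample_mean - Z_CRIT * SE
    upper = sample_mean + Z_CRIT * SE
    return not (lower <= NULL_MEAN <= upper)


candidate_means = range(2000, 4001, 50)  # hypothetical sample means, in grams
decisions_agree = all(
    reject_by_p_value(m) == reject_by_confidence_interval(m) for m in candidate_means
)
print("CI and p-value decisions agree on every candidate mean:", decisions_agree)
```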
P-values
Definition: The p-value for a hypothesis test is the probability of obtaining a value of the test statistic as or more extreme than the observed test statistic when the null hypothesis is true
The rejection region is determined by $\alpha$, the desired level of significance, that is, the probability of committing a type I error (falsely rejecting the null)
Reporting the p-value associated with a test gives an indication of how common or rare the computed value of the test statistic is, given that $H_0$ is true
We often use $z_{obs}$ to denote the computed value of the test statistic
Choosing the correct test statistic
Depends on population sd ($\sigma$) assumption and sample size
The test statistic depends on your assumptions on $\sigma$
When $\sigma$ is known, we have a standard normal test statistic
When $\sigma$ is unknown and
our sample size is relatively small, the test statistic has a t-distribution
our sample size is large, we have a standard normal test statistic (CLT)
The only difference in the procedure is that the calculation of the p-value or rejection region uses a t- instead of a normal distribution
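To see how much the choice of reference distribution matters, the sketch below compares two-sided p-values for the same statistic under a standard normal and under a t-distribution. Python's standard library has no t-distribution, so the t tail area is approximated here by numerically integrating the t density with Simpson's rule; that is a stand-in chosen for this illustration, and in practice a statistics package would normally be used. The function names are assumptions.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_sided_p(t_obs, df, steps=2000):
    """Approximate two-sided p-value using Simpson's rule (steps must be even)."""
    upper = abs(t_obs)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t_obs|)
    return 2 * (0.5 - central_half)


def z_two_sided_p(z_obs):
    """Two-sided p-value under a standard normal distribution."""
    return 2 * NormalDist().cdf(-abs(z_obs))


# Same statistic, n = 10 (df = 9): the t-distribution has heavier tails.
print(f"normal: {z_two_sided_p(1.58):.3f}, t with 9 df: {t_two_sided_p(1.58, 9):.3f}")
```

With a small sample the t-based p-value is larger than the normal-based one for the same statistic, which is why the distinction matters most when n is small.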
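Summary table: Hypothesis tests for one mean
$H_0: \mu = \mu_0$, $H_a: \mu \neq \mu_0$
Population Distribution   Sample Size   Population Variance   Test Statistic
Normal                    Any           $\sigma^2$ known      $z_{obs} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Normal                    Any           $\sigma^2$ unknown    $t_{obs} = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ (uses $s^2$, df = n - 1)
Not Normal/Unknown        Large         $\sigma^2$ known      $z_{obs} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Not Normal/Unknown        Large         $\sigma^2$ unknown    $z_{obs} = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ (uses $s^2$)
Not Normal/Unknown        Small         Any                   Non-parametric methods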
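For the row with unknown variance, the statistic is computed from the sample itself. The sketch below shows one way to do that; the function name `one_sample_t` and the data values are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, null_mean):
    """Return (t_obs, df) for a one-sample t-test of H0: mu = null_mean."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    t_obs = (mean(sample) - null_mean) / (s / sqrt(n))
    return t_obs, n - 1


# Hypothetical birth weights in grams, not data from the lecture.
hypothetical_weights = [2900, 3100, 2500, 3400, 2700, 3000, 2600, 3300]
t_obs, df = one_sample_t(hypothetical_weights, 3000)
print(f"t_obs = {t_obs:.2f} on {df} degrees of freedom")
```

The resulting statistic would be compared to a t-distribution with n - 1 degrees of freedom rather than to the standard normal.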
Summary table: Hypothesis tests for one proportion
$H_0: p = p_0$, $H_a: p \neq p_0$
Population Distribution   Sample Size   Test Statistic
Binomial                  Large         $z_{obs} = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$
Binomial                  Small         Exact methods
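A large-sample test for one proportion follows the same pattern as the z-test for a mean. The sketch below is illustrative only: the function name and the counts (60 successes in 100 trials) are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Return (z_obs, p_value) for a two-sided large-sample test of H0: p = p0."""
    p_hat = successes / n
    # The standard error uses p0, the value of the proportion under H0.
    se_null = sqrt(p0 * (1 - p0) / n)
    z_obs = (p_hat - p0) / se_null
    p_value = 2 * NormalDist().cdf(-abs(z_obs))
    return z_obs, p_value


# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5.
z_obs, p_value = one_proportion_z(60, 100, 0.5)
print(f"z_obs = {z_obs:.2f}, p-value = {p_value:.4f}")
```

Note that the standard error is computed under the null value, not from the observed proportion, matching the formula in the table.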
Moving from one to two means
So far, we've been looking at only a single mean. What happens when we want to compare the means in two groups?
We can compare two means by looking at the difference in the means
Consider the question: is $\mu_1 = \mu_2$?
This is equivalent to the question: is $\mu_1 - \mu_2 = 0$?
The work done for testing hypotheses about single means extends to comparing two means
Assumptions about the two population standard deviations determine the formula you'll use
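Summary: Hypothesis tests for a difference of two means
$H_0: \mu_1 - \mu_2 = \mu_0$, $H_a: \mu_1 - \mu_2 \neq \mu_0$
Population Distribution   Sample Size   Population Variances                              Test Statistic
Normal                    Any           Known                                             $z_{obs} = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Normal                    Any           Unknown, assume $\sigma_1^2 = \sigma_2^2$         $t_{obs} = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_0}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$, df = $n_1 + n_2 - 2$
Normal                    Any           Unknown, assume $\sigma_1^2 \neq \sigma_2^2$      $t_{obs} = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$, df = $\nu$
where the pooled variance is
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
and the degrees of freedom for the unequal-variance case are
$$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$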
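The two unknown-variance rows of the table can be sketched as small Python functions that work from summary statistics. The function names and the `equal_var` and `null_diff` parameters are assumptions made for this illustration; the unequal-variance degrees of freedom formula is the one commonly called the Welch-Satterthwaite approximation.

```python
from math import sqrt


def pooled_variance(s1, n1, s2, n2):
    """Pooled variance s_p^2, used when the population variances are assumed equal."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom when the variances are assumed unequal."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def two_sample_t(mean1, s1, n1, mean2, s2, n2, equal_var=False, null_diff=0.0):
    """Return (t_obs, df) from summary statistics of two independent samples."""
    if equal_var:
        sp2 = pooled_variance(s1, n1, s2, n2)
        se = sqrt(sp2 / n1 + sp2 / n2)
        df = n1 + n2 - 2
    else:
        se = sqrt(s1**2 / n1 + s2**2 / n2)
        df = welch_df(s1, n1, s2, n2)
    return ((mean1 - mean2) - null_diff) / se, df
```

When the two sample sizes and standard deviations are identical, both branches give the same statistic and the same degrees of freedom, which is a quick sanity check on the formulas.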
Example: Hypothesis test for difference of two means (two independent samples) I
The EPREDA Trial: randomized, placebo-controlled trial to determine whether dipyridamole improves the efficacy of aspirin in preventing fetal growth retardation
Pregnant women randomized to placebo (n=73) or to treatment (n=156)
Mean birth weight was statistically significantly different in the two groups, with the mean weight in the treatment group being higher than the mean birth weight in the placebo group
Treatment group: 2751 (SD 670) grams
Placebo group: 2526 (SD 848) grams
We now have the knowledge to reproduce this result
Example: Hypothesis test for difference of two means (two independent samples) II
Test the hypothesis:
$H_0: \mu_{placebo} = \mu_{treated}$
vs. $H_a: \mu_{placebo} \neq \mu_{treated}$
at the 5% significance level ($\alpha = 0.05$)
The data are:
Treatment n mean SD
Placebo 73 2526 848
Treated 156 2751 670
Example: Hypothesis test for difference of two means (two independent samples) III
Calculate the test statistic assuming the variances are unequal:
$$t_{obs} = \frac{(\bar{X}_p - \bar{X}_t) - 0}{\sqrt{\frac{s_p^2}{n_p} + \frac{s_t^2}{n_t}}} = \frac{2526 - 2751}{\sqrt{\frac{848^2}{73} + \frac{670^2}{156}}} = -1.99$$
The observed difference in mean birth weight comparing the placebo to treated groups is approximately 2 standard errors below the hypothesized difference of 0
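The arithmetic for the test statistic can be checked directly from the summary data. This is a minimal sketch; the variable names are illustrative and the numbers are those from the data table (placebo: n = 73, mean 2526, SD 848; treated: n = 156, mean 2751, SD 670).

```python
from math import sqrt

n_placebo, mean_placebo, sd_placebo = 73, 2526.0, 848.0
n_treated, mean_treated, sd_treated = 156, 2751.0, 670.0
null_difference = 0.0  # H0: the two population means are equal

# Standard error of the difference, without assuming equal variances.
standard_error = sqrt(sd_placebo**2 / n_placebo + sd_treated**2 / n_treated)

t_obs = ((mean_placebo - mean_treated) - null_difference) / standard_error

print(f"standard error = {standard_error:.1f} grams, t_obs = {t_obs:.2f}")
```

The standard error is about 113 grams, so the observed difference of -225 grams is roughly two standard errors below zero.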
The degrees of freedom are:
$$\nu = \frac{\left(\frac{848^2}{73} + \frac{670^2}{156}\right)^2}{\frac{(848^2/73)^2}{73 - 1} + \frac{(670^2/156)^2}{156 - 1}}$$
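The degrees of freedom expression is tedious by hand but easy to evaluate in code. The sketch below plugs in the same summary statistics; the variable names are illustrative.

```python
n_placebo, sd_placebo = 73, 848.0
n_treated, sd_treated = 156, 670.0

# Estimated variance of each sample mean.
var_mean_placebo = sd_placebo**2 / n_placebo
var_mean_treated = sd_treated**2 / n_treated

numerator = (var_mean_placebo + var_mean_treated) ** 2
denominator = (
    var_mean_placebo**2 / (n_placebo - 1)
    + var_mean_treated**2 / (n_treated - 1)
)
nu = numerator / denominator

print(f"approximate degrees of freedom: {nu:.1f}")
```

The result is not an integer and falls between the smaller group's n - 1 (72) and the pooled value n1 + n2 - 2 (227), as expected for this approximation.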