Lecture 5: Hypothesis Testing eckel/biostat2/slides/ note about approaches to two-sided hypothesis...


  • Lecture 5: Hypothesis Testing

    Sandy Eckel
    seckel@jhsph.edu

    28 April 2008

    1 / 29

  • Recap: Statistical Inference

    Estimation

      Point estimation
      Confidence intervals

    Hypothesis Testing

      Application to means of distributions for continuous variables, extension to proportions
      Relation between confidence intervals and hypothesis testing
      P-values, Type I error ($\alpha$), Type II error ($\beta$), Power ($1 - \beta$)

    2 / 29

  • Basic steps of Hypothesis Testing

    Define the null hypothesis, $H_0$

    Define the alternative hypothesis, $H_a$, where $H_a$ is usually of the form "not $H_0$"

    Define the type I error (probability of falsely rejecting the null), $\alpha$, usually 0.05

    Calculate the test statistic

    Calculate the p-value (probability of getting a result as extreme as or more extreme than the one observed, if the null is true)

    If the p-value is $\le \alpha$, reject $H_0$

    Otherwise, fail to reject $H_0$
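    The steps above can be sketched as a small function for the simplest case, a two-sided z-test for one mean with a known population standard deviation. This is a minimal illustration rather than part of the lecture: the function name `two_sided_z_test` and the numbers in the example call are hypothetical.

```python
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test for one mean, assuming a known population SD."""
    # Test statistic: (estimate - null value) / standard error
    z_obs = (sample_mean - mu0) / (sigma / n ** 0.5)
    # p-value: probability of a statistic as or more extreme, if H0 is true
    p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))
    # Decision: compare the p-value with the chosen type I error rate
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return z_obs, p_value, decision


# Hypothetical numbers, chosen only to illustrate the call
z_obs, p_value, decision = two_sided_z_test(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(round(z_obs, 2), round(p_value, 4), decision)
```

    The hypotheses and the type I error rate are fixed by the caller through `mu0` and `alpha`, before any data are examined.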

    3 / 29

  • Hypothesis test for a single mean I

    Birthweight example

    Assume a population of normally distributed birth weights with a known standard deviation, $\sigma = 1000$ grams

    Birth weights are obtained on a sample of 10 infants; the sample mean is calculated as 2500 grams

    Question: Is the mean birth weight in this population different from 3000 grams?

    Set up a two-sided test of

    $H_0: \mu = 3000$ vs. $H_a: \mu \neq 3000$

    Let $\alpha = 0.05$ denote a 5% significance level

    4 / 29

  • Hypothesis test for a single mean II

    Calculate the test statistic:

    $$z_{obs} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{2500 - 3000}{1000/\sqrt{10}} = -1.58$$

    What does this mean? Our observed mean is 1.58 standard errors below the hypothesized mean

    The test statistic is the standardized value of our data assuming the null hypothesis is true

    Question: If the true mean is 3000 grams, is our observed sample mean of 2500 common, or is this value unlikely to occur?

    5 / 29

  • Hypothesis test for a single mean III

    Calculate the p-value to answer our question:

    $$\text{p-value} = P(Z \le -|z_{obs}|) + P(Z \ge |z_{obs}|) = 2 \times 0.057 = 0.114$$

    If the true mean is 3000 grams, our data or data more extreme than ours would occur in about 11 out of 100 studies (of the same size, n = 10)

    In other words, in about 11 out of 100 studies with sample size n = 10, we would observe, just by chance, a sample mean at least as far from 3000 as 2500 is, if the true mean is 3000 grams

    What does this say about our hypothesis?

    General guideline: if p-value $\le \alpha$, then reject $H_0$

    Conclusion: we fail to reject the null hypothesis since we chose $\alpha = 0.05$ and our p-value is 0.114
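    The birth weight calculation can be checked numerically. The sketch below uses only the Python standard library (`statistics.NormalDist` for the standard normal CDF); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the birth weight example
x_bar, mu0, sigma, n = 2500, 3000, 1000, 10

standard_error = sigma / sqrt(n)
z_obs = (x_bar - mu0) / standard_error

# Two-sided p-value: area in both tails beyond |z_obs| under the standard normal
p_value = 2 * NormalDist().cdf(-abs(z_obs))

print(f"z_obs = {z_obs:.2f}, p-value = {p_value:.3f}")
```

    Up to rounding, this agrees with the test statistic of -1.58 and the p-value of 0.114 used in the example.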

    6 / 29

  • A note about approaches to two-sided hypothesis testing

    p-value: Calculate the test statistic (TS), get a p-value from the TS, and then reject the null hypothesis if p-value $\le \alpha$ or fail to reject the null if p-value $> \alpha$

    Critical region: Alternate, equivalent approach: calculate a critical value (CV) for the specified $\alpha$, compute the TS, and reject the null if |TS| > |CV|, saying that the p-value is $< \alpha$, or fail to reject the null if |TS| < |CV|, saying that the p-value is $> \alpha$. You never calculate the actual p-value.

    Confidence interval (CI): Another equivalent approach: create a $100(1-\alpha)\%$ CI for the population parameter. If the CI contains the null hypothesis value, you fail to reject the null hypothesis, saying that the p-value is $> \alpha$. If the CI does not contain the null hypothesis value, you reject the null, saying that the p-value is $< \alpha$. You don't calculate the actual p-value.

    7 / 29
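    To see that the three approaches lead to the same decision, the sketch below applies each of them to a one-mean z-test with known standard deviation. The function name `three_decisions` and the grid of sample means are hypothetical, chosen only to include cases on both sides of the rejection boundary.

```python
from math import sqrt
from statistics import NormalDist


def three_decisions(x_bar, mu0, sigma, n, alpha=0.05):
    """Return (p-value, critical value, CI) decisions; True means reject H0."""
    se = sigma / sqrt(n)
    z_obs = (x_bar - mu0) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    by_p_value = 2 * NormalDist().cdf(-abs(z_obs)) <= alpha
    by_critical_value = abs(z_obs) >= z_crit
    lower, upper = x_bar - z_crit * se, x_bar + z_crit * se
    # Reject when the null value falls outside the confidence interval
    by_confidence_interval = not (lower < mu0 < upper)
    return by_p_value, by_critical_value, by_confidence_interval


# Hypothetical sample means tested against mu0 = 3000 with sigma = 1000, n = 10
results = {
    x_bar: three_decisions(x_bar, mu0=3000, sigma=1000, n=10)
    for x_bar in (2000, 2300, 2500, 2900, 3700)
}
for x_bar, decisions in results.items():
    print(x_bar, decisions)
```

    For every sample mean in the grid, the three entries of the tuple agree.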

  • Hypothesis test for a single mean: critical value

    Birthweight example, cont...

    Could also use the critical value approach

    Based on our significance level ($\alpha = 0.05$) and assuming $H_0$ is true, how far does our sample mean have to be from $H_0: \mu = 3000$ in order to reject?

    Critical value = $z_c$, where $2 \times P(Z > |z_c|) = 0.05$

    In our example, $z_c = 1.96$ and the test statistic is $z_{obs} = -1.58$

    The rejection region is any value of our test statistic that is $\le -1.96$ or $\ge 1.96$

    $|z_{obs}| < |z_c|$ since $|-1.58| < |1.96|$, so we fail to reject the null with p-value > 0.05

    Decision is the same whether using the p-value or critical value
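    The critical value itself comes from the inverse of the standard normal CDF. A minimal sketch, using `statistics.NormalDist.inv_cdf` from the Python standard library and the rounded test statistic from the example:

```python
from statistics import NormalDist

alpha = 0.05
# Two-sided test: put alpha/2 in each tail of the standard normal
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

z_obs = -1.58  # test statistic from the birth weight example
reject = abs(z_obs) >= z_crit

print(f"z_c = {z_crit:.2f}, reject H0: {reject}")
```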

    8 / 29

  • Hypothesis test for a single mean: confidence interval

    Birthweight example, cont...

    An alternative approach for two-sided hypothesis testing is to calculate a $100(1-\alpha)\%$ confidence interval for the mean

    $$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \;\rightarrow\; 2500 \pm 1.96 \times \frac{1000}{\sqrt{10}}$$

    We are 95% confident that the interval (1880, 3120) contains the true population mean

    The hypothetical true mean 3000 is a plausible value of the true mean given our data since it is in the CI

    We cannot say that the true mean is different from 3000

    We fail to reject the null hypothesis with p-value > 0.05

    Same conclusion as with p-value and critical value approach!
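    The interval for the birth weight example can be reproduced as follows. This is a sketch with illustrative variable names; it uses the exact normal quantile rather than the rounded 1.96, which does not change the interval at the precision reported.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 2500, 1000, 10
alpha, mu0 = 0.05, 3000

z = NormalDist().inv_cdf(1 - alpha / 2)   # approximately 1.96
margin = z * sigma / sqrt(n)              # z times the standard error
lower, upper = x_bar - margin, x_bar + margin

print(f"95% CI: ({lower:.0f}, {upper:.0f})")
print("null value inside CI:", lower < mu0 < upper)
```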

    9 / 29

  • General rule on the $100(1-\alpha)\%$ confidence interval approach to two-sided hypothesis testing

    If the null hypothesis value is not contained in the confidence interval, you reject the null hypothesis with p-value $< \alpha$

    If the null hypothesis value is contained in the confidence interval, you fail to reject the null hypothesis with p-value $> \alpha$

    Note: The confidence interval approach doesn't work with one-sided tests, but the critical value and p-value approaches do

    10 / 29

  • P-values

    Definition: The p-value for a hypothesis test is the probability of obtaining a value of the test statistic as extreme as or more extreme than the observed test statistic when the null hypothesis is true

    The rejection region is determined by $\alpha$, the desired level of significance, that is, the probability of committing a type I error (falsely rejecting the null)

    Reporting the p-value associated with a test gives an indication of how common or rare the computed value of the test statistic is, given that $H_0$ is true

    We often use $z_{obs}$ to denote the computed value of the test statistic

    11 / 29

  • Choosing the correct test statistic

    Depends on population sd ($\sigma$) assumption and sample size

    The test statistic depends on your assumptions about $\sigma$

    When $\sigma$ is known, we have a standard normal test statistic

    When $\sigma$ is unknown and

      our sample size is relatively small, the test statistic has a t-distribution
      our sample size is large, we have a standard normal test statistic (CLT)

    The only difference in the procedure is that the calculation of the p-value or rejection region uses a t-distribution instead of a normal distribution

    12 / 29

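  • Summary table: Hypothesis tests for one mean, $H_0: \mu = \mu_0$, $H_a: \mu \neq \mu_0$

    Population distribution | Sample size | Population variance | Test statistic
    Normal                  | Any         | $\sigma^2$ known    | $z_{obs} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
    Normal                  | Any         | $\sigma^2$ unknown  | $t_{obs} = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ (uses $s^2$, df = $n - 1$)
    Not normal / unknown    | Large       | $\sigma^2$ known    | $z_{obs} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
    Not normal / unknown    | Large       | $\sigma^2$ unknown  | $z_{obs} = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ (uses $s^2$)
    Not normal / unknown    | Small       | Any                 | Non-parametric methods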
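    The table can be read as a small decision rule. The sketch below encodes it as a lookup; the function name `one_mean_test` and its boolean arguments are hypothetical, and what counts as a "large" sample is left to the caller because the table does not give a cutoff.

```python
def one_mean_test(population_normal: bool, large_sample: bool, variance_known: bool) -> str:
    """Return the test statistic suggested by the one-mean summary table."""
    if population_normal:
        # Normal population: valid for any sample size
        return "z (sigma known)" if variance_known else "t (uses s, df = n - 1)"
    if large_sample:
        # Not normal or unknown distribution: rely on the CLT
        return "z (sigma known)" if variance_known else "z (uses s)"
    # Small sample from a non-normal or unknown distribution
    return "non-parametric methods"


print(one_mean_test(population_normal=True, large_sample=False, variance_known=False))
print(one_mean_test(population_normal=False, large_sample=True, variance_known=False))
print(one_mean_test(population_normal=False, large_sample=False, variance_known=True))
```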

    13 / 29

  • Summary table: Hypothesis tests for one proportion, $H_0: p = p_0$, $H_a: p \neq p_0$

    Population distribution | Sample size | Test statistic
    Binomial                | Large       | $z_{obs} = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$
    Binomial                | Small       | Exact methods
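    A sketch of the large-sample test for one proportion follows. The function name `one_proportion_z_test` and the data in the example call (60 successes in 100 trials, tested against $p_0 = 0.5$) are hypothetical. Note that the standard error is computed under the null value $p_0$, not the observed proportion.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Large-sample two-sided z-test for one proportion."""
    p_hat = successes / n
    se_null = sqrt(p0 * (1 - p0) / n)  # standard error assuming H0: p = p0
    z_obs = (p_hat - p0) / se_null
    p_value = 2 * NormalDist().cdf(-abs(z_obs))
    return z_obs, p_value


# Hypothetical data, chosen only to illustrate the call
z_obs, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(round(z_obs, 2), round(p_value, 4))
```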

    14 / 29

  • Moving from one to two means

    So far, we've been looking at only a single mean. What happens when we want to compare the means in two groups?

    We can compare two means by looking at the difference in the means

    Consider the question: is $\mu_1 = \mu_2$?

    This is equivalent to the question: is $\mu_1 - \mu_2 = 0$?

    The work done for testing hypotheses about single means extends to comparing two means

    Assumptions about the two population standard deviations determine the formula you'll use

    15 / 29

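  • Summary: Hypothesis tests for a difference of two means, $H_0: \mu_1 - \mu_2 = \mu_0$, $H_a: \mu_1 - \mu_2 \neq \mu_0$

    Population distribution | Sample size | Population variances | Test statistic
    Normal | Any | Known | $z_{obs} = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
    Normal | Any | Unknown, assume $\sigma_1^2 = \sigma_2^2$ | $t_{obs} = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_0}{\sqrt{s_p^2/n_1 + s_p^2/n_2}}$, df = $n_1 + n_2 - 2$
    Normal | Any | Unknown, assume $\sigma_1^2 \neq \sigma_2^2$ | $t_{obs} = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$, df = $\nu$

    where the pooled variance is

    $$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

    and the degrees of freedom for the unequal-variance case are

    $$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$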
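    As an illustration of the equal-variance row, the sketch below computes the pooled variance, the test statistic and its degrees of freedom from summary statistics. The function names and the numbers in the example call are hypothetical. A p-value would additionally need the t-distribution CDF, which the Python standard library does not provide, so it is left out here.

```python
from math import sqrt


def pooled_variance(n1, s1, n2, s2):
    """Weighted average of the two sample variances, weights n_i - 1."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)


def pooled_t_statistic(n1, mean1, s1, n2, mean2, s2, null_diff=0.0):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    sp2 = pooled_variance(n1, s1, n2, s2)
    se = sqrt(sp2 / n1 + sp2 / n2)
    t_obs = ((mean1 - mean2) - null_diff) / se
    return t_obs, n1 + n2 - 2


# Hypothetical summary statistics, chosen only to illustrate the call
sp2 = pooled_variance(n1=10, s1=2.0, n2=12, s2=2.0)
t_obs, df = pooled_t_statistic(n1=10, mean1=5.0, s1=2.0, n2=12, mean2=4.0, s2=2.0)
print(sp2, round(t_obs, 3), df)
```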

    16 / 29

  • Example: Hypothesis test for difference of two means (two independent samples) I

    The EPREDA Trial: randomized, placebo-controlled trial to determine whether dipyridamole improves the efficacy of aspirin in preventing fetal growth retardation

    Pregnant women randomized to placebo (n = 73) or to treatment (n = 156)

    Mean birth weight was statistically significantly different in the two groups, with the mean weight in the treatment group being higher than the mean birth weight in the placebo group

    Treatment group: 2751 (SD 670) grams
    Placebo group: 2526 (SD 848) grams

    We now have the knowledge to reproduce this result

    17 / 29

  • Example: Hypothesis test for difference of two means (two independent samples) II

    Test the hypothesis:

    $H_0: \mu_{placebo} = \mu_{treated}$ vs. $H_a: \mu_{placebo} \neq \mu_{treated}$

    at the 5% significance level ($\alpha = 0.05$)

    The data are:

    Treatment | n   | Mean | SD
    Placebo   | 73  | 2526 | 848
    Treated   | 156 | 2751 | 670

    18 / 29

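  • Example: Hypothesis test for difference of two means (two independent samples) III

    Calculate the test statistic assuming the variances are unequal (here the subscripts p and t denote the placebo and treated groups; $s_p$ is not the pooled standard deviation):

    $$t_{obs} = \frac{(\bar{X}_p - \bar{X}_t) - 0}{\sqrt{\frac{s_p^2}{n_p} + \frac{s_t^2}{n_t}}} = \frac{2526 - 2751}{\sqrt{\frac{848^2}{73} + \frac{670^2}{156}}} = -1.99$$

    The observed difference in mean birth weight comparing the placebo to treated groups is approximately 2 standard errors below the hypothesized difference of 0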
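    The unequal-variance test statistic can be reproduced from the summary data (placebo: n = 73, mean 2526, SD 848; treated: n = 156, mean 2751, SD 670). The variable names below are illustrative.

```python
from math import sqrt

# Summary data from the EPREDA example
n_p, mean_p, sd_p = 73, 2526, 848    # placebo
n_t, mean_t, sd_t = 156, 2751, 670   # treated

# Standard error of the difference, without pooling the variances
se_diff = sqrt(sd_p ** 2 / n_p + sd_t ** 2 / n_t)

# Test statistic for H0: difference in means = 0
t_obs = ((mean_p - mean_t) - 0) / se_diff

print(f"SE = {se_diff:.1f}, t_obs = {t_obs:.2f}")
```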

    The degrees of freedom are:

    $$\nu = \frac{\left(\frac{848^2}{73} + \frac{670^2}{156}\right)^2}{\frac{(848^2/73)^2}{73 - 1} + \frac{(670^2/156)^2}{156 - 1}}$$
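    The degrees-of-freedom expression for the unequal-variance case can be evaluated directly from the same summary data (SD 848 with n = 73 for placebo, SD 670 with n = 156 for treated). This is a sketch; the variable names are illustrative.

```python
# Per-group variance of the sample mean, s^2 / n
v_placebo = 848 ** 2 / 73
v_treated = 670 ** 2 / 156

# Approximate degrees of freedom when the two variances are not assumed equal
nu = (v_placebo + v_treated) ** 2 / (
    v_placebo ** 2 / (73 - 1) + v_treated ** 2 / (156 - 1)
)

print(f"degrees of freedom = {nu:.1f}")
```

    The result is generally not an integer and lies between the smaller of the two group values $n_i - 1$ and the pooled value $n_1 + n_2 - 2$.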