Unit 3: Foundations for inference Lecture 2: Confidence...

32
Unit 3: Foundations for inference Lecture 2: Confidence Intervals and Hypothesis Testing Statistics 101 Thomas Leininger May 29, 2013

Transcript of Unit 3: Foundations for inference Lecture 2: Confidence...

Page 1: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Unit 3: Foundations for inferenceLecture 2: Confidence Intervals and Hypothesis

Testing

Statistics 101

Thomas Leininger

May 29, 2013

Page 2: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Recap of last class

Central limit theorem

Central limit theoremThe distribution of the sample mean is well approximated by a normalmodel:

x ∼ N(mean = µ,SE =

σ√

n

)If σ is unknown, use s.

So it wasn’t a coincidence that the sampling distributions we sawyesterday were symmetric.

We won’t go into the proving why SE = σ√n, but note that as n

increases SE decreases.

As the sample size increases we would expect samples to yieldmore consistent sample means, hence the variability among thesample means would be lower.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 2 / 32

Page 3: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Recap of last class

CLT - conditions

Certain conditions must be met for the CLT to apply:

1 Independence: Sampled observations must be independent.

This is difficult to verify, but is more likely ifrandom sampling/assignment is used, and,if sampling without replacement, n < 10% of the population.

2 Sample size/skew/outliers: Either1) the population distribution is normal OR2) n > 30 and the population distribution is not extremely skewed.

This is also difficult to verify for the population, but we can checkit using the sample data, and assume that the sample mirrors thepopulation.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 3 / 32

Page 4: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Recap of last class

CLT - sample size/skew condition - simulations (1)

http:// onlinestatbook.com/ stat sim/ sampling dist/ index.html

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 4 / 32

Page 5: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Recap of last class

CLT - sample size/skew condition - simulations (2)

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 5 / 32

Page 6: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Recap of last class

CLT - sample size/skew condition - simulations (3)

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 6 / 32

Page 7: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Recap of last class

Question

Four plots: Determine which plot (A, B, or C) is which.

(1) At top: distribution for a population (µ = 10, σ = 7),

(2) a single random sample of 100 observations from this population,

(3) a distribution of 100 sample means from random samples with size 7, and

(4) a distribution of 100 sample means from random samples with size 49.

0 10 20 30 40 50

Populationµ = 10σ = 7

(a) A - (3); B - (2); C - (4)

(b) A - (2); B - (3); C - (4)

(c) A - (3); B - (4); C - (2)

(d) A - (4); B - (2); C - (3)

Plot A4 6 8 10 12 14 16 18

05

1015202530

Plot B0 5 10 15 20 25 30 35

05

1015202530

Plot C8 9 10 11 12

0

5

10

15

20

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 7 / 32

Page 8: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Confidence intervals Why do we report confidence intervals?

Confidence intervals

A plausible range of values for the population parameter is calleda confidence interval.

Using only a point estimate to estimate a parameter is like fishingin a murky lake with a spear, and using a confidence interval islike fishing with a net.

We can throw a spear where we sawa fish but we will probably miss. If wetoss a net in that area, we have agood chance of catching the fish.

If we report a point estimate, we probably will not hit the exactpopulation parameter. If we report a range of plausible values –a confidence interval – we have a good shot at capturing theparameter.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 8 / 32

Page 9: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Confidence intervals Constructing a confidence interval

Average number of exclusive relationships

A sample of 50 Duke students were asked them how many exclusiverelationships they’ve been in so far. This sample yielded a mean of 3.2and a standard deviation of 1.74. Estimate the true average number ofexclusive relationships using this sample.

x = 3.2 s = 1.74

The approximate 95% confidence interval is defined as

point estimate ± 2 × SE

SE =s√

n=

1.74√

50≈ 0.25

x ± 2 × SE = 3.2 ± 2 × 0.25

= (3.2 − 0.5, 3.2 + 0.5)

= (2.7, 3.7)

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 9 / 32

Page 10: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Confidence intervals Constructing a confidence interval

Question

Which of the following is the correct interpretation of this confidenceinterval?

We are 95% confident that

(a) the average number of exclusive relationships Duke students inthis sample have been in is between 2.7 and 3.7.

(b) Duke students on average have been in between 2.7 and 3.7exclusive relationships.

(c) a randomly chosen Duke student has been in 2.7 to 3.7 exclusiverelationships.

(d) 95% of the Duke students have been in 2.7 to 3.7 exclusiverelationships.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 10 / 32

Page 11: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Confidence intervals A more accurate interval

A more accurate interval

Confidence interval, a general formula

point estimate ± z? × SE

Conditions when the point estimate = x:1 Independence: Observations in the sample must be independent

random sample/assignmentif sampling without replacement, n < 10% of population

2 Sample size / skew: n ≥ 30 and distribution not extremelyskewed

Note: We’ll talk about what happens when n < 30 in the next unit.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 11 / 32

Page 12: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Confidence intervals A more accurate interval

Changing the confidence level

point estimate ± z? × SE

In order to change the confidence level all we need to do isadjust z? in the above formula.

Commonly used confidence levels in practice are 90%, 95%,98%, and 99%.

For a 95% confidence interval, z? = 1.96.

However, using the z table it is possible to find the appropriate z?

for any confidence level.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 12 / 32

Page 13: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Confidence intervals A more accurate interval

Question

Which of the below Z scores is the appropriate z? when calculating a98% confidence interval?

(a) Z = 2.05

(b) Z = 1.96

(c) Z = 2.33

(d) Z = −2.33

(e) Z = −1.65

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 13 / 32

Page 14: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Confidence intervals Capturing the population parameter

What does 95% confident mean?

Suppose we took many samples and built a confidence intervalfrom each sample using the equation point estimate ± 2 × SE.

Then about 95% of those intervals would contain the truepopulation mean (µ).

The figure on the left showsthis process with 25 samples,where 24 of the resultingconfidence intervals containthe true average number ofexclusive relationships, andone does not.

µ = 3.207

Note: 95% confident DOES NOT mean that there is a 95% probabilitythat this interval captures the parameter! (hint: remember this on yourHW and exams)

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 14 / 32

Page 15: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Confidence intervals Capturing the population parameter

Width of an interval

If we want to be very certain that we capture the population parameter,i.e. increase our confidence level, should we use a wider interval or asmaller interval?

Can you see any drawbacks to using a wider interval?

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 15 / 32

Page 16: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Confidence intervals Capturing the population parameter

Question

One of the earliest examples of behavioral asymmetry is a preference in hu-mans for turning the head to the right, rather than to the left, during the finalweeks of gestation and for the first 6 months after birth. This is thought toinfluence subsequent development of perceptual and motor preferences. Astudy of 124 couples found that 64.5% turned their heads to the right whenkissing. The standard error associated with this estimate is roughly 4%.Which of the below is false?

(a) The 95% confidence interval for the percentage of kissers who turn theirheads to the right is roughly 64.5% ± 4%.

(b) A higher sample size would yield a lower standard error.

(c) The margin of error for a 95% confidence interval for the percentage ofkissers who turn their heads to the right is roughly 8%.

(d) The 99.7% confidence interval for the percentage of kissers who turntheir heads to the right is roughly 64.5% ± 12%.

Gunturkun, O. (2003) Adult persistence of head-turning asymmetry. Nature. Vol 421.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 16 / 32

Page 17: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Hypothesis testing framework

Remember when...

Gender discrimination experiment:Promotion

Promoted Not Promoted Total

GenderMale 21 3 24Female 14 10 24Total 35 13 48

pmales = 21/24 ≈ 0.88

pfemales = 14/24 ≈ 0.58

Possible explanations:Promotion and gender are independent, no genderdiscrimination, observed difference in proportions is simply dueto chance. → null - (nothing is going on)Promotion and gender are dependent, there is genderdiscrimination, observed difference in proportions is not due tochance. → alternative - (something is going on)

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 17 / 32

Page 18: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Hypothesis testing framework

Result

Since it was quite unlikely to obtain results like the actual data orsomething more extreme in the simulations (male promotions being30% or more higher than female promotions), we decided to reject thenull hypothesis in favor of the alternative.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 18 / 32

Page 19: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Hypothesis testing framework

Recap: hypothesis testing framework

We start with a null hypothesis (H0) that represents the statusquo.

We also have an alternative hypothesis (HA ) that represents ourresearch question, i.e. what we’re testing for.

We conduct a hypothesis test under the assumption that the nullhypothesis is true, either via simulation or theoretical methods(coming up next...).

If the test results suggest that the data do not provide convincingevidence for the alternative hypothesis, we fail to reject the nullhypothesis. If they do, then we reject the null hypothesis in favorof the alternative.

We’ll formally introduce the hypothesis testing framework using anexample on testing a claim about a population mean.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 19 / 32

Page 20: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Hypothesis testing framework

Number of college applications - hypotheses

A class survey asked how many colleges students applied to, and 206students responded to this question. This sample yielded an average of 9.7college applications with a standard deviation of 7. College Board websitestates that counselors recommend students apply to roughly 8 colleges.

Pretending that this sample is random and representative of all Duke stu-dents, do these data provide evidence that Duke students apply to moreschools on average than the recommendation?

http:// www.collegeboard.com/ student/ apply/ the-application/ 151680.html

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 20 / 32

Page 21: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Hypothesis testing framework

Number of college applications - hypotheses

Which of the following are the correct set of hypotheses to test if these dataprovide convincing evidence that the average number of colleges all Dukestudents apply to is higher than recommended?

(a) H0 : µ = 9.7HA : µ > 9.7

(b) H0 : µ = 8HA : µ > 8

(c) H0 : x = 8HA : x > 8

(d) H0 : µ = 8HA : µ > 9.7

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 21 / 32

Page 22: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Hypothesis testing framework

Setting the hypotheses

The parameter of interest is the average number of collegesapplied to among all current Duke students (µ). Our pointestimate for this is x, the sample average.There may be two explanations why our sample average ishigher than the recommendation of 8 schools.

The true average of Duke students is higher than 8.The true average of Duke students is 8, the difference is simplydue to natural sampling variability.

We start with the assumption that nothing has changed.

H0 : µ = 8

We test the claim that average Duke student applies to morethan 8 schools.

HA : µ > 8

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 22 / 32

Page 23: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Conditions for inference

Conditions for inference

Before doing inference using this data set, we must make sure thatthe conditions necessary for inference are satisfied:

1. Independence:Assuming this sample is random.206 < 10% of all current Duke students.

We can assume that number of schools applied to by one studentin this sample is independent of another.

2. Sample size / skew: n > 30 is met. Make a boxplot to check forextreme skewness (assume we’re okay here).

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 23 / 32

Page 24: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Formal testing using p-values

Number of college applications - test statistic

µ = 8 x = 9.7

x ∼ N(µ = 8,SE =

7√

206= 0.5

)Z =

9.7 − 80.5

= 3.4

The sample mean is 3.4 stan-dard errors away from the hy-pothesized value. Is this con-sidered unusually (significantly)high?

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 24 / 32

Page 25: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Formal testing using p-values

p-values

The p-value is the probability of observing data at least asfavorable to the alternative hypothesis as our current data set, ifthe null hypothesis was true.

If the p-value is low (lower than the significance level, α, which isusually 5%) we say that it would be very unlikely to observe thedata if the null hypothesis were true, and hence reject H0.

If the p-value is high (higher than α) we say that it is likely toobserve the data even if the null hypothesis were true, and hencedo not reject H0.

We never accept H0 since we’re not in the business of trying toprove it. We simply want to know if the data provide convincingevidence to support HA . (We haven’t proved µ = 8)

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 25 / 32

Page 26: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Formal testing using p-values

Number of college applications - p-value

p-value: probability of observing data at least as favorable to HA asour current data set (a sample mean greater than 9.7), if in fact H0

was true (the true population mean was 8).

µ = 8 x = 9.7

P(x > 9.7 | µ = 8) = P(Z > 3.4) = 0.0003

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 26 / 32

Page 27: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Formal testing using p-values

Number of college applications - Making a decision

p-value = 0.0003

If the true average of the number of colleges Duke studentsapplied to is 8, there is only 0.03% chance of observing a randomsample of 206 Duke students who on average apply to 9.7 ormore schools.This is a pretty low probability for us to think that a sample meanof 9.7 or more schools is likely to happen simply by chance.

Since p-value is low (lower than 5%) we reject H0.

The data provide convincing evidence that Duke students onaverage apply to more than 8 schools.

The difference between the null value of 8 schools and observedsample mean of 9.7 schools is not due to chance or samplingvariability.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 27 / 32

Page 28: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Formal testing using p-values

Question

A poll by the National Sleep Foundation found that college students average about 7

hours of sleep per night. A sample of 169 Duke students yielded an average of 6.88

hours, with a standard deviation of 0.94 hours. Assuming that this is a random sample

representative of all Duke students (bit of a leap of faith?), a hypothesis test was

conducted to evaluate if Duke students on average sleep less than 7 hours per night.

The p-value for this hypothesis test is 0.0485. Which of the following is correct?

(a) Fail to reject H0, the data provide convincing evidence that Dukestudents sleep less than 7 hours on average.

(b) Reject H0, the data provide convincing evidence that Dukestudents sleep less than 7 hours on average.

(c) Reject H0, the data prove that Duke students sleep more than 7hours on average.

(d) Fail to reject H0, the data do not provide convincing evidence thatDuke students sleep less than 7 hours on average.

(e) Reject H0, the data provide convincing evidence that Dukestudents in this sample sleep less than 7 hours on average.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 28 / 32

Page 29: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Two-sided hypothesis testing with p-values

Two-sided hypothesis testing with p-values

If the research question was “Do the data provide convincingevidence that the average amount of sleep Duke students getper night is different than the national average?”, the alternativehypothesis would be different.

H0 : µ = 7

HA : µ , 7

Hence the p-value would change as well:

6.88 7.00 7.12

p-value= 0.0485 × 2= 0.097

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 29 / 32

Page 30: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Two-sided hypothesis testing with p-values

the next two slides are provided as a brief summary of hypothesistesting...

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 30 / 32

Page 31: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Two-sided hypothesis testing with p-values

Recap: Hypothesis testing framework

1 Set the hypotheses.2 Check assumptions and conditions.3 Calculate a test statistic and a p-value.4 Make a decision, and interpret it in context of the research

question.

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 31 / 32

Page 32: Unit 3: Foundations for inference Lecture 2: Confidence ...tjl13/s101/slides/unit3lec2H.pdfConfidence intervals Constructing a confidence interval Average number of exclusive relationships

Hypothesis testing Two-sided hypothesis testing with p-values

Recap: Hypothesis testing for a population mean

1 Set the hypothesesH0 : µ = null valueHA : µ < or > or , null value

2 Check assumptions and conditionsIndependence: random sample/assignment, 10% condition whensampling without replacementNormality: nearly normal population or n ≥ 30, no extreme skew

3 Calculate a test statistic and a p-value (draw a picture!)

Z =x − µSE

, where SE =s√

n

4 Make a decision, and interpret it in context of the researchquestion

If p-value < α, reject H0, data provide evidence for HA

If p-value > α, do not reject H0, data do not provide evidence forHA

Statistics 101 (Thomas Leininger) U3 - L2: CI and HT May 29, 2013 32 / 32