MATH& 146 Lesson 21 - Amazon S3 · MATH& 146 Lesson 21 Section 2.8 Confidence Intervals 1....


MATH& 146

Lesson 21

Section 2.8

Confidence Intervals


Confidence Intervals

Recall that a point estimate is a statistic calculated from a sample that is used to estimate a population parameter.

For example, the sample mean, $\bar{x}$, can be used to estimate the population mean, μ.

Likewise, the sample proportion, $\hat{p}$, can be used to estimate the population proportion, p.

Confidence Intervals

A point estimate provides a single plausible value for a parameter.

Point estimates are the best guess, based on the sample, but are rarely perfect; usually there is some error in the estimate.

In addition to supplying a point estimate of a parameter, a next logical step would be to provide a plausible range of values for the parameter.

Confidence Intervals

A plausible range of values for the population parameter is called a confidence interval.

Using only a point estimate is like fishing in a murky lake with a spear, and using a confidence interval is like fishing with a net. We can throw a spear where we saw a fish, but we will probably miss. On the other hand, if we toss a net in that area, we have a good chance of catching the fish.

Confidence Intervals

If we report a point estimate, we probably will not hit the exact population parameter.

On the other hand, if we report a range of plausible values (a confidence interval), we have a good shot at capturing the parameter.

Example 1

If we want to be very certain we capture the population parameter, should we use a wider interval or a narrower interval?

Confidence Intervals

The point estimate is the most plausible value for our parameter of interest, so it makes sense to build the confidence interval around the point estimate.

The margin of error is a measure of the uncertainty associated with the point estimate. It is found by multiplying the standard error by the confidence coefficient.

Example 2

Compute the area between -1.96 and 1.96 for a normal distribution with mean 0 and standard deviation 1.
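
One way to check Example 2 numerically is with Python's standard-library `statistics.NormalDist`. This is only a sketch and not part of the original lesson; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

# The area between -1.96 and 1.96 is the difference of two cumulative areas.
area = std_normal.cdf(1.96) - std_normal.cdf(-1.96)
print(round(area, 4))
```

The result is very close to 0.95, which is why 1.96 is the multiplier used for 95% intervals.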

95% Confidence Intervals

As you saw with the last example, approximately 95% of all values in a normal distribution will be within 1.96 standard deviations of the mean.

Example 3

Suppose X ~ N(μ = 100, σ = 16). Approximately 95% of all values will be between which two numbers?
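
A short sketch of the arithmetic behind Example 3, assuming the 1.96 rule from the previous slide applies directly (mean ± 1.96 standard deviations). The variable names are illustrative.

```python
mu = 100      # population mean from Example 3
sigma = 16    # population standard deviation from Example 3
z = 1.96      # about 95% of a normal distribution lies within 1.96 SDs

lower = mu - z * sigma
upper = mu + z * sigma
print(f"{lower:.2f} to {upper:.2f}")
```

Under these assumptions, about 95% of values fall between roughly 68.64 and 131.36.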

95% Confidence Intervals

When the sampling distribution of a point estimate can reasonably be modeled as normal, the point estimate we observe will be within 1.96 standard errors of the true value of interest about 95% of the time. Thus, a 95% confidence interval for such a point estimate can be constructed:

point estimate ± 1.96 × SE

We can be 95% confident this interval captures the true value.
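
The formula above could be wrapped in a small helper like the one below. This is a minimal sketch: the function name `confidence_interval_95` and the sample inputs are hypothetical, chosen only to show the call.

```python
def confidence_interval_95(point_estimate, standard_error):
    """Return (lower, upper) for point estimate ± 1.96 × SE."""
    margin = 1.96 * standard_error
    return point_estimate - margin, point_estimate + margin

# Hypothetical inputs, not taken from the lesson.
lower, upper = confidence_interval_95(point_estimate=10.0, standard_error=2.0)
print(lower, upper)
```

The helper is only meaningful when the normal model is reasonable for the point estimate; it does not check that condition.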

Conditions

The normal approximation is crucial to the precision of confidence intervals. To ensure that the approximation is reasonable, we need to check the following conditions.

• The sample observations are independent.

• The sample size is large enough.

• The distribution of sample observations is not too strongly skewed. (However, we can relax this condition as the sample size increases.)

Independence

Verifying independence is often the most difficult of the conditions to check, and the way to check for independence varies from one situation to another. However, we can provide simple rules for the most common scenarios:

• Independence is guaranteed when we take random samples from less than 10% of the population.

• It can also be guaranteed if we randomly divide individuals into treatment and control groups.

Example 4

The point estimate from the opportunity cost study was that 20% fewer students would buy a DVD if they were reminded that money not spent now could be spent later on something else. The point estimate from this study can reasonably be modeled with a normal distribution, and a proper standard error for this point estimate is SE = 0.078.

Construct a 95% confidence interval. Assume the conditions for the normal approximation have been met.
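
A possible worked version of Example 4 follows. It assumes the "20% fewer" point estimate is a difference in proportions of 0.20, so the interval is expressed on the proportion scale; the variable names are illustrative.

```python
point_estimate = 0.20   # assumed: 20 percentage points fewer students buying
se = 0.078              # standard error given in Example 4
z_star = 1.96           # multiplier for 95% confidence

margin_of_error = z_star * se
lower = point_estimate - margin_of_error
upper = point_estimate + margin_of_error

print(f"margin of error: {margin_of_error:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Under that assumption the interval runs from about 0.047 to about 0.353.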

Stent Study

Section 1.1 of our textbook describes an experiment that examined whether implanting a stent in the brain of a patient at risk for a stroke helps reduce the risk of a stroke. The results from the first 30 days of this study, which included 451 patients, are summarized below.

            stroke   no event   Total
treatment       33        191     224
control         13        214     227
Total           46        405     451

Stent Study

Surprisingly, the point estimate suggests that patients who received stents may have a higher risk of stroke:

            stroke   no event   Total
treatment       33        191     224
control         13        214     227
Total           46        405     451

$\hat{p}_{\text{trmt}} - \hat{p}_{\text{ctrl}} = \frac{33}{224} - \frac{13}{227} = 0.090$
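
The point estimate for the stent study can be reproduced from the table counts. The sketch below uses illustrative variable names; only the four counts come from the slide.

```python
# Counts from the stent study table (first 30 days).
stroke_trmt, total_trmt = 33, 224
stroke_ctrl, total_ctrl = 13, 227

p_hat_trmt = stroke_trmt / total_trmt   # stroke proportion, treatment group
p_hat_ctrl = stroke_ctrl / total_ctrl   # stroke proportion, control group

difference = p_hat_trmt - p_hat_ctrl
print(f"{p_hat_trmt:.3f} - {p_hat_ctrl:.3f} = {difference:.3f}")
```

The difference rounds to 0.090, matching the value used in the examples that follow.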

Example 5

Consider the stent study and results. The conditions necessary to ensure the point estimate $\hat{p}_{\text{trmt}} - \hat{p}_{\text{ctrl}} = 0.090$ is nearly normal may be assumed, and the estimate's standard error is SE = 0.028. Construct a 95% confidence interval for the change in 30-day stroke rates from usage of the stent.

            stroke   no event   Total
treatment       33        191     224
control         13        214     227
Total           46        405     451
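
One way to carry out Example 5 is sketched below, using the point estimate 0.090 and SE = 0.028 given in the example. The variable names are illustrative.

```python
point_estimate = 0.090  # difference in 30-day stroke proportions
se = 0.028              # standard error given in Example 5
z_star = 1.96           # multiplier for 95% confidence

margin_of_error = z_star * se
lower = point_estimate - margin_of_error
upper = point_estimate + margin_of_error

print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The interval is roughly 0.035 to 0.145, so every value in it is positive.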

95% Confidence Intervals

As with hypothesis tests, confidence intervals are imperfect. About 1-in-20 properly constructed 95% confidence intervals will fail to capture the parameter of interest.

[Figure: simulated confidence intervals; horizontal axis labeled at 0.045, p = 0.100, and 0.155.]

95% Confidence Intervals

The figure below shows 25 confidence intervals for a proportion that were constructed from simulations where the true proportion was p = 0.100. One of these 25 confidence intervals happened not to include the true value.

[Figure: 25 simulated confidence intervals; horizontal axis labeled at 0.045, p = 0.100, and 0.155.]
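
A simulation in the same spirit can be sketched in a few lines. The slides do not state the sample size or how the intervals were built, so everything below other than p = 0.100 and the 1.96 multiplier is an assumption: a sample size of 500, 1000 repetitions, a fixed seed, and the usual standard error for a sample proportion, sqrt(p̂(1 − p̂)/n).

```python
import math
import random

def simulate_coverage(p_true=0.100, n=500, trials=1000, z_star=1.96, seed=1):
    """Fraction of simulated 95% intervals that capture p_true.

    n, trials, and seed are assumed values, not taken from the lesson.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of n yes/no observations with success chance p_true.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        # Assumed standard error formula for a sample proportion.
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - z_star * se <= p_true <= p_hat + z_star * se:
            hits += 1
    return hits / trials

coverage = simulate_coverage()
print(f"share of intervals capturing p: {coverage:.3f}")
```

With many repetitions the share of intervals that capture p should land near 0.95, so a small number of misses is expected rather than a sign of a faulty simulation.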

Example 6

In the figure below, one interval does not contain the true proportion, p = 0.100. Does this imply there was a problem with the simulations run?

[Figure: 25 simulated confidence intervals; horizontal axis labeled at 0.045, p = 0.100, and 0.155.]

Changing Confidence Levels

Suppose we want to consider confidence intervals where the confidence level is somewhat higher than 95%: perhaps we would like a confidence level of 99%.

Think back to the analogy about trying to catch a fish: if we want to be more sure that we will catch the fish, we should use a wider net.

Why not 100% Confidence?

The more confident we want to be, the larger the margin of error must be. To be 100% confident, however, usually involves making a statement that is pretty useless.

For example, I am 100% confident that the average student will score between 0 and 100 percent on the final. Completely correct, but not useful.

For another example, see http://garfield.com/comic/1999-03-12.

Why not 1% Confidence?

On the other hand, a confidence interval of, say, 0.00994 to 0.01006, which is the result from a simulation with 1% confidence, is precise, but not very likely to capture the parameter.

The tension between certainty and precision is always there. Fortunately, in most cases we can be both sufficiently certain and sufficiently precise to make useful statements.

Confidence Levels

The choice of confidence level is somewhat arbitrary (the data can't do it for you), but the most commonly chosen confidence levels are 90%, 95%, and 99%.

Although any percentage can be used, you should be able to justify your choice. In practice, using something like 92.9% or 97.2% is likely to make people think you're up to something.

Confidence Levels

To make your life easier, you should just memorize, or have it written somewhere prominent, the three most common confidence coefficients:

Confidence Level    z*
90%                 1.64
95%                 1.96
99%                 2.58
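
These coefficients can also be recovered from the standard normal distribution instead of memorized. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `z_star` is illustrative, and it assumes a two-sided interval that leaves equal area in each tail.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Confidence coefficient with central area `confidence_level` under N(0, 1)."""
    # Leave half of the leftover area in each tail.
    upper_tail_cutoff = 1 - (1 - confidence_level) / 2
    return NormalDist().inv_cdf(upper_tail_cutoff)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.2f}")
```

Rounded to two decimals, the values agree with the table: 1.64, 1.96, and 2.58.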

Confidence Levels

In general, if the point estimate follows the normal model with standard error SE, then a confidence interval for the population parameter is

point estimate ± z* × SE

where z* corresponds to the confidence level selected.

In a confidence interval, z* × SE is called the margin of error.

Example 7

Create a 99% confidence interval for the impact of the stent on the risk of stroke using the data below. The point estimate is 0.090, and the standard error is SE = 0.028. It has been verified for you that the point estimate can reasonably be modeled by a normal distribution.

            stroke   no event   Total
treatment       33        191     224
control         13        214     227
Total           46        405     451
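
A sketch of Example 7 using the general form, point estimate ± z* × SE, with z* = 2.58 for 99% confidence. The helper name `confidence_interval` is hypothetical; the 95% interval is computed alongside only for comparison.

```python
def confidence_interval(point_estimate, standard_error, z_star):
    """Return (lower, upper) for point estimate ± z* × SE."""
    margin = z_star * standard_error
    return point_estimate - margin, point_estimate + margin

# Stent study values given in Example 7.
lower_99, upper_99 = confidence_interval(0.090, 0.028, z_star=2.58)
lower_95, upper_95 = confidence_interval(0.090, 0.028, z_star=1.96)

print(f"99% CI: ({lower_99:.3f}, {upper_99:.3f})")
print(f"95% CI: ({lower_95:.3f}, {upper_95:.3f})")
```

The 99% interval, about 0.018 to 0.162, is wider than the 95% interval, which is the price of the higher confidence level.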

Interpreting Confidence Intervals

A careful eye might have observed the somewhat awkward language used to describe confidence intervals. Correct interpretation:

We are XX% confident that the population parameter is between...