Sampling Distribution & Confidence...


Transcript of Sampling Distribution & Confidence...

Page 1: Sampling Distribution & Confidence Intervalgchang.people.ysu.edu/class/mph/note/06_5_CI_Ch8_9_CLT_CI.pdfSampling Distribution & Confidence Interval CI -1 1 ... Sample Proportion, ...

Sampling Distribution & Confidence Interval

CI - 1

1

A Normal Distribution

Example: The distribution of serum cholesterol levels for 40- to 70-year-old males living in community A has a mean of 211 mg/100 ml and a standard deviation of 46 mg/100 ml. If an individual is selected at random from this population, what is the probability that his serum cholesterol level is higher than 225?

2

P(X > 225) = ?

X ~ N (µ = 211, σ = 46)

$$z = \frac{225 - 211}{46} = .30$$

$$P(X > 225) = P(Z > .30) = .382$$
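The tail probability for a single individual can be checked numerically. The sketch below assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative. It uses the unrounded z score, so the result (about 0.38) differs slightly from the table value obtained with z rounded to .30.

```python
from statistics import NormalDist

# Population of individual cholesterol levels: X ~ N(mu = 211, sigma = 46)
population = NormalDist(mu=211, sigma=46)

z = (225 - population.mean) / population.stdev  # 14 / 46, about 0.30
p_upper = 1 - population.cdf(225)               # upper-tail area P(X > 225)

print(f"z = {z:.3f}, P(X > 225) = {p_upper:.3f}")
```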

3

Statistical Inference

4

Inferential Statistics

1. Type of Inference:
   Estimation
   Hypothesis Testing

2. Purpose:
   Make Decisions about Population Characteristics

Population?

5

Inference Process

Identify Population

Find Representative Sample

Sample Statistic

Estimates & Tests

Decision & Conclusion

6

Statistics Used to Estimate Population Parameters

Statistics (Estimators) and the Parameters they estimate:

Sample Mean, $\bar{x}$: estimates µ, the population mean

Sample Variance, $s^2$: estimates σ², the population variance

Sample Proportion, $\hat{p}$: estimates p, the population proportion


7

Sampling Distribution

Theoretical Probability Distribution of the Sample Statistic.

What is the Shape of this distribution?

What are the values of the parameters such as mean and standard deviation?

8

Probability Related to Mean

Example: The distribution of serum cholesterol levels for 40- to 70-year-old males living in community A has a mean of 211 mg/100 ml and a standard deviation of 46 mg/100 ml. If a random sample of 100 individuals is taken from this population, what is the probability that the average serum cholesterol level of these 100 individuals is higher than 225?

9

$P(\bar{X} > 225) = ?$ What is the probability that the mean of the sample is greater than 225?

$$\bar{X} \rightarrow\ ?\,(\mu_{\bar{x}} = ?,\ \sigma_{\bar{x}} = ?)$$

What is the sampling distribution of the sample mean?

10

Sampling Distribution of the Mean

If a random sample of size n is taken from a population that has a mean µ and a standard deviation σ, the sampling distribution of the sample mean, $\bar{x}$, has a mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of the sample size:

$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

11

Sampling Distribution

Population Distribution: µ = 8, σ = 2

Sampling Distribution of Mean (Sample size n = 25):

$$\mu_{\bar{x}} = 8, \qquad \sigma_{\bar{x}} = \frac{2}{\sqrt{25}} = 0.4$$

12

Standard Error of Mean

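1. Formula:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}}$$

2. Standard Deviation of the sampling distribution of the Sample Means, $\bar{X}$

3. Less Than Pop. Standard Deviation (for n > 1):

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} < \sigma$$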
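As a small illustration of the standard error formula, the sketch below computes σ/√n for two cases from these slides: the σ = 2, n = 25 figure and the cholesterol example with σ = 46, n = 100. The function name `standard_error` is a hypothetical helper, not something from the source.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return sigma / math.sqrt(n)

print(standard_error(2, 25))    # population sigma = 2, samples of size 25
print(standard_error(46, 100))  # cholesterol example, samples of size 100
```

In both cases the result is smaller than the population standard deviation, as item 3 states.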


13

Distribution Shape

What is the shape of the sampling distribution of mean?

A theorem of sampling distribution of mean:

If the population to be sampled is normally distributed, then the sampling distribution of the mean is also normally distributed.

14

$P(\bar{X} > 225) = ?$ Cholesterol Level has a mean 211, s.d. 46.

If the population is normally distributed, the sampling distribution of the mean is normally distributed.

Parameters of the sampling distribution of the mean (n = 100):

$$\mu_{\bar{x}} = \mu = 211, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{46}{\sqrt{100}} = 4.6$$

$$\bar{X} \sim N(\mu_{\bar{x}} = 211,\ \sigma_{\bar{x}} = 4.6)$$

15

Central Limit Theorem

What if the population sampled is not normally distributed?

16

Central Limit Theorem

If a relatively large random sample is taken from a population that has a mean µ and a standard deviation σ, then regardless of the distribution of the population, the distribution of the sample means is approximately normal with

$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
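A simulation can make the theorem concrete. The sketch below is an illustration only: it assumes a right-skewed exponential population with mean 1 and standard deviation 1 (not a population from these slides), draws repeated samples, and compares the spread of the sample means with σ/√n. All names are hypothetical.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_SD = 1.0  # an exponential population with rate 1 has mean 1 and s.d. 1

def simulate_sample_means(n, reps=2000):
    """Means of `reps` random samples of size n from the exponential population."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

for n in (2, 30):
    means = simulate_sample_means(n)
    print(f"n = {n}: mean of means = {statistics.fmean(means):.3f}, "
          f"s.d. of means = {statistics.stdev(means):.3f}, "
          f"sigma/sqrt(n) = {POP_SD / math.sqrt(n):.3f}")
```

The mean of the simulated sample means should stay near the population mean, and their standard deviation should track σ/√n; plotting a histogram of `means` for n = 30 would show the near-normal shape.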

17

Central Limit Theorem

As sample size gets large enough (n ≥ 30), the sampling distribution becomes almost normal.

$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

18

Sampling from Non-Normal Populations

Population Distribution: µ = 50, σ = 10

Sampling Distribution (n = 4):

Mean: $\mu_{\bar{x}} = \mu = 50$

Standard Error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = 5$


19

Sampling from Non-Normal Populations

Population Distribution: µ = 50, σ = 10

Sampling Distribution (n = 30):

Mean: $\mu_{\bar{x}} = \mu = 50$

Standard Error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = 1.8$

20

A Random Sample from Population

Population mean = 19.9, standard deviation = 12.6

Random Sample of Size 400 from Population

[Histogram of the sample, values from 0 to 110: Mean = 20.7, Std. Dev = 12.92, N = 400]

21

Simulated Sampling Distribution of Means

[Histograms of 400 simulated sample means for each sample size n, all drawn on a scale from 1 to 77. Summary statistics of each histogram:]

Sample size   Mean of sample means   Std. Dev of sample means   N
n = 2         20.3                   8.88                       400
n = 4         19.4                   5.40                       400
n = 10        19.9                   4.32                       400
n = 25        19.84                  2.23                       400
n = 50        19.75                  1.64                       400
n = 100       19.81                  1.20                       400

22

Probability Related to Mean

Example: The distribution of serum cholesterol levels for 40- to 70-year-old males living in community A has a mean of 211 mg/100 ml and a standard deviation of 46 mg/100 ml. If a random sample of 100 individuals is taken from this population, what is the probability that the average serum cholesterol level of these 100 individuals is higher than 225?

23

$P(\bar{X} > 225) = ?$

Cholesterol Level has a mean 211, s.d. 46; n = 100.

$$\bar{X} \rightarrow N(\mu_{\bar{x}} = 211,\ \sigma_{\bar{x}} = 4.6)$$

$$z = \frac{225 - 211}{4.6} = 3.04$$

$$P(\bar{X} > 225) = P(Z > 3.04) = 0.001$$
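The same calculation can be sketched in code, assuming the standard-library `statistics.NormalDist` and treating the sampling distribution of the mean as N(211, 46/√100). Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 211, 46, 100

se = sigma / math.sqrt(n)            # standard error of the mean: 4.6
sampling_dist = NormalDist(mu, se)   # distribution of the sample mean

z = (225 - mu) / se                  # about 3.04
p_upper = 1 - sampling_dist.cdf(225) # P(sample mean > 225)

print(f"se = {se:.1f}, z = {z:.2f}, P = {p_upper:.4f}")
```

Compared with the single-individual case (probability about .38), the probability for the mean of 100 individuals is roughly one in a thousand, because the sampling distribution is ten times narrower.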

24

Introduction to Estimation

Confidence Intervals & Sample Size


25

Disadvantage of Point Estimation

1. Provides a Single Value Based on Observations from 1 Sample.
   * Sample Mean $\bar{X}$ = 98 Is a Point Estimate of the Unknown Population Mean.

2. Gives No Information about How Close the Value Is to the Unknown Population Parameter

Which of the following statistics do you prefer?
a. 32%
b. 32% with a margin of error of 3%

26

Estimation

You’re interested in finding the average body temperature of healthy adults in Northeastern Ohio (the population). What would you do? How can we estimate this average with a measure of reliability?

98 ± 1 °F    98 ± .5 °F    98 ± .2 °F

27

Interval Estimation

Margin of Error Gives Information about How Close Value Is to the Unknown Population Parameter.

28

Sampling Error

Sample statistic (point estimate)

Sampling Error = $|\mu - \bar{x}|$

29

Key Elements of Interval Estimation

Sample statistic (point estimate)

Confidence limit (lower)

Confidence limit (upper)

Confidence interval

Confidence Level: the probability that an interval constructed by this procedure contains the population parameter.

$\bar{x}$ ± Margin of Error

98 ± 1 °F

30

Sampling Distribution of the Mean

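[Figure: sampling distribution of $\bar{X}$, centered at µ with standard error $\sigma_{\bar{x}}$.]

The sampling distribution is normal when sampling from a normally distributed population, and approximately normal when the sample is relatively large.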


31

Sampling Distribution of the Mean

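[Figure: sampling distribution of $\bar{X}$ centered at µ with standard error $\sigma_{\bar{x}}$; the central area .95 lies between $\mu - ?\,\sigma_{\bar{x}}$ and $\mu + ?\,\sigma_{\bar{x}}$, with area .025 in each tail.]

Within how many standard deviations of the mean does 95% of the sampling distribution lie?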

32

A Special Notation

Upper-tail areas of the standard normal distribution (row: z to one decimal place; column: second decimal place):

Z     .05    .06    .07
1.8   .032   .031   .031
1.9   .026   .025   .024
2.0   .020   .020   .019
2.1   .016   .015   .015

$z_\alpha$ = the z score such that the proportion of the standard normal distribution to the right of it is α.

$z_{.025}$ = 1.96 (row 1.9, column .06, area .025)

$z_{.010}$ = ?
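Instead of reading a table, the z score for a given upper-tail area can be computed with an inverse CDF. This is a minimal sketch assuming the standard-library `statistics.NormalDist`; the helper name `z_alpha` is hypothetical and simply mirrors the notation above.

```python
from statistics import NormalDist

def z_alpha(alpha: float) -> float:
    """z score whose upper-tail area under the standard normal curve is alpha."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.05, 0.025, 0.01):
    print(f"z_{alpha} = {z_alpha(alpha):.3f}")
```

For α = .025 this gives about 1.96, matching the table lookup.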

33

The Confidence Interval

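[Figure: sampling distribution of $\bar{X}$ centered at µ with standard error $\sigma_{\bar{x}}$. 95% of sample means fall between $\mu - 1.96\sigma_{\bar{x}}$ and $\mu + 1.96\sigma_{\bar{x}}$.]

Confidence Level: 1 − α = .95

α/2 = .025

$z_{.025}$ = 1.96

Confidence Interval => ($\bar{x} - 1.96\sigma_{\bar{x}}$, $\bar{x} + 1.96\sigma_{\bar{x}}$)

34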

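Confidence Interval for Mean (σ Known)

(1-α)·100% Confidence Interval Estimate for mean of a normal population:

$$\left(\bar{X} - Z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}},\ \ \bar{X} + Z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}\right)$$

or

$$\bar{X} \pm Z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$$

Margin of Error: $Z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$

“σ Known” may mean that we have a very good estimate of σ. In practice, it is rarely realistic to assume that we know σ exactly.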

35

Confidence Interval of Mean (σ Unknown and n ≥ 30)

(1-α)·100% Confidence Interval Estimate for mean of a population when sample size is relatively large:

$$\left(\bar{X} - Z_{\alpha/2}\cdot\frac{s}{\sqrt{n}},\ \ \bar{X} + Z_{\alpha/2}\cdot\frac{s}{\sqrt{n}}\right)$$

or

$$\bar{X} \pm Z_{\alpha/2}\cdot\frac{s}{\sqrt{n}}$$

36

The Confidence Interval

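[Figure: sampling distribution of $\bar{X}$ centered at µ with standard error $\sigma_{\bar{x}}$; 95% of samples give a sample mean between $\mu - 1.96\sigma_{\bar{x}}$ and $\mu + 1.96\sigma_{\bar{x}}$.]

95% Confidence Interval => ($\bar{x} - 1.96\sigma_{\bar{x}}$, $\bar{x} + 1.96\sigma_{\bar{x}}$)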


37

The Confidence Interval

[Figure: sampling distribution of $\bar{X}$ centered at µ with standard error $\sigma_{\bar{x}}$; the middle 95% of samples is marked, with 2.5% in each tail.]

95% of intervals contain µ. 5% do not.
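The statement that about 95% of intervals contain µ can be checked by simulation. The sketch below is illustrative only: it assumes a normal population with µ = 140 and σ = 10, samples of size 25, and a known σ, then counts how often the interval x̄ ± 1.96·σ/√n covers µ. All names are hypothetical.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA, N, Z = 140, 10, 25, 1.96  # assumed population and sample size

def interval_covers_mu() -> bool:
    """Draw one sample, build the 95% z interval, report whether it contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = statistics.fmean(sample)
    margin = Z * SIGMA / math.sqrt(N)
    return x_bar - margin <= MU <= x_bar + margin

trials = 5000
coverage = sum(interval_covers_mu() for _ in range(trials)) / trials
print(f"Fraction of intervals containing mu: {coverage:.3f}")
```

The fraction should land close to 0.95. Each individual interval either contains µ or does not; the 95% describes the procedure over many samples.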

38

Factors Affecting Interval Width

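1. Data Dispersion
   Measured by σ

2. Sample Size
   Affects standard error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

3. Level of Confidence (1 - α)
   Affects $z_{\alpha/2}$

$$\left(\bar{X} - z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}},\ \ \bar{X} + z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}\right)$$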
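To see how these factors change the interval width, the following sketch computes the margin of error z·σ/√n for a few confidence levels and sample sizes. The values σ = 10 and n = 25 are illustrative assumptions, and `margin_of_error` is a hypothetical helper built on the standard-library `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma: float, n: int, confidence: float) -> float:
    """Half-width of the z interval: z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / math.sqrt(n)

# Higher confidence widens the interval (sigma and n held fixed).
for confidence in (0.90, 0.95, 0.99):
    print(confidence, round(margin_of_error(10, 25, confidence), 2))

# Quadrupling the sample size halves the margin (sigma and confidence held fixed).
print(round(margin_of_error(10, 25, 0.95), 2), round(margin_of_error(10, 100, 0.95), 2))
```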

39

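Size of Interval

[Figure: sampling distribution of $\bar{X}$ centered at µ with standard error $\sigma_{\bar{x}}$, showing three nested central regions:]

90% Samples: $\mu - 1.65\sigma_{\bar{x}}$ to $\mu + 1.65\sigma_{\bar{x}}$

95% Samples: $\mu - 1.96\sigma_{\bar{x}}$ to $\mu + 1.96\sigma_{\bar{x}}$

99% Samples: $\mu - 2.58\sigma_{\bar{x}}$ to $\mu + 2.58\sigma_{\bar{x}}$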

40

Estimation Example Mean (σ Known)

The average weight of a random sample of n = 25 subjects is $\bar{X}$ = 140. Set up a 95% confidence interval estimate for µ if σ = 10. (Assume Normal population.)

1 − α = .95, α = .05, α/2 = .025, $z_{.025}$ = 1.96.

$$\left(\bar{X} - Z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}},\ \ \bar{X} + Z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}\right) = \left(140 - 1.96\cdot\frac{10}{\sqrt{25}},\ \ 140 + 1.96\cdot\frac{10}{\sqrt{25}}\right) = (136.08,\ 143.92)$$

or

$$\bar{X} \pm Z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}} \Rightarrow 140 \pm 1.96\cdot\frac{10}{\sqrt{25}} \Rightarrow 140 \pm 3.92$$
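The worked example can be reproduced with a short function. This is a sketch assuming σ is known and using the standard-library `statistics.NormalDist` for the z score; `z_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def z_interval(x_bar: float, sigma: float, n: int, confidence: float = 0.95):
    """(1 - alpha) * 100% confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

lower, upper = z_interval(x_bar=140, sigma=10, n=25)
print(f"({lower:.2f}, {upper:.2f})")
```

With x̄ = 140, σ = 10 and n = 25 this agrees with the interval above to two decimal places.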

41

Interpretation

We can be 95% confident that the population mean is in (136.08, 143.92).

We can be 95% confident that the sampling error in using the sample mean to estimate the population mean is at most 3.92.

42

Confidence Interval of Mean (σ Unknown and n ≥ 30)

(1-α)·100% Confidence Interval Estimate for mean of a population when sample size is relatively large:

$$\left(\bar{X} - Z_{\alpha/2}\cdot\frac{s}{\sqrt{n}},\ \ \bar{X} + Z_{\alpha/2}\cdot\frac{s}{\sqrt{n}}\right)$$

or

$$\bar{X} \pm Z_{\alpha/2}\cdot\frac{s}{\sqrt{n}}$$


43

Thinking Challenge

Example: A city uses a noise index to monitor noise pollution in a certain area of the city. A random sample of 100 observations from randomly selected days around noon showed an average index value of $\bar{x}$ = 1.99 and a standard deviation of s = 0.05. Find the 90% confidence interval estimate of the average noise index at noon.

44

Confidence Interval Solution*

1 − α = .90, α = 1 − .90 = .1, α/2 = .05, $Z_{\alpha/2} = Z_{.05}$ = 1.64

$$\bar{X} \pm Z_{\alpha/2}\cdot\frac{s}{\sqrt{n}} \Rightarrow 1.99 \pm 1.64\cdot\frac{.05}{\sqrt{100}} \Rightarrow 1.99 \pm 0.008$$

( 1.982 , 1.998 )

45

Interval Estimation for Mean

In a survey of a random sample of 64 individuals who gambled in Las Vegas, the average amount of money won on the day of the survey was –$25.50, with a standard deviation of $100. Find the 95% confidence interval estimate for the average amount of money won by people who gambled in Las Vegas that day.

46

Finding Sample Sizes for Estimating µ

I don’t want to sample too much or too little!

$$\text{C.I.: } \bar{x} \pm z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$$

$$\text{Margin of Error} = B = z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$$

$$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{B^{2}}$$

B = Margin of Error or Bound

47

Sample Size Example

What sample size is needed to be 90% confident of being correct within ± 5? A pilot study suggested that the standard deviation is 45.

$$n = \frac{Z_{.05}^{2}\,\sigma^{2}}{B^{2}} = \frac{(1.645)^{2}(45)^{2}}{(5)^{2}} = 219.2 \cong 220$$
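The sample size formula always rounds up, since n must be a whole number and rounding down would exceed the bound. A minimal sketch follows, with the z score passed in as read from the table; `sample_size_for_mean` is a hypothetical helper name.

```python
import math

def sample_size_for_mean(z: float, sigma: float, bound: float) -> int:
    """Smallest n for which z * sigma / sqrt(n) does not exceed the bound B."""
    return math.ceil((z * sigma / bound) ** 2)

n = sample_size_for_mean(z=1.645, sigma=45, bound=5)
print(n)
```

For the 90% example with σ = 45 and B = 5 this returns 220, the rounded-up value of 219.2.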

48

Thinking Challenge

You plan to survey residents in your county to find the average health insurance premium that they are paying. You want to be 95% confident that the sample mean is within ± $50. A pilot study showed that σ was about $400. What sample size should you use?


49

Sample Size Solution*

$$n = \frac{Z_{.025}^{2}\,\sigma^{2}}{B^{2}} = \frac{(1.96)^{2}(400)^{2}}{(50)^{2}} = 245.86 \cong 246$$

50

Confidence Interval Mean (σ Unknown & n< 30)

1. Assumptions
   Population Standard Deviation Is Unknown
   Population Must Be Normally Distributed

2. Use Student’s t Distribution

3. Confidence Interval Estimate

$$\left(\bar{X} - t_{\alpha/2,\,n-1}\cdot\frac{S}{\sqrt{n}},\ \ \bar{X} + t_{\alpha/2,\,n-1}\cdot\frac{S}{\sqrt{n}}\right)$$

or

$$\bar{X} \pm t_{\alpha/2,\,n-1}\cdot\frac{S}{\sqrt{n}}$$

51

Student’s t Distribution

[Figure: three curves centered at 0: Standard Normal (Z), t (df = 13), and t (df = 5).]

Bell-Shaped

Symmetric

‘Fatter’ Tails

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

52

Student’s t Table

[Figure: t distribution with upper-tail area .05 to the right of the t value.]

For a 90% C.I.: n = 3, df = n − 1 = 2, α = .10, α/2 = .05

$t_{\alpha/2}$ = 2.920

53

Estimation Example Mean (σ Unknown)

A random sample of the weights of 25 subjects has a sample mean of 140 and a sample standard deviation of 8. Set up a 95% confidence interval estimate for µ.

1 − α = .95, α = 1 − .95 = .05, α/2 = .025, $t_{\alpha/2,\,df=24} = t_{.025,\,24}$ = 2.064

$$\bar{X} \pm t_{\alpha/2,\,n-1}\cdot\frac{S}{\sqrt{n}} \Rightarrow 140 \pm 2.064\cdot\frac{8}{\sqrt{25}} \Rightarrow 140 \pm 3.30$$

( 136.70 , 143.30 )

54

Thinking Challenge

The number of community hospital beds per 1000 population available in each region of the country is normally distributed. A random sample of 6 regions was selected, and the recorded rates of beds per 1000 were

3.6, 4.2, 4.0, 3.5, 3.8, 3.1.

Find the 90% confidence interval estimate of the mean bed-rate in the country.


55

Confidence Interval Solution*

$\bar{x}$ = 3.7, s = 0.38987

$$\frac{s}{\sqrt{n}} = \frac{.38987}{\sqrt{6}} = .1592$$

(use 90% confidence level)

n = 6, df = n − 1 = 6 − 1 = 5

$t_{.05,\,5}$ = 2.015

$$\bar{X} \pm t_{\alpha/2,\,n-1}\cdot\frac{S}{\sqrt{n}}$$

( 3.7 - (2.015)(0.1592), 3.7 + (2.015)(0.1592) )

( 3.379, 4.021 )
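The bed-rate solution can be reproduced from the raw data. The sketch below uses only the Python standard library, which has no t quantile function, so the critical value 2.015 is entered by hand as read from a t table (an assumption of this sketch, matching the slide). Variable names are illustrative.

```python
import math
import statistics

beds = [3.6, 4.2, 4.0, 3.5, 3.8, 3.1]  # beds per 1000 in the 6 sampled regions
t_crit = 2.015                          # t_{.05, 5}, taken from a t table

n = len(beds)
x_bar = statistics.mean(beds)           # sample mean
s = statistics.stdev(beds)              # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)                   # estimated standard error

lower, upper = x_bar - t_crit * se, x_bar + t_crit * se
print(f"mean = {x_bar:.1f}, s = {s:.5f}, se = {se:.4f}")
print(f"90% CI: ({lower:.3f}, {upper:.3f})")
```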

56

Confidence interval with z-score: The (1 − α)·100% confidence interval estimate for population mean:

Assumption: If sampled from a normal population with known standard deviation σ,

$$\bar{x} \pm z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}$$

Assumption: If the sample is large and σ is unknown, s replaces σ,

$$\bar{x} \pm z_{\alpha/2}\cdot\frac{s}{\sqrt{n}}$$

57

Confidence interval with t-score: The (1 − α)·100% confidence interval estimate for population mean:

Assumption: If sampled from a normal population with unknown standard deviation σ,

$$\bar{x} \pm t_{\alpha/2,\,df=n-1}\cdot\frac{s}{\sqrt{n}}$$

(If the sample size is large, the normality assumption matters much less.)

t → z as the sample becomes large

58

Average Weight for Ten-Year-Old Female Children in the US

Info. from a random sample: n = 10, $\bar{x}$ = 80 lb, s = 18.05 lb; assume weight is normally distributed. Find the 95% confidence interval estimate for average weight.

Data: 73.80 50.00 101.40 67.20 102.20 97.80 81.00 93.40 63.20 70.00

How do we know whether normality assumption is OK?

59

Tests of Normality

Weight (pounds) of participant

                        Statistic   df   Sig.
Kolmogorov-Smirnov(a)   .171        10   .200*
Shapiro-Wilk            .930        10   .452

* This is a lower bound of the true significance.
a. Lilliefors Significance Correction.

Both significance values are greater than 0.05, so the normality assumption is acceptable.

60

Average Weight for Ten-Year-Old Female Children in the US

Info. from a random sample: n = 10, $\bar{x}$ = 80 lb, s = 18.05 lb; assume weight is normally distributed. Find the 95% confidence interval estimate for average weight.

$t_{\alpha/2} = t_{.05/2} = t_{.025}$, d.f. = 10 – 1 = 9, $t_{.025,\,9}$ = 2.262

$$\bar{x} \pm t_{\alpha/2,\,df=9}\cdot\frac{s}{\sqrt{n}} \Rightarrow 80 \pm 2.262\cdot\frac{18.05}{\sqrt{10}}$$

$$80 \pm 12.91 \Rightarrow (67.09,\ 92.91)$$


61

Descriptives

Weight (pounds) of participant, grouped by “What is your sex?”

female                               Statistic    Std. Error
Mean                                 80.0000      5.70840
95% Confidence Interval for Mean
  Lower Bound                        67.0867
  Upper Bound                        92.9133
5% Trimmed Mean                      80.4333
Median                               77.4000
Variance                             325.858
Std. Deviation                       18.05153
Minimum                              50.00
Maximum                              102.20
Range                                52.20
Interquartile Range                  32.5000
Skewness                             -.148        .687
Kurtosis                             -1.229       1.334

male                                 Statistic    Std. Error
Mean                                 86.8600      3.96048
95% Confidence Interval for Mean
  Lower Bound                        77.9008
  Upper Bound                        95.8192

Weight for Ten Year Old: 80 ± 12.91

62

Confidence Interval Estimate of Proportion

63

Proportion Estimation

Parameter: Population Proportion p (or π)
(e.g., the percentage of people who have no health insurance)

Statistic: Sample Proportion $\hat{p} = \frac{x}{n}$

x is the number of successes; n is the sample size

64

Confidence Interval Proportion

1. Assumptions
   Two Categorical Outcomes
   Normal Approximation Can Be Used If np and n(1 – p) are both greater than 5.

2. Confidence Interval Estimate (for large sample)

$$\left(\hat{p} - z_{\alpha/2}\cdot\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \ \hat{p} + z_{\alpha/2}\cdot\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$$

or

$$\hat{p} \pm z_{\alpha/2}\cdot\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

65

Estimation Example Proportion

A random sample of 400 from a large community showed that 32 have diabetes. Set up a 95% confidence interval estimate for p, the percentage of people that have diabetes.

$\hat{p} = \frac{32}{400} = .08$, n = 400, $z_{\alpha/2} = z_{.025} = 1.96$

66

Estimation Example Proportion

The 95% C.I. for p, the percentage of people that have diabetes:

$\hat{p} = \frac{32}{400} = .08$, n = 400

$$\hat{p} \pm Z_{\alpha/2}\cdot\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \Rightarrow .08 \pm 1.96\cdot\sqrt{\frac{.08\,(1-.08)}{400}}$$

$$.08 \pm .027 \Rightarrow 8\% \pm 2.7\% \Rightarrow (\,.053,\ .107\,)$$
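The diabetes example can be reproduced with a small function. This is a sketch of the large-sample (normal approximation) interval; `proportion_interval` is a hypothetical helper, and checking the "greater than 5" condition with p̂ in place of the unknown p is an assumption of the sketch.

```python
import math

def proportion_interval(successes: int, n: int, z: float):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    # Assumption: check the normal-approximation rule using p_hat for the unknown p.
    if n * p_hat <= 5 or n * (1 - p_hat) <= 5:
        raise ValueError("normal approximation may not be appropriate")
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)

p_hat, margin, (lower, upper) = proportion_interval(successes=32, n=400, z=1.96)
print(f"p_hat = {p_hat:.2f}, margin = {margin:.3f}, CI = ({lower:.3f}, {upper:.3f})")
```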


67

Thinking Challenge

A member of a health department wishes to see what percentage of people in a community will support an environmental policy. Of 200 survey forms sent and received, 35 responded that they support the policy and the rest responded that they do not. Find a 90% confidence interval estimate of the percentage of the population in this community that supports the policy.

68

Confidence Interval Solution*

$\hat{p} = \frac{35}{200} = .175$, n = 200, $z_{\alpha/2}$ = 1.645

$$\hat{p} \pm z_{\alpha/2}\cdot\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \Rightarrow .175 \pm 1.645\cdot\sqrt{\frac{.175\,(.825)}{200}}$$

$$.175 \pm .0442 = 17.5\% \pm 4.42\% = (\,13.08\%,\ 21.92\%\,)$$

69

Example: Researchers wish to estimate the percentage of hospital employees infected by SARS in a certain country. Out of 500 randomly chosen hospital employees, 14 were infected. Find the 95% confidence interval estimate for the percentage of hospital employees infected by SARS in this country.

70

Sample Size

$$\text{C.I.: } \hat{p} \pm z_{\alpha/2}\cdot\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

$$\text{Margin of Error} = B = Z_{\alpha/2}\cdot\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

$$n = \frac{z_{\alpha/2}^{2}}{B^{2}}\cdot\hat{p}(1-\hat{p})$$

if pilot study is done.

or

$$n = \frac{z_{\alpha/2}^{2}}{B^{2}}\cdot 0.25$$

to get the largest sample to achieve the goal (p(1 − p) takes its largest value, 0.25, at p = 0.5).

71

Sample Size (No prior information on p)

Sample Size Example: If one wishes to do a survey to estimate the population proportion with 95% confidence and a margin of error of 3%, how large a sample is needed?

$Z_{\alpha/2}$ = 1.96; B = .03

n = (1.96² / .03²) × .25 = 1067.11

A sample of size 1068 is needed.

72

Sample Size (With prior information on p)

Sample Size Example: If one wishes to estimate the percentage of people infected with West Nile in a population with 95% confidence and a margin of error of 3%, how large a sample is needed? (A pilot study has been done, and the sample proportion was 6%.)

$Z_{\alpha/2}$ = 1.96; B = .03

n = (1.96² / .03²) × .06 × (1 – .06) = 240.7

A sample of size 241 is needed.
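Both sample size examples for a proportion, with and without prior information on p, fit one small function. In this sketch `sample_size_for_proportion` is a hypothetical helper; its default `p_guess=0.5` corresponds to the conservative factor 0.25 used when no pilot estimate is available.

```python
import math

def sample_size_for_proportion(z: float, bound: float, p_guess: float = 0.5) -> int:
    """Smallest n for which z * sqrt(p(1 - p) / n) does not exceed the bound B."""
    return math.ceil(z ** 2 * p_guess * (1 - p_guess) / bound ** 2)

n_no_prior = sample_size_for_proportion(z=1.96, bound=0.03)               # p(1 - p) = 0.25
n_pilot = sample_size_for_proportion(z=1.96, bound=0.03, p_guess=0.06)    # pilot estimate 6%
print(n_no_prior, n_pilot)
```

The pilot estimate cuts the required sample from 1068 to 241 because p(1 − p) is much smaller at p = .06 than at p = .5.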

How large a sample was used for pilot study?