Chapter 6 Estimation

PROBABILITY (6MTCOAE205)

Chapter 6

Estimation

Confidence Intervals

Contents of this chapter: Confidence Intervals for the Population

Mean, μ when Population Variance σ2 is Known when Population Variance σ2 is Unknown

Confidence Intervals for the Population Proportion, (large samples)

Confidence interval estimates for the variance of a normal population

p̂

Assist. Prof. Dr. İmran Göker 6-2

Definitions

An estimator of a population parameter is a random variable that depends on sample

information . . . whose value provides an approximation to this

unknown parameter

A specific value of that random variable is called an estimate


Point and Interval Estimates

A point estimate is a single number, a confidence interval provides additional

information about variability

Point EstimateLower Confidence Limit

UpperConfidence Limit

Width of confidence interval


Point Estimates

We can estimate a Population Parameter …

with a SampleStatistic

(a Point Estimate)

Mean

Proportion P

xμ

p̂


Unbiasedness

A point estimator is said to be an unbiased estimator of the parameter if the expected value, or mean, of the sampling distribution of is ,

Examples: The sample mean is an unbiased estimator of μ The sample variance s2 is an unbiased estimator of σ2

The sample proportion is an unbiased estimator of P

θ̂

θ̂

θ)θE( ˆ


x

p̂

Unbiasedness

is an unbiased estimator, is biased:

1θ̂ 2θ̂

θ̂θ

1θ̂ 2θ̂

(continued)


Bias

Let be an estimator of

The bias in is defined as the difference between its mean and

The bias of an unbiased estimator is 0

θ̂

θ̂

θ)θE()θBias( ˆˆ


Most Efficient Estimator

Suppose there are several unbiased estimators of The most efficient estimator or the minimum variance

unbiased estimator of is the unbiased estimator with the smallest variance

Let and be two unbiased estimators of , based on the same number of sample observations. Then,

is said to be more efficient than if

The relative efficiency of with respect to is the ratio of their variances:

)θVar()θVar( 21ˆˆ

)θVar()θVar( Efficiency Relative

1

2

ˆˆ

1θ̂ 2θ̂

1θ̂ 2θ̂

1θ̂ 2θ̂



How much uncertainty is associated with a point estimate of a population parameter?

An interval estimate provides more information about a population characteristic than does a point estimate

Such interval estimates are called confidence intervals


Confidence Interval Estimate

An interval gives a range of values: Takes into consideration variation in sample

statistics from sample to sample Based on observation from 1 sample Gives information about closeness to

unknown population parameters Stated in terms of level of confidence

Can never be 100% confident


Confidence Interval and Confidence Level

If P(a < < b) = 1 - then the interval from a to b is called a 100(1 - )% confidence interval of .

The quantity (1 - ) is called the confidence level of the interval ( between 0 and 1)

In repeated samples of the population, the true value of the parameter would be contained in 100(1 - )% of intervals calculated this way.

The confidence interval calculated in this manner is written as a < < b with 100(1 - )% confidence


Estimation Process

(mean, μ, is unknown)

Population

Random Sample

Mean X = 50

Sample

I am 95% confident that μ is between 40 & 60.


Confidence Level, (1-)

Suppose confidence level = 95% Also written (1 - ) = 0.95 A relative frequency interpretation:

From repeated samples, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter

A specific interval either will contain or will not contain the true parameter No probability involved in a specific interval

(continued)


General Formula

The general formula for all confidence intervals is:

The value of the reliability factor depends on the desired level of confidence

Point Estimate ± (Reliability Factor)(Standard Error)



Population Mean

σ2 Unknown

ConfidenceIntervals

PopulationProportion

σ2 Known


PopulationVariance

Confidence Interval for μ(σ2 Known)

Assumptions Population variance σ2 is known Population is normally distributed If population is not normal, use large sample

Confidence interval estimate:

(where z/2 is the normal distribution value for a probability of /2 in each tail)

nσzxμ

nσzx α/2α/2


Margin of Error The confidence interval,

Can also be written aswhere ME is called the margin of error

The interval width, w, is equal to twice the margin of error

nσzxμ

nσzx α/2α/2

MEx ±

nσzME α/2


Reducing the Margin of Error

The margin of error can be reduced if

the population standard deviation can be reduced (σ↓)

The sample size is increased (n↑)

The confidence level is decreased, (1 – ) ↓

nσzME α/2


Finding the Reliability Factor, z/2

Consider a 95% confidence interval:

z = -1.96 z = 1.96

.951

.0252α

.0252α

Point EstimateLower Confidence Limit

UpperConfidence Limit

Z units:

X units: Point Estimate

0

Find z.025 = ±1.96 from the standard normal distribution table


Common Levels of Confidence

Commonly used confidence levels are 90%, 95%, and 99%

Confidence Level

Confidence Coefficient,

Z/2 value

1.281.6451.962.332.583.083.27

.80

.90

.95

.98

.99

.998

.999

80%90%95%98%99%99.8%99.9%

1


Intervals and Level of Confidence

μμx


Intervals extend from

to

100(1-)%of intervals constructed contain μ;

100()% do not.

Sampling Distribution of the Mean

nσzxLCL

nσzxUCL

x

x1

x2

/2 /21


Example

A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.

Determine a 95% confidence interval for the true mean resistance of the population.


Example

A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is .35 ohms.

Solution:

2.4068μ1.9932

.2068 2.20

)11(.35/ 1.96 2.20

nσz x

±

±

±

(continued)


Interpretation

We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms

Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean



Population Mean

σ2 Unknown

ConfidenceIntervals


σ2 Known


PopulationVariance

Student’s t Distribution

Consider a random sample of n observations with mean x and standard deviation s from a normally distributed population with mean μ

Then the variable

follows the Student’s t distribution with (n - 1) degrees of freedom

ns/μxt


Confidence Interval for μ(σ2 Unknown)

If the population standard deviation σ is unknown, we can substitute the sample standard deviation, s

This introduces extra uncertainty, since s is variable from sample to sample

So we use the t distribution instead of the normal distribution


Confidence Interval for μ(σ Unknown)

Assumptions Population standard deviation is unknown Population is normally distributed If population is not normal, use large sample

Use Student’s t Distribution Confidence Interval Estimate:

where tn-1,α/2 is the critical value of the t distribution with n-1 d.f. and an area of α/2 in each tail:

nStxμ

nStx α/21,-nα/21,-n

(continued)

α/2)tP(t α/21,n1n


Margin of Error The confidence interval,

Can also be written as

where ME is called the margin of error:

MEx ±

nσtME α/21,-n


nStxμ

nStx α/21,-nα/21,-n


The t is a family of distributions The t value depends on degrees of

freedom (d.f.) Number of observations that are free to vary after

sample mean has been calculated

d.f. = n - 1



t0

t (df = 5)

t (df = 13)t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal

Standard Normal

(t with df = ∞)

Note: t Z as n increases


Student’s t Table

Upper Tail Area

df .10 .025.05

1 12.706

2

3 3.182

t0 2.920The body of the table contains t values, not probabilities

Let: n = 3 df = n - 1 = 2 = .10 /2 =.05

/2 = .05

3.078

1.886

1.638

6.314

2.920

2.353

4.303


t distribution valuesWith comparison to the Z value

Confidence t t t Z Level (10 d.f.) (20 d.f.) (30 d.f.) ____

.80 1.372 1.325 1.310 1.282

.90 1.812 1.725 1.697 1.645

.95 2.228 2.086 2.042 1.960

.99 3.169 2.845 2.750 2.576

Note: t Z as n increases


Example

A random sample of n = 25 has x = 50 and s = 8. Form a 95% confidence interval for μ

d.f. = n – 1 = 24, so

The confidence interval is

2.0639tt 24,.025α/21,n

53.302μ46.698258(2.0639)50μ

258(2.0639)50

nStxμ

nStx α/21,-n α/21,-n



Population Mean

σ2 Unknown

ConfidenceIntervals


σ2 Known


PopulationVariance

Confidence Intervals for the Population Proportion

An interval estimate for the population proportion ( P ) can be calculated by adding an allowance for uncertainty to the sample proportion ( ) p̂


Confidence Intervals for the Population Proportion, p

Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation

We will estimate this with sample data:

(continued)

n)p(1p ˆˆ

nP)P(1σP


Confidence Interval Endpoints

Upper and lower confidence limits for the population proportion are calculated with the formula

where z/2 is the standard normal value for the level of confidence desired is the sample proportion n is the sample size nP(1−P) > 5

n)p(1pzpP

n)p(1pzp α/2α/2

ˆˆˆˆˆˆ

p̂


Example

A random sample of 100 people shows that 25 are left-handed.

Form a 95% confidence interval for the true proportion of left-handers


Example A random sample of 100 people shows

that 25 are left-handed. Form a 95% confidence interval for the true proportion of left-handers.

(continued)

0.3349P0.1651

100.25(.75)1.96

10025P

100.25(.75)1.96

10025

n)p(1pzpP

n)p(1pzp α/2α/2

ˆˆˆˆˆˆ


Interpretation

We are 95% confident that the true percentage of left-handers in the population is between

16.51% and 33.49%.

Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.



Population Mean

σ2 Unknown

ConfidenceIntervals


σ2 Known


PopulationVariance

Confidence Intervals for the Population Variance

The confidence interval is based on the sample variance, s2

Assumed: the population is normally distributed

Assist. Prof. Dr. İmran Göker

Goal: Form a confidence interval for the population variance, σ2

6-44



The random variable

2

22

1n σ1)s(n

follows a chi-square distribution with (n – 1) degrees of freedom

(continued)

Where the chi-square value denotes the number for which2

, 1n

αχχ )P( 2α , 1n

21n

6-45



The (1 - )% confidence interval for the population variance is

2/2 - 1 , 1n

22

2/2 , 1n

2 1)s(nσ1)s(n

αα χχ

(continued)

6-46

Example

You are testing the speed of a batch of computer processors. You collect the following data (in Mhz):

Sample size 17Sample mean 3004Sample std dev 74


Assume the population is normal. Determine the 95% confidence interval for σx

2

6-47

Finding the Chi-square Values n = 17 so the chi-square distribution has (n – 1) = 16

degrees of freedom = 0.05, so use the the chi-square values with area

0.025 in each tail:


probability α/2 = .025

216

216= 28.85

6.91

28.85

20.975 , 16

2/2 - 1 , 1n

20.025 , 16

2/2 , 1n

χχ

χχ

α

α

216 = 6.91

probability α/2 = .025

6-48

Calculating the Confidence Limits

The 95% confidence interval is


Converting to standard deviation, we are 95% confident that the population standard deviation of

CPU speed is between 55.1 and 112.6 Mhz

2/2 - 1 , 1n

22

2/2 , 1n

2 1)s(nσ1)s(n

αα χχ

6.911)(74)(17σ

28.851)(74)(17 2

22

12683σ3037 2

6-49

Finite Populations

If the sample size is more than 5% of the population size (and sampling is without replacement) then a finite population correction factor must be used when calculating the standard error


Finite PopulationCorrection Factor

Suppose sampling is without replacement and the sample size is large relative to the population size

Assume the population size is large enough to apply the central limit theorem

Apply the finite population correction factor when estimating the population variance


1NnNfactor correction population finite

Ch.17-51

Estimating the Population Mean

Let a simple random sample of size n be taken from a population of N members with mean μ

The sample mean is an unbiased estimator of the population mean μ

The point estimate is:


n

1iix

n1x

Ch.17-52

Finite Populations: Mean

If the sample size is more than 5% of the population size, an unbiased estimator for the variance of the sample mean is

So the 100(1-α)% confidence interval for the population mean is


1NnN

nsσ

22xˆ

xα/21,-nxα/21,-n σtxμσt-x ˆˆ

Estimating the Population Total

Consider a simple random sample of size n from a population of size N

The quantity to be estimated is the population total Nμ

An unbiased estimation procedure for the population total Nμ yields the point estimate Nx

Assist. Prof. Dr. İmran Göker Ch.17-54

Estimating the Population Total

An unbiased estimator of the variance of the population total is

A 100(1 - )% confidence interval for the population total is


xα/21,-nxα/21,-n σNtxNNμσNtxN ˆˆ

1-Nn)(N

nsNσN

222

x2

ˆ

Ch.17-55

Confidence Interval for Population Total: Example


A firm has a population of 1000 accounts and wishes to estimate the total population value

A sample of 80 accounts is selected with average balance of $87.6 and standard

deviation of $22.3

Find the 95% confidence interval estimate of the total balance

Ch.17-56

Example Solution


The 95% confidence interval for the population total balance is $82,837.53 to $92,362.47

2392.65724559.6σN

5724559.6999920

80(22.3)(1000)

1-Nn)(N

nsNσN

x

22

222

x2

ˆ

ˆ

22.3s 87.6,x 80, n 1000,N

392.6)(1.9905)(26)(1000)(87.σNtxN x79,0.025 ±± ˆ

92362.47Nμ82837.53

Ch.17-57

Estimating the Population Proportion

Let the true population proportion be P Let be the sample proportion from n

observations from a simple random sample The sample proportion, , is an unbiased

estimator of the population proportion, P


p̂

p̂

Ch.17-58

Finite Populations: Proportion

If the sample size is more than 5% of the population size, an unbiased estimator for the variance of the population proportion is

So the 100(1-α)% confidence interval for the population proportion is


1NnN

n)p(1-pσ2

p

ˆˆˆ ˆ

pα/2pα/2 σzpPσz-p ˆˆ ˆˆˆˆ

Estimation: Additional Topics


Chapter Topics

Population Means,

Independent Samples

Population Means,

Dependent Samples

Sample Size Determination

Group 1 vs. independent

Group 2

Same group before vs. after

treatment

Finite Populations

Examples:

Population Proportions

Proportion 1 vs. Proportion 2

6-60

Large Populations


Dependent Samples

Tests Means of 2 Related Populations Paired or matched samples Repeated measures (before/after) Use difference between paired values:

Eliminates Variation Among Subjects Assumptions:

Both Populations Are Normally Distributed


Dependent samples

di = xi - yi

6-61

Mean Difference

The ith paired difference is di , where


di = xi - yi

The point estimate for the population mean paired difference is d : n

dd

n

1ii

n is the number of matched pairs in the sample

1n

)d(dS

n

1i

2i

d

The sample standard deviation is:

Dependent samples

6-62

Confidence Interval forMean Difference

The confidence interval for difference between population means, μd , is


Where n = the sample size (number of matched pairs in the paired sample)

nStdμ

nStd d

α/21,ndd

α/21,n

Dependent samples

6-63

Confidence Interval forMean Difference

The margin of error is

tn-1,/2 is the value from the Student’s t distribution with (n – 1) degrees of freedom for which


(continued)

2α)tP(t α/21,n1n

nstME d

α/21,n

Dependent samples

6-64

Six people sign up for a weight loss program. You collect the following data:


Paired Samples Example

Weight: Person Before (x) After (y) Difference, di

1 136 125 11 2 205 195 10 3 157 150 7 4 138 140 - 2 5 175 165 10 6 166 160 6 42

d = din

4.82

1n)d(d

S2

id

= 7.0

6-65

Dependent samples

For a 95% confidence level, the appropriate t value is tn-1,/2 = t5,.025 = 2.571

The 95% confidence interval for the difference between means, μd , is


12.06μ1.94

64.82(2.571)7μ

64.82(2.571)7

nStdμ

nStd

d

d

dα/21,nd

dα/21,n

Paired Samples Example (continued)

Since this interval contains zero, we cannot be 95% confident, given this limited data, that the weight loss program helps people lose weight

6-66

Dependent samples

Difference Between Two Means:Independent Samples

Different data sources Unrelated Independent

Sample selected from one population has no effect on the sample selected from the other population

The point estimate is the difference between the two sample means:


Population means, independent

samples

Goal: Form a confidence interval for the difference between two population means, μx – μy

x – y6-67



samples

Confidence interval uses z/2

Confidence interval uses a value from the Student’s t distribution

σx2 and σy

2 assumed equal

σx2 and σy

2 known

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

(continued)

6-68

Difference Between Two Means:Independent Samples

σx2 and σy

2 Known



samples

Assumptions:

Samples are randomly and

independently drawn

both population distributions

are normal

Population variances are known

*σx2 and σy

2 known

σx2 and σy

2 unknown

6-69

σx2 and σy

2 Known



samples

…and the random variable

has a standard normal distribution

When σx and σy are known and both populations are

normal, the variance of X – Y is y

2y

x

2x2

YX nσ

nσσ

(continued)

*

Y

2y

X

2x

YX

nσ

nσ

)μ(μ)yx(Z

σx2 and σy

2 known

σx2 and σy

2 unknown

6-70

Confidence Interval, σx

2 and σy2 Known


Population means,

independent samples The confidence interval for

μx – μy is:*

y

2Y

x

2X

α/2YXy

2Y

x

2X

α/2 nσ

nσz)yx(μμ

nσ

nσz)yx(

σx2 and σy

2 known

σx2 and σy

2 unknown

6-71

σx2 and σy

2 Unknown,Assumed Equal



samples

Assumptions:

Samples are randomly and independently drawn

Populations are normally distributed

Population variances are unknown but assumed equal

*σx2 and σy

2 assumed equal

σx2 and σy

2 known

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

6-72

σx2 and σy




samples

(continued)Forming interval estimates:

The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ

use a t value with (nx + ny – 2) degrees of freedom

*σx2 and σy

2 assumed equal

σx2 and σy

2 known

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

6-73

σx2 and σy




samplesThe pooled variance is

(continued)

* 2nn1)s(n1)s(n

syx

2yy

2xx2

p

σx

2 and σy2

assumed equal

σx2 and σy

2 known

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

6-74


2 and σy2 Unknown, Equal


The confidence interval for μ1 – μ2 is:

Where

*σx2 and σy

2 assumed equal

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

y

2p

x

2p

α/22,nnYXy

2p

x

2p

α/22,nn ns

ns

t)yx(μμns

ns

t)yx(yxyx

2nn1)s(n1)s(n

syx

2yy

2xx2

p

6-75

Pooled Variance Example

You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in Mhz):

CPUx CPUyNumber Tested 17 14Sample mean 3004 2538Sample std dev 74 56


Assume both populations are normal with equal variances, and use 95% confidence

6-76

Calculating the Pooled Variance


4427.031)141)-(17

56114741171)n(n

S1nS1nS

22

y

2yy

2xx2

p

(()1x

The pooled variance is:

The t value for a 95% confidence interval is:

2.045tt 0.025 , 29α/2 , 2nn yx

6-77

Calculating the Confidence Limits

The 95% confidence interval is


y

2p

x

2p

α/22,nnYXy

2p

x

2p

α/22,nn ns

ns

t)yx(μμns

ns

t)yx(yxyx

144427.03

174427.03(2.054)2538)(3004μμ

144427.03

174427.03(2.054)2538)(3004 YX

515.31μμ416.69 YX

We are 95% confident that the mean difference in CPU speed is between 416.69 and 515.31 Mhz.

6-78

σx2 and σy

2 Unknown,Assumed Unequal



samples

Assumptions:

Samples are randomly and independently drawn

Populations are normally distributed

Population variances are unknown and assumed unequal

*σx

2 and σy2

assumed equal

σx2 and σy

2 known

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

6-79

σx2 and σy

2 Unknown,Assumed Unequal



samples

(continued)

Forming interval estimates:

The population variances are assumed unequal, so a pooled variance is not appropriate

use a t value with degrees of freedom, where

σx2 and σy

2 known

σx2 and σy

2 unknown

*σx

2 and σy2

assumed equal

σx2 and σy

2 assumed unequal

1)/(nns

1)/(nns

)ns

()ns(

y

2

y

2y

x

2

x

2x

2

y

2y

x

2x

v

6-80


2 and σy2 Unknown, Unequal


The confidence interval for μ1 – μ2 is:

*σx

2 and σy2

assumed equal

σx2 and σy

2 unknown

σx2 and σy

2 assumed unequal

y

2y

x

2x

α/2,YXy

2y

x

2x

α/2, ns

nst)yx(μμ

ns

nst)yx(

1)/(nns

1)/(nns

)ns

()ns(

y

2

y

2y

x

2

x

2x

2

y

2y

x

2x

vWhere

6-81

Two Population Proportions


Goal: Form a confidence interval for the difference between two population proportions, Px – Py

The point estimate for the difference is

Population proportions

Assumptions: Both sample sizes are large (generally at least 40 observations in each sample)

yx pp ˆˆ

6-82

Two Population Proportions

The random variable

is approximately normally distributed



(continued)

y

yy

x

xx

yxyx

n)p(1p

n)p(1p

)p(p)pp(Z

ˆˆˆˆ

ˆˆ

6-83

Confidence Interval forTwo Population Proportions



The confidence limits for Px – Py are:

y

yy

x

xxyx n

)p(1pn

)p(1pZ )pp(ˆˆˆˆˆˆ

2/

±

6-84

Example: Two Population Proportions

Form a 90% confidence interval for the difference between the proportion of men and the proportion of women who have college degrees.

In a random sample, 26 of 50 men and 28 of 40 women had an earned college degree



Men:

Women:


0.101240

0.70(0.30)50

0.52(0.48)n

)p(1pn

)p(1p

y

yy

x

xx

ˆˆˆˆ

0.525026px ˆ

0.704028py ˆ

(continued)

For 90% confidence, Z/2 = 1.645

6-86


The confidence limits are:

so the confidence interval is

-0.3465 < Px – Py < -0.0135

Since this interval does not contain zero we are 90% confident that the two proportions are not equal


(continued)

(0.1012)1.645.70)(.52

n)p(1p

n)p(1pZ)pp(

y

yy

x

xxα/2yx

±

±

ˆˆˆˆˆˆ

6-87



For the Mean

DeterminingSample Size

For theProportion

6-88

Large Populations

Finite Populations

For the Mean

For theProportion

Margin of Error

The required sample size can be found to reach a desired margin of error (ME) with a specified level of confidence (1 - )

The margin of error is also called sampling error the amount of imprecision in the estimate of the

population parameter the amount added and subtracted to the point

estimate to form the confidence interval




nσzx α/2±

nσzME α/2

Margin of Error (sampling error)

6-90

For the Mean

Large Population

s



nσzME α/2

(continued)

2

22α/2

MEσzn Now solve

for n to get

6-91

For the Mean

Large Populations


To determine the required sample size for the mean, you must know:

The desired level of confidence (1 - ), which determines the z/2 value

The acceptable margin of error (sampling error), ME The population standard deviation, σ


(continued)

6-92

Required Sample Size Example

If = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?


(Always round up)

219.195

(45)(1.645)ME

σzn 2

22

2

22α/2

So the required sample size is n = 220

6-93

Sample Size Determination:Population Proportion


n)p(1pzp α/2

ˆˆˆ ±

n)p(1pzME α/2

ˆˆ

Margin of Error (sampling error)

6-94

For theProportion

Large Populations


2

2α/2

MEz 0.25n

Substitute 0.25 for and solve for n to get

(continued)

n)p(1pzME α/2

ˆˆ

cannot be larger than 0.25, when = 0.5

)p(1p ˆˆ

p̂)p(1p ˆˆ

6-95

For theProportion

Large Populations


The sample and population proportions, and P, are generally not known (since no sample has been taken yet)

P(1 – P) = 0.25 generates the largest possible margin of error (so guarantees that the resulting sample size will meet the desired level of confidence)

To determine the required sample size for the proportion, you must know:

The desired level of confidence (1 - ), which determines the critical z/2 value

The acceptable sampling error (margin of error), ME Estimate P(1 – P) = 0.25


(continued)

p̂

6-96


Required Sample Size Example:Population Proportion

How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?




Solution:

For 95% confidence, use z0.025 = 1.96

ME = 0.03Estimate P(1 – P) = 0.25

So use n = 1068

(continued)

1067.11(0.03)

6)(0.25)(1.9ME

z 0.25n 2

2

2

2α/2

6-98


Sample Size Determination:Finite Populations

Finite Populations

For the Mean

1NnN

nσ)XVar(

2

A finite population correction factor is added:

1. Calculate the required sample size n0 using the prior formula:

2. Then adjust for the finite population:

2

22α/2

0 MEσzn

1)-(NnNnn

0

0

Sample Size Determination:Finite Populations


Finite Populations

For theProportion

1NnN

nP)P(1-)pVar( ˆ

A finite population correction factor is added:

1. Solve for n:

2. The largest possible value for this expression (if P = 0.25) is:

3. A 95% confidence interval will extend ±1.96 from the sample proportion

P)P(11)σ(NP)NP(1n 2

p

ˆ

0.251)σ(NP)0.25(1n 2

p

ˆ

pσ ˆ

Example: Sample Size to Estimate Population Proportion


(continued)

pσ ˆHow large a sample would be necessary to estimate within ±5% the true proportion of college graduates in a population of 850 people with 95% confidence?



Solution: For 95% confidence, use z0.025 = 1.96 ME = 0.05

So use n = 265

(continued)

6-102

0.02551σ0.05σ1.96 pp ˆˆ

264.80.25551)(849)(0.02

)(0.25)(8500.251)σ(N

0.25Nn 22p

max

ˆ

Chapter 6 Estimation

Documents

Transcript of Chapter 6 Estimation