Chapter 6 Estimation
Embed Size (px)
description
Transcript of Chapter 6 Estimation

PROBABILITY (6MTCOAE205)
Chapter 6
Estimation

Confidence Intervals
Contents of this chapter: Confidence Intervals for the Population
Mean, μ when Population Variance σ2 is Known when Population Variance σ2 is Unknown
Confidence Intervals for the Population Proportion, (large samples)
Confidence interval estimates for the variance of a normal population
p̂
Assist. Prof. Dr. İmran Göker 6-2

Definitions
An estimator of a population parameter is a random variable that depends on sample
information . . . whose value provides an approximation to this
unknown parameter
A specific value of that random variable is called an estimate
Assist. Prof. Dr. İmran Göker 6-3

Point and Interval Estimates
A point estimate is a single number, a confidence interval provides additional
information about variability
Point EstimateLower Confidence Limit
UpperConfidence Limit
Width of confidence interval
Assist. Prof. Dr. İmran Göker 6-4

Point Estimates
We can estimate a Population Parameter …
with a SampleStatistic
(a Point Estimate)
Mean
Proportion P
xμ
p̂
Assist. Prof. Dr. İmran Göker 6-5

Unbiasedness
A point estimator is said to be an unbiased estimator of the parameter if the expected value, or mean, of the sampling distribution of is ,
Examples: The sample mean is an unbiased estimator of μ The sample variance s2 is an unbiased estimator of σ2
The sample proportion is an unbiased estimator of P
θ̂
θ̂
θ)θE( ˆ
Assist. Prof. Dr. İmran Göker 6-6
x
p̂

Unbiasedness
is an unbiased estimator, is biased:
1θ̂ 2θ̂
θ̂θ
1θ̂ 2θ̂
(continued)
Assist. Prof. Dr. İmran Göker 6-7

Bias
Let be an estimator of
The bias in is defined as the difference between its mean and
The bias of an unbiased estimator is 0
θ̂
θ̂
θ)θE()θBias( ˆˆ
Assist. Prof. Dr. İmran Göker 6-8

Most Efficient Estimator
Suppose there are several unbiased estimators of The most efficient estimator or the minimum variance
unbiased estimator of is the unbiased estimator with the smallest variance
Let and be two unbiased estimators of , based on the same number of sample observations. Then,
is said to be more efficient than if
The relative efficiency of with respect to is the ratio of their variances:
)θVar()θVar( 21ˆˆ
)θVar()θVar( Efficiency Relative
1
2
ˆˆ
1θ̂ 2θ̂
1θ̂ 2θ̂
1θ̂ 2θ̂
Assist. Prof. Dr. İmran Göker 6-9

Confidence Intervals
How much uncertainty is associated with a point estimate of a population parameter?
An interval estimate provides more information about a population characteristic than does a point estimate
Such interval estimates are called confidence intervals
Assist. Prof. Dr. İmran Göker 6-10

Confidence Interval Estimate
An interval gives a range of values: Takes into consideration variation in sample
statistics from sample to sample Based on observation from 1 sample Gives information about closeness to
unknown population parameters Stated in terms of level of confidence
Can never be 100% confident
Assist. Prof. Dr. İmran Göker 6-11

Confidence Interval and Confidence Level
If P(a < < b) = 1 - then the interval from a to b is called a 100(1 - )% confidence interval of .
The quantity (1 - ) is called the confidence level of the interval ( between 0 and 1)
In repeated samples of the population, the true value of the parameter would be contained in 100(1 - )% of intervals calculated this way.
The confidence interval calculated in this manner is written as a < < b with 100(1 - )% confidence
Assist. Prof. Dr. İmran Göker 6-12

Estimation Process
(mean, μ, is unknown)
Population
Random Sample
Mean X = 50
Sample
I am 95% confident that μ is between 40 & 60.
Assist. Prof. Dr. İmran Göker 6-13

Confidence Level, (1-)
Suppose confidence level = 95% Also written (1 - ) = 0.95 A relative frequency interpretation:
From repeated samples, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
A specific interval either will contain or will not contain the true parameter No probability involved in a specific interval
(continued)
Assist. Prof. Dr. İmran Göker 6-14

General Formula
The general formula for all confidence intervals is:
The value of the reliability factor depends on the desired level of confidence
Point Estimate ± (Reliability Factor)(Standard Error)
Assist. Prof. Dr. İmran Göker 6-15

Confidence Intervals
Population Mean
σ2 Unknown
ConfidenceIntervals
PopulationProportion
σ2 Known
Assist. Prof. Dr. İmran Göker 6-16
PopulationVariance

Confidence Interval for μ(σ2 Known)
Assumptions Population variance σ2 is known Population is normally distributed If population is not normal, use large sample
Confidence interval estimate:
(where z/2 is the normal distribution value for a probability of /2 in each tail)
nσzxμ
nσzx α/2α/2
Assist. Prof. Dr. İmran Göker 6-17

Margin of Error The confidence interval,
Can also be written aswhere ME is called the margin of error
The interval width, w, is equal to twice the margin of error
nσzxμ
nσzx α/2α/2
MEx ±
nσzME α/2
Assist. Prof. Dr. İmran Göker 6-18

Reducing the Margin of Error
The margin of error can be reduced if
the population standard deviation can be reduced (σ↓)
The sample size is increased (n↑)
The confidence level is decreased, (1 – ) ↓
nσzME α/2
Assist. Prof. Dr. İmran Göker 6-19

Finding the Reliability Factor, z/2
Consider a 95% confidence interval:
z = -1.96 z = 1.96
.951
.0252α
.0252α
Point EstimateLower Confidence Limit
UpperConfidence Limit
Z units:
X units: Point Estimate
0
Find z.025 = ±1.96 from the standard normal distribution table
Assist. Prof. Dr. İmran Göker 6-20

Common Levels of Confidence
Commonly used confidence levels are 90%, 95%, and 99%
Confidence Level
Confidence Coefficient,
Z/2 value
1.281.6451.962.332.583.083.27
.80
.90
.95
.98
.99
.998
.999
80%90%95%98%99%99.8%99.9%
1
Assist. Prof. Dr. İmran Göker 6-21

Intervals and Level of Confidence
μμx
Confidence Intervals
Intervals extend from
to
100(1-)%of intervals constructed contain μ;
100()% do not.
Sampling Distribution of the Mean
nσzxLCL
nσzxUCL
x
x1
x2
/2 /21
Assist. Prof. Dr. İmran Göker 6-22

Example
A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
Determine a 95% confidence interval for the true mean resistance of the population.
Assist. Prof. Dr. İmran Göker 6-23

Example
A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is .35 ohms.
Solution:
2.4068μ1.9932
.2068 2.20
)11(.35/ 1.96 2.20
nσz x
±
±
±
(continued)
Assist. Prof. Dr. İmran Göker 6-24

Interpretation
We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean
Assist. Prof. Dr. İmran Göker 6-25

Confidence Intervals
Population Mean
σ2 Unknown
ConfidenceIntervals
PopulationProportion
σ2 Known
Assist. Prof. Dr. İmran Göker 6-26
PopulationVariance

Student’s t Distribution
Consider a random sample of n observations with mean x and standard deviation s from a normally distributed population with mean μ
Then the variable
follows the Student’s t distribution with (n - 1) degrees of freedom
ns/μxt
Assist. Prof. Dr. İmran Göker 6-27

Confidence Interval for μ(σ2 Unknown)
If the population standard deviation σ is unknown, we can substitute the sample standard deviation, s
This introduces extra uncertainty, since s is variable from sample to sample
So we use the t distribution instead of the normal distribution
Assist. Prof. Dr. İmran Göker 6-28

Confidence Interval for μ(σ Unknown)
Assumptions Population standard deviation is unknown Population is normally distributed If population is not normal, use large sample
Use Student’s t Distribution Confidence Interval Estimate:
where tn-1,α/2 is the critical value of the t distribution with n-1 d.f. and an area of α/2 in each tail:
nStxμ
nStx α/21,-nα/21,-n
(continued)
α/2)tP(t α/21,n1n
Assist. Prof. Dr. İmran Göker 6-29

Margin of Error The confidence interval,
Can also be written as
where ME is called the margin of error:
MEx ±
nσtME α/21,-n
Assist. Prof. Dr. İmran Göker 6-30
nStxμ
nStx α/21,-nα/21,-n

Student’s t Distribution
The t is a family of distributions The t value depends on degrees of
freedom (d.f.) Number of observations that are free to vary after
sample mean has been calculated
d.f. = n - 1
Assist. Prof. Dr. İmran Göker 6-31

Student’s t Distribution
t0
t (df = 5)
t (df = 13)t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
Standard Normal
(t with df = ∞)
Note: t Z as n increases
Assist. Prof. Dr. İmran Göker 6-32

Student’s t Table
Upper Tail Area
df .10 .025.05
1 12.706
2
3 3.182
t0 2.920The body of the table contains t values, not probabilities
Let: n = 3 df = n - 1 = 2 = .10 /2 =.05
/2 = .05
3.078
1.886
1.638
6.314
2.920
2.353
4.303
Assist. Prof. Dr. İmran Göker 6-33

t distribution valuesWith comparison to the Z value
Confidence t t t Z Level (10 d.f.) (20 d.f.) (30 d.f.) ____
.80 1.372 1.325 1.310 1.282
.90 1.812 1.725 1.697 1.645
.95 2.228 2.086 2.042 1.960
.99 3.169 2.845 2.750 2.576
Note: t Z as n increases
Assist. Prof. Dr. İmran Göker 6-34

Example
A random sample of n = 25 has x = 50 and s = 8. Form a 95% confidence interval for μ
d.f. = n – 1 = 24, so
The confidence interval is
2.0639tt 24,.025α/21,n
53.302μ46.698258(2.0639)50μ
258(2.0639)50
nStxμ
nStx α/21,-n α/21,-n
Assist. Prof. Dr. İmran Göker 6-35

Confidence Intervals
Population Mean
σ2 Unknown
ConfidenceIntervals
PopulationProportion
σ2 Known
Assist. Prof. Dr. İmran Göker 6-36
PopulationVariance

Confidence Intervals for the Population Proportion
An interval estimate for the population proportion ( P ) can be calculated by adding an allowance for uncertainty to the sample proportion ( ) p̂
Assist. Prof. Dr. İmran Göker 6-37

Confidence Intervals for the Population Proportion, p
Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation
We will estimate this with sample data:
(continued)
n)p(1p ˆˆ
nP)P(1σP
Assist. Prof. Dr. İmran Göker 6-38

Confidence Interval Endpoints
Upper and lower confidence limits for the population proportion are calculated with the formula
where z/2 is the standard normal value for the level of confidence desired is the sample proportion n is the sample size nP(1−P) > 5
n)p(1pzpP
n)p(1pzp α/2α/2
ˆˆˆˆˆˆ
p̂
Assist. Prof. Dr. İmran Göker 6-39

Example
A random sample of 100 people shows that 25 are left-handed.
Form a 95% confidence interval for the true proportion of left-handers
Assist. Prof. Dr. İmran Göker 6-40

Example A random sample of 100 people shows
that 25 are left-handed. Form a 95% confidence interval for the true proportion of left-handers.
(continued)
0.3349P0.1651
100.25(.75)1.96
10025P
100.25(.75)1.96
10025
n)p(1pzpP
n)p(1pzp α/2α/2
ˆˆˆˆˆˆ
Assist. Prof. Dr. İmran Göker 6-41

Interpretation
We are 95% confident that the true percentage of left-handers in the population is between
16.51% and 33.49%.
Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.
Assist. Prof. Dr. İmran Göker 6-42

Confidence Intervals
Population Mean
σ2 Unknown
ConfidenceIntervals
PopulationProportion
σ2 Known
Assist. Prof. Dr. İmran Göker 6-43
PopulationVariance

Confidence Intervals for the Population Variance
The confidence interval is based on the sample variance, s2
Assumed: the population is normally distributed
Assist. Prof. Dr. İmran Göker
Goal: Form a confidence interval for the population variance, σ2
6-44

Confidence Intervals for the Population Variance
Assist. Prof. Dr. İmran Göker
The random variable
2
22
1n σ1)s(n
follows a chi-square distribution with (n – 1) degrees of freedom
(continued)
Where the chi-square value denotes the number for which2
, 1n
αχχ )P( 2α , 1n
21n
6-45

Confidence Intervals for the Population Variance
Assist. Prof. Dr. İmran Göker
The (1 - )% confidence interval for the population variance is
2/2 - 1 , 1n
22
2/2 , 1n
2 1)s(nσ1)s(n
αα χχ
(continued)
6-46

Example
You are testing the speed of a batch of computer processors. You collect the following data (in Mhz):
Sample size 17Sample mean 3004Sample std dev 74
Assist. Prof. Dr. İmran Göker
Assume the population is normal. Determine the 95% confidence interval for σx
2
6-47

Finding the Chi-square Values n = 17 so the chi-square distribution has (n – 1) = 16
degrees of freedom = 0.05, so use the the chi-square values with area
0.025 in each tail:
Assist. Prof. Dr. İmran Göker
probability α/2 = .025
216
216= 28.85
6.91
28.85
20.975 , 16
2/2 - 1 , 1n
20.025 , 16
2/2 , 1n
χχ
χχ
α
α
216 = 6.91
probability α/2 = .025
6-48

Calculating the Confidence Limits
The 95% confidence interval is
Assist. Prof. Dr. İmran Göker
Converting to standard deviation, we are 95% confident that the population standard deviation of
CPU speed is between 55.1 and 112.6 Mhz
2/2 - 1 , 1n
22
2/2 , 1n
2 1)s(nσ1)s(n
αα χχ
6.911)(74)(17σ
28.851)(74)(17 2
22
12683σ3037 2
6-49

Finite Populations
If the sample size is more than 5% of the population size (and sampling is without replacement) then a finite population correction factor must be used when calculating the standard error
Assist. Prof. Dr. İmran Göker 6-50

Finite PopulationCorrection Factor
Suppose sampling is without replacement and the sample size is large relative to the population size
Assume the population size is large enough to apply the central limit theorem
Apply the finite population correction factor when estimating the population variance
Assist. Prof. Dr. İmran Göker
1NnNfactor correction population finite
Ch.17-51

Estimating the Population Mean
Let a simple random sample of size n be taken from a population of N members with mean μ
The sample mean is an unbiased estimator of the population mean μ
The point estimate is:
Assist. Prof. Dr. İmran Göker
n
1iix
n1x
Ch.17-52

Finite Populations: Mean
If the sample size is more than 5% of the population size, an unbiased estimator for the variance of the sample mean is
So the 100(1-α)% confidence interval for the population mean is
Assist. Prof. Dr. İmran Göker 6-53
1NnN
nsσ
22xˆ
xα/21,-nxα/21,-n σtxμσt-x ˆˆ

Estimating the Population Total
Consider a simple random sample of size n from a population of size N
The quantity to be estimated is the population total Nμ
An unbiased estimation procedure for the population total Nμ yields the point estimate Nx
Assist. Prof. Dr. İmran Göker Ch.17-54

Estimating the Population Total
An unbiased estimator of the variance of the population total is
A 100(1 - )% confidence interval for the population total is
Assist. Prof. Dr. İmran Göker
xα/21,-nxα/21,-n σNtxNNμσNtxN ˆˆ
1-Nn)(N
nsNσN
222
x2
ˆ
Ch.17-55

Confidence Interval for Population Total: Example
Assist. Prof. Dr. İmran Göker
A firm has a population of 1000 accounts and wishes to estimate the total population value
A sample of 80 accounts is selected with average balance of $87.6 and standard
deviation of $22.3
Find the 95% confidence interval estimate of the total balance
Ch.17-56

Example Solution
Assist. Prof. Dr. İmran Göker
The 95% confidence interval for the population total balance is $82,837.53 to $92,362.47
2392.65724559.6σN
5724559.6999920
80(22.3)(1000)
1-Nn)(N
nsNσN
x
22
222
x2
ˆ
ˆ
22.3s 87.6,x 80, n 1000,N
392.6)(1.9905)(26)(1000)(87.σNtxN x79,0.025 ±± ˆ
92362.47Nμ82837.53
Ch.17-57

Estimating the Population Proportion
Let the true population proportion be P Let be the sample proportion from n
observations from a simple random sample The sample proportion, , is an unbiased
estimator of the population proportion, P
Assist. Prof. Dr. İmran Göker
p̂
p̂
Ch.17-58

Finite Populations: Proportion
If the sample size is more than 5% of the population size, an unbiased estimator for the variance of the population proportion is
So the 100(1-α)% confidence interval for the population proportion is
Assist. Prof. Dr. İmran Göker 6-59
1NnN
n)p(1-pσ2
p
ˆˆˆ ˆ
pα/2pα/2 σzpPσz-p ˆˆ ˆˆˆˆ

Estimation: Additional Topics
Assist. Prof. Dr. İmran Göker
Chapter Topics
Population Means,
Independent Samples
Population Means,
Dependent Samples
Sample Size Determination
Group 1 vs. independent
Group 2
Same group before vs. after
treatment
Finite Populations
Examples:
Population Proportions
Proportion 1 vs. Proportion 2
6-60
Large Populations
Confidence Intervals

Dependent Samples
Tests Means of 2 Related Populations Paired or matched samples Repeated measures (before/after) Use difference between paired values:
Eliminates Variation Among Subjects Assumptions:
Both Populations Are Normally Distributed
Assist. Prof. Dr. İmran Göker
Dependent samples
di = xi - yi
6-61

Mean Difference
The ith paired difference is di , where
Assist. Prof. Dr. İmran Göker
di = xi - yi
The point estimate for the population mean paired difference is d : n
dd
n
1ii
n is the number of matched pairs in the sample
1n
)d(dS
n
1i
2i
d
The sample standard deviation is:
Dependent samples
6-62

Confidence Interval forMean Difference
The confidence interval for difference between population means, μd , is
Assist. Prof. Dr. İmran Göker
Where n = the sample size (number of matched pairs in the paired sample)
nStdμ
nStd d
α/21,ndd
α/21,n
Dependent samples
6-63

Confidence Interval forMean Difference
The margin of error is
tn-1,/2 is the value from the Student’s t distribution with (n – 1) degrees of freedom for which
Assist. Prof. Dr. İmran Göker
(continued)
2α)tP(t α/21,n1n
nstME d
α/21,n
Dependent samples
6-64

Six people sign up for a weight loss program. You collect the following data:
Assist. Prof. Dr. İmran Göker
Paired Samples Example
Weight: Person Before (x) After (y) Difference, di
1 136 125 11 2 205 195 10 3 157 150 7 4 138 140 - 2 5 175 165 10 6 166 160 6 42
d = din
4.82
1n)d(d
S2
id
= 7.0
6-65
Dependent samples

For a 95% confidence level, the appropriate t value is tn-1,/2 = t5,.025 = 2.571
The 95% confidence interval for the difference between means, μd , is
Assist. Prof. Dr. İmran Göker
12.06μ1.94
64.82(2.571)7μ
64.82(2.571)7
nStdμ
nStd
d
d
dα/21,nd
dα/21,n
Paired Samples Example (continued)
Since this interval contains zero, we cannot be 95% confident, given this limited data, that the weight loss program helps people lose weight
6-66
Dependent samples

Difference Between Two Means:Independent Samples
Different data sources Unrelated Independent
Sample selected from one population has no effect on the sample selected from the other population
The point estimate is the difference between the two sample means:
Assist. Prof. Dr. İmran Göker
Population means, independent
samples
Goal: Form a confidence interval for the difference between two population means, μx – μy
x – y6-67

Assist. Prof. Dr. İmran Göker
Population means, independent
samples
Confidence interval uses z/2
Confidence interval uses a value from the Student’s t distribution
σx2 and σy
2 assumed equal
σx2 and σy
2 known
σx2 and σy
2 unknown
σx2 and σy
2 assumed unequal
(continued)
6-68
Difference Between Two Means:Independent Samples

σx2 and σy
2 Known
Assist. Prof. Dr. İmran Göker
Population means, independent
samples
Assumptions:
Samples are randomly and
independently drawn
both population distributions
are normal
Population variances are known
*σx2 and σy
2 known
σx2 and σy
2 unknown
6-69

σx2 and σy
2 Known
Assist. Prof. Dr. İmran Göker
Population means, independent
samples
…and the random variable
has a standard normal distribution
When σx and σy are known and both populations are
normal, the variance of X – Y is y
2y
x
2x2
YX nσ
nσσ
(continued)
*
Y
2y
X
2x
YX
nσ
nσ
)μ(μ)yx(Z
σx2 and σy
2 known
σx2 and σy
2 unknown
6-70

Confidence Interval, σx
2 and σy2 Known
Assist. Prof. Dr. İmran Göker
Population means,
independent samples The confidence interval for
μx – μy is:*
y
2Y
x
2X
α/2YXy
2Y
x
2X
α/2 nσ
nσz)yx(μμ
nσ
nσz)yx(
σx2 and σy
2 known
σx2 and σy
2 unknown
6-71

σx2 and σy
2 Unknown,Assumed Equal
Assist. Prof. Dr. İmran Göker
Population means, independent
samples
Assumptions:
Samples are randomly and independently drawn
Populations are normally distributed
Population variances are unknown but assumed equal
*σx2 and σy
2 assumed equal
σx2 and σy
2 known
σx2 and σy
2 unknown
σx2 and σy
2 assumed unequal
6-72

σx2 and σy
2 Unknown,Assumed Equal
Assist. Prof. Dr. İmran Göker
Population means, independent
samples
(continued)Forming interval estimates:
The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ
use a t value with (nx + ny – 2) degrees of freedom
*σx2 and σy
2 assumed equal
σx2 and σy
2 known
σx2 and σy
2 unknown
σx2 and σy
2 assumed unequal
6-73

σx2 and σy
2 Unknown,Assumed Equal
Assist. Prof. Dr. İmran Göker
Population means, independent
samplesThe pooled variance is
(continued)
* 2nn1)s(n1)s(n
syx
2yy
2xx2
p
σx
2 and σy2
assumed equal
σx2 and σy
2 known
σx2 and σy
2 unknown
σx2 and σy
2 assumed unequal
6-74

Confidence Interval, σx
2 and σy2 Unknown, Equal
Assist. Prof. Dr. İmran Göker
The confidence interval for μ1 – μ2 is:
Where
*σx2 and σy
2 assumed equal
σx2 and σy
2 unknown
σx2 and σy
2 assumed unequal
y
2p
x
2p
α/22,nnYXy
2p
x
2p
α/22,nn ns
ns
t)yx(μμns
ns
t)yx(yxyx
2nn1)s(n1)s(n
syx
2yy
2xx2
p
6-75

Pooled Variance Example
You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in Mhz):
CPUx CPUyNumber Tested 17 14Sample mean 3004 2538Sample std dev 74 56
Assist. Prof. Dr. İmran Göker
Assume both populations are normal with equal variances, and use 95% confidence
6-76

Calculating the Pooled Variance
Assist. Prof. Dr. İmran Göker
4427.031)141)-(17
56114741171)n(n
S1nS1nS
22
y
2yy
2xx2
p
(()1x
The pooled variance is:
The t value for a 95% confidence interval is:
2.045tt 0.025 , 29α/2 , 2nn yx
6-77

Calculating the Confidence Limits
The 95% confidence interval is
Assist. Prof. Dr. İmran Göker
y
2p
x
2p
α/22,nnYXy
2p
x
2p
α/22,nn ns
ns
t)yx(μμns
ns
t)yx(yxyx
144427.03
174427.03(2.054)2538)(3004μμ
144427.03
174427.03(2.054)2538)(3004 YX
515.31μμ416.69 YX
We are 95% confident that the mean difference in CPU speed is between 416.69 and 515.31 Mhz.
6-78

σx2 and σy
2 Unknown,Assumed Unequal
Assist. Prof. Dr. İmran Göker
Population means, independent
samples
Assumptions:
Samples are randomly and independently drawn
Populations are normally distributed
Population variances are unknown and assumed unequal
*σx
2 and σy2
assumed equal
σx2 and σy
2 known
σx2 and σy
2 unknown
σx2 and σy
2 assumed unequal
6-79

σx2 and σy
2 Unknown,Assumed Unequal
Assist. Prof. Dr. İmran Göker
Population means, independent
samples
(continued)
Forming interval estimates:
The population variances are assumed unequal, so a pooled variance is not appropriate
use a t value with degrees of freedom, where
σx2 and σy
2 known
σx2 and σy
2 unknown
*σx
2 and σy2
assumed equal
σx2 and σy
2 assumed unequal
1)/(nns
1)/(nns
)ns
()ns(
y
2
y
2y
x
2
x
2x
2
y
2y
x
2x
v
6-80

Confidence Interval, σx
2 and σy2 Unknown, Unequal
Assist. Prof. Dr. İmran Göker
The confidence interval for μ1 – μ2 is:
*σx
2 and σy2
assumed equal
σx2 and σy
2 unknown
σx2 and σy
2 assumed unequal
y
2y
x
2x
α/2,YXy
2y
x
2x
α/2, ns
nst)yx(μμ
ns
nst)yx(
1)/(nns
1)/(nns
)ns
()ns(
y
2
y
2y
x
2
x
2x
2
y
2y
x
2x
vWhere
6-81

Two Population Proportions
Assist. Prof. Dr. İmran Göker
Goal: Form a confidence interval for the difference between two population proportions, Px – Py
The point estimate for the difference is
Population proportions
Assumptions: Both sample sizes are large (generally at least 40 observations in each sample)
yx pp ˆˆ
6-82

Two Population Proportions
The random variable
is approximately normally distributed
Assist. Prof. Dr. İmran Göker
Population proportions
(continued)
y
yy
x
xx
yxyx
n)p(1p
n)p(1p
)p(p)pp(Z
ˆˆˆˆ
ˆˆ
6-83

Confidence Interval forTwo Population Proportions
Assist. Prof. Dr. İmran Göker
Population proportions
The confidence limits for Px – Py are:
y
yy
x
xxyx n
)p(1pn
)p(1pZ )pp(ˆˆˆˆˆˆ
2/
±
6-84

Example: Two Population Proportions
Form a 90% confidence interval for the difference between the proportion of men and the proportion of women who have college degrees.
In a random sample, 26 of 50 men and 28 of 40 women had an earned college degree
Assist. Prof. Dr. İmran Göker 6-85

Example: Two Population Proportions
Men:
Women:
Assist. Prof. Dr. İmran Göker
0.101240
0.70(0.30)50
0.52(0.48)n
)p(1pn
)p(1p
y
yy
x
xx
ˆˆˆˆ
0.525026px ˆ
0.704028py ˆ
(continued)
For 90% confidence, Z/2 = 1.645
6-86

Example: Two Population Proportions
The confidence limits are:
so the confidence interval is
-0.3465 < Px – Py < -0.0135
Since this interval does not contain zero we are 90% confident that the two proportions are not equal
Assist. Prof. Dr. İmran Göker
(continued)
(0.1012)1.645.70)(.52
n)p(1p
n)p(1pZ)pp(
y
yy
x
xxα/2yx
±
±
ˆˆˆˆˆˆ
6-87

Sample Size Determination
Assist. Prof. Dr. İmran Göker
For the Mean
DeterminingSample Size
For theProportion
6-88
Large Populations
Finite Populations
For the Mean
For theProportion

Margin of Error
The required sample size can be found to reach a desired margin of error (ME) with a specified level of confidence (1 - )
The margin of error is also called sampling error the amount of imprecision in the estimate of the
population parameter the amount added and subtracted to the point
estimate to form the confidence interval
Assist. Prof. Dr. İmran Göker 6-89

Sample Size Determination
Assist. Prof. Dr. İmran Göker
nσzx α/2±
nσzME α/2
Margin of Error (sampling error)
6-90
For the Mean
Large Population
s

Sample Size Determination
Assist. Prof. Dr. İmran Göker
nσzME α/2
(continued)
2
22α/2
MEσzn Now solve
for n to get
6-91
For the Mean
Large Populations

Sample Size Determination
To determine the required sample size for the mean, you must know:
The desired level of confidence (1 - ), which determines the z/2 value
The acceptable margin of error (sampling error), ME The population standard deviation, σ
Assist. Prof. Dr. İmran Göker
(continued)
6-92

Required Sample Size Example
If = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?
Assist. Prof. Dr. İmran Göker
(Always round up)
219.195
(45)(1.645)ME
σzn 2
22
2
22α/2
So the required sample size is n = 220
6-93

Sample Size Determination:Population Proportion
Assist. Prof. Dr. İmran Göker
n)p(1pzp α/2
ˆˆˆ ±
n)p(1pzME α/2
ˆˆ
Margin of Error (sampling error)
6-94
For theProportion
Large Populations

Assist. Prof. Dr. İmran Göker
2
2α/2
MEz 0.25n
Substitute 0.25 for and solve for n to get
(continued)
n)p(1pzME α/2
ˆˆ
cannot be larger than 0.25, when = 0.5
)p(1p ˆˆ
p̂)p(1p ˆˆ
6-95
For theProportion
Large Populations
Sample Size Determination:Population Proportion

The sample and population proportions, and P, are generally not known (since no sample has been taken yet)
P(1 – P) = 0.25 generates the largest possible margin of error (so guarantees that the resulting sample size will meet the desired level of confidence)
To determine the required sample size for the proportion, you must know:
The desired level of confidence (1 - ), which determines the critical z/2 value
The acceptable sampling error (margin of error), ME Estimate P(1 – P) = 0.25
Assist. Prof. Dr. İmran Göker
(continued)
p̂
6-96
Sample Size Determination:Population Proportion

Required Sample Size Example:Population Proportion
How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?
Assist. Prof. Dr. İmran Göker 6-97

Required Sample Size Example
Assist. Prof. Dr. İmran Göker
Solution:
For 95% confidence, use z0.025 = 1.96
ME = 0.03Estimate P(1 – P) = 0.25
So use n = 1068
(continued)
1067.11(0.03)
6)(0.25)(1.9ME
z 0.25n 2
2
2
2α/2
6-98

Assist. Prof. Dr. İmran Göker 6-99
Sample Size Determination:Finite Populations
Finite Populations
For the Mean
1NnN
nσ)XVar(
2
A finite population correction factor is added:
1. Calculate the required sample size n0 using the prior formula:
2. Then adjust for the finite population:
2
22α/2
0 MEσzn
1)-(NnNnn
0
0

Sample Size Determination:Finite Populations
Assist. Prof. Dr. İmran Göker 6-100
Finite Populations
For theProportion
1NnN
nP)P(1-)pVar( ˆ
A finite population correction factor is added:
1. Solve for n:
2. The largest possible value for this expression (if P = 0.25) is:
3. A 95% confidence interval will extend ±1.96 from the sample proportion
P)P(11)σ(NP)NP(1n 2
p
ˆ
0.251)σ(NP)0.25(1n 2
p
ˆ
pσ ˆ

Example: Sample Size to Estimate Population Proportion
Assist. Prof. Dr. İmran Göker 6-101
(continued)
pσ ˆHow large a sample would be necessary to estimate within ±5% the true proportion of college graduates in a population of 850 people with 95% confidence?

Required Sample Size Example
Assist. Prof. Dr. İmran Göker
Solution: For 95% confidence, use z0.025 = 1.96 ME = 0.05
So use n = 265
(continued)
6-102
0.02551σ0.05σ1.96 pp ˆˆ
264.80.25551)(849)(0.02
)(0.25)(8500.251)σ(N
0.25Nn 22p
max
ˆ