Post on 18-Jan-2016
ETM 620 - 09U1 ETM 620 - 09U1
Random samples and estimation
Chapter 9: Random samples & sampling distributions
  Samples and populations
  χ², t, and F distributions
Chapter 10: Parameter estimation
  Point estimation
    Standard error of a statistic
    Method of maximum likelihood
    Method of moments
  One-sample and two-sample confidence interval estimation
Foundation for understanding the next few chapters
ETM 620 - 09U2 ETM 620 - 09U2
Ch. 9: Populations and samples
Population: “a group of individual persons, objects, or items from which samples are taken for statistical measurement”
Sample: “a finite part of a statistical population whose properties are studied to gain information about the whole”
(Merriam-Webster Online Dictionary, http://www.m-w.com/, October 5, 2004)
ETM 620 - 09U3 ETM 620 - 09U3
Examples
Population
  Students pursuing graduate engineering degrees
  Cars capable of speeds in excess of 160 mph
  Potato chips produced at the Frito-Lay plant in Kathleen
  Freshwater lakes and rivers
Samples
  In general, (x1, x2, x3, …, xn) is a random sample of size n if:
    the x’s are independent random variables
    every observation is equally likely (has the same probability)
ETM 620 - 09U4 ETM 620 - 09U4
Sampling distributions
If we conduct the same experiment several times with the same sample size, the probability distribution of the resulting statistic is called a sampling distribution.
Sampling distribution of the mean: if n observations are taken from a normal population with mean μ and variance σ², then:
$$\mu_{\bar{x}} = \frac{\mu + \mu + \dots + \mu}{n} = \mu$$
$$\sigma^2_{\bar{x}} = \frac{\sigma^2 + \sigma^2 + \dots + \sigma^2}{n^2} = \frac{\sigma^2}{n}$$
ETM 620 - 09U5
An important consideration …
x̄ will be different for every sample.
For example, suppose the time to complete a typical homework problem, in minutes, is known to be uniformly distributed between 5 and 25. Four people are asked to record the time it takes them to complete each of 31 different problems.
ETM 620 - 09U6
Individual data points
μ = __________________
σ2 = _________________
σ = __________________
Problem # 1 2 3 4
1 12.64 7.01 16.93 22.98
2 22.69 24.17 5.29 13.15
3 22.26 7.77 9.90 5.91
4 5.65 8.28 9.39 5.34
5 10.70 11.86 16.07 12.15
6 12.44 12.11 23.21 14.32
7 13.52 11.08 24.51 21.13
8 24.82 10.13 24.03 6.07
9 19.10 21.33 24.45 14.33
10 11.00 20.00 12.03 20.51
11 6.49 8.97 6.28 12.17
12 14.74 15.22 12.47 24.72
13 5.81 9.61 5.10 23.52
14 7.01 10.13 20.51 18.59
15 21.18 19.49 6.70 7.65
16 20.12 17.53 8.47 13.10
17 16.05 19.23 16.10 8.62
18 24.41 18.74 15.58 20.93
19 21.11 10.24 8.56 22.34
20 7.30 6.19 20.23 19.77
21 24.73 23.51 23.08 15.90
22 15.02 18.50 14.80 7.92
23 5.76 20.93 18.43 19.63
24 16.69 8.04 22.84 12.56
25 9.01 9.12 11.68 11.50
26 11.00 21.04 18.92 10.43
27 23.08 5.78 19.18 14.07
28 15.33 10.13 10.83 21.04
29 20.78 18.52 20.11 23.97
30 17.39 19.44 24.36 12.37
31 22.01 16.14 22.46 13.82
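The population parameters for the homework-time example can be checked with a short sketch. It assumes the standard results for a continuous uniform distribution on [a, b], namely mean (a + b)/2 and variance (b − a)²/12; the function name `uniform_parameters` is made up for this illustration.

```python
import math

def uniform_parameters(a, b):
    """Return (mean, variance, standard deviation) of a continuous uniform distribution on [a, b]."""
    mean = (a + b) / 2
    variance = (b - a) ** 2 / 12
    return mean, variance, math.sqrt(variance)

# Homework times are stated to be uniform between 5 and 25 minutes.
mu, sigma2, sigma = uniform_parameters(5, 25)
print(f"mu = {mu:.2f}, sigma^2 = {sigma2:.2f}, sigma = {sigma:.2f}")
```

These are the parameters of the individual data points, not of the sample means.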
ETM 620 - 09U7
Sample means
μ_x̄ = __________________
σ²_x̄ = _________________
σ_x̄ = __________________
Problem # 1 2 3 4 average
1 12.64 7.01 16.93 22.98 14.89
2 22.69 24.17 5.29 13.15 16.32
3 22.26 7.77 9.90 5.91 11.46
4 5.65 8.28 9.39 5.34 7.17
5 10.70 11.86 16.07 12.15 12.70
6 12.44 12.11 23.21 14.32 15.52
7 13.52 11.08 24.51 21.13 17.56
8 24.82 10.13 24.03 6.07 16.26
9 19.10 21.33 24.45 14.33 19.80
10 11.00 20.00 12.03 20.51 15.89
11 6.49 8.97 6.28 12.17 8.48
12 14.74 15.22 12.47 24.72 16.79
13 5.81 9.61 5.10 23.52 11.01
14 7.01 10.13 20.51 18.59 14.06
15 21.18 19.49 6.70 7.65 13.75
16 20.12 17.53 8.47 13.10 14.81
17 16.05 19.23 16.10 8.62 15.00
18 24.41 18.74 15.58 20.93 19.91
19 21.11 10.24 8.56 22.34 15.56
20 7.30 6.19 20.23 19.77 13.37
21 24.73 23.51 23.08 15.90 21.80
22 15.02 18.50 14.80 7.92 14.06
23 5.76 20.93 18.43 19.63 16.19
24 16.69 8.04 22.84 12.56 15.03
25 9.01 9.12 11.68 11.50 10.33
26 11.00 21.04 18.92 10.43 15.35
27 23.08 5.78 19.18 14.07 15.53
28 15.33 10.13 10.83 21.04 14.33
29 20.78 18.52 20.11 23.97 20.84
30 17.39 19.44 24.36 12.37 18.39
31 22.01 16.14 22.46 13.82 18.61
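A rough check of how the averages of four times should behave, assuming the uniform(5, 25) population stated earlier (μ = 15, σ² = 400/12). The theoretical values come from μ_x̄ = μ and σ²_x̄ = σ²/n; the simulation part uses an arbitrary seed and replication count chosen only for this sketch.

```python
import random
import statistics

MU = 15.0          # population mean of uniform(5, 25)
SIGMA2 = 400 / 12  # population variance of uniform(5, 25)
N = 4              # observations averaged per problem (four people)

# Theoretical sampling distribution of the mean
mu_xbar = MU
var_xbar = SIGMA2 / N
sd_xbar = var_xbar ** 0.5

# Simulation: draw many samples of size N and average each one
rng = random.Random(620)  # arbitrary seed so the run is repeatable
means = [
    statistics.fmean([rng.uniform(5, 25) for _ in range(N)])
    for _ in range(20_000)
]
sim_mean = statistics.fmean(means)
sim_var = statistics.variance(means)

print(f"theory:     mean = {mu_xbar:.2f}, variance = {var_xbar:.2f}, sd = {sd_xbar:.2f}")
print(f"simulation: mean = {sim_mean:.2f}, variance = {sim_var:.2f}")
```

The simulated values should land close to the theoretical ones, and the spread of the averages is visibly smaller than the spread of the individual times.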
ETM 620 - 09U8 ETM 620 - 09U8
Central Limit Theorem
Given:
  X̄ : the mean of a random sample of size n taken from a population with mean μ and finite variance σ²,
Then, the limiting form of the distribution of
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}, \quad n \to \infty,$$
is _________________________
ETM 620 - 09U9 ETM 620 - 09U9
Central Limit Theorem
If the population is known to be normal, the sampling distribution of X̄ will follow a normal distribution.
Even when the distribution of the population is not normal, the sampling distribution of X̄ is approximately normal when n is large.
NOTE: when n is not large, we cannot assume the distribution of X̄ is normal unless the population itself is normal.
ETM 620 - 09U10 ETM 620 - 09U10
Sampling distribution of S²: χ²
Given:
  Z1, Z2, …, Zk: independent, normally distributed random variables with mean μ = 0 and standard deviation σ = 1.
Then,
$$\chi^2 = Z_1^2 + Z_2^2 + \dots + Z_k^2$$
follows a χ² distribution with k degrees of freedom and probability density function
$$f(u) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, u^{(k/2)-1} e^{-u/2}, \quad u > 0$$
(eq. 9-15, pg. 208)
μ = k, σ² = 2k
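As a sanity check on the χ² density, the sketch below integrates it numerically and confirms that the area is 1, the mean is k, and the variance is 2k. The choice k = 4, the midpoint-rule integrator, and the upper integration limit of 100 are assumptions made for this illustration only.

```python
import math

def chi2_pdf(u, k):
    """Chi-square density with k degrees of freedom (zero for u <= 0)."""
    if u <= 0:
        return 0.0
    return u ** (k / 2 - 1) * math.exp(-u / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def integrate(g, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of g from lo to hi."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

k = 4  # degrees of freedom chosen for the illustration
area = integrate(lambda u: chi2_pdf(u, k), 0, 100)
chi2_mean = integrate(lambda u: u * chi2_pdf(u, k), 0, 100)
chi2_var = integrate(lambda u: (u - k) ** 2 * chi2_pdf(u, k), 0, 100)

print(f"area = {area:.4f}, mean = {chi2_mean:.4f}, variance = {chi2_var:.4f}")
```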
ETM 620 - 09U11 ETM 620 - 09U11
χ2 Distribution
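χ²_α represents the χ² value above which we find an area of α, that is, for which P(χ² > χ²_α) = α.
In Excel, =CHIDIST(x,degrees_freedom) gives the area to the right of x.
χ² is additive, so if Y = ∑ χ²_i for independent χ²_i, then k_Y = ∑ k_i.
Sample variance (sample of size n from a normal population):
$$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$$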
ETM 620 - 09U12 ETM 620 - 09U12
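Student’s t Distribution
If Z ~ N(0,1) and V is a chi-square random variable with k degrees of freedom, independent of Z, then
$$T = \frac{Z}{\sqrt{V/k}}$$
follows a t-distribution with k degrees of freedom. The probability density function is
$$f(t) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot \frac{1}{[(t^2/k) + 1]^{(k+1)/2}}, \quad -\infty < t < \infty$$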
ETM 620 - 09U13 ETM 620 - 09U13
t-Distribution
Example 9-7 shows that
$$T = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$
follows a t distribution. In other words, the standardized sample mean follows t(n−1) when σ is not known but is estimated by s.
In Excel, =TDIST(x,degrees_freedom,tails) gives the probability associated with getting a value above x (tails = 1) or outside ±x (tails = 2). =TINV(probability,degrees_freedom) gives the t-value associated with a desired two-tailed probability, α.
ETM 620 - 09U14 ETM 620 - 09U14
F-Distribution
Given:
  S1² and S2², the variances of independent random samples of size n1 and n2 taken from normal populations with variances σ1² and σ2², respectively,
Then,
$$F = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2}$$
follows an F-distribution with ν1 = n1 − 1 and ν2 = n2 − 1 degrees of freedom.
Table V, pp 605-609 gives F-values associated with given α values.
In Excel, =FDIST(x,degrees_freedom1,degrees_freedom2) gives the probability associated with a given x-value, while =FINV(probability,degrees_freedom1,degrees_freedom2) gives the F-value associated with a given α.
ETM 620 - 09U15
Ch. 10: Parameter estimation
Example: Say we have 5 numbers from a random sample, as follows:
  19, 58, 31, 44, 43
x̄ = ____________________ is an estimate of μ
s² = _____________________ is an estimate of σ²
We want to use “good” estimators (unbiased, minimum error)
  Unbiased, i.e. E(θ̂) = θ (e.g., E(x̄) = ___, and E(S²) = __)
  Minimum error: for an unbiased estimator, MSE(θ̂) = E(θ̂ − θ)² = Var(θ̂)
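The two point estimates for the five-number sample can be computed with Python's standard library, as a quick check on hand calculations. `statistics.variance` uses the n − 1 divisor, which matches the sample variance s².

```python
import statistics

sample = [19, 58, 31, 44, 43]

x_bar = statistics.mean(sample)    # point estimate of mu
s2 = statistics.variance(sample)   # point estimate of sigma^2 (divides by n - 1)

print(f"x-bar = {float(x_bar):.1f}, s^2 = {float(s2):.1f}")
```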
ETM 620 - 09U16
Finding good estimators
Method of maximum likelihood
  Take a random sample of n observations (x1, x2, x3, …, xn) from a distribution with function f(x,θ)
  Likelihood function, L(θ) = f(x1,θ) ∙ f(x2,θ) ∙ f(x3,θ) ∙ ∙ ∙ f(xn,θ)
  Take the derivative with respect to θ and set it to 0.
  See example 10-4, pg. 222
  Not always unbiased, but can be modified to make it so.
Method of moments
  The first k moments about the origin of any function are
$$\mu'_t = E(X^t) = \int_{-\infty}^{\infty} x^t f(x; \theta_1, \theta_2, \dots, \theta_k)\, dx, \quad t = 1, 2, \dots, k$$
  Can produce good estimators, but sometimes not as good as MLE (for example).
ETM 620 - 09U17
Interval estimation
(1 – α)100% confidence interval for the unknown parameter
  For some parameter, θ (e.g., μ), we are looking for L and U such that
    P{L < θ < U} = 1 – α
    or _______________
    or ________________
ETM 620 - 09U18 ETM 620 - 09U18
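Single sample: Estimating the mean
Given:
  σ is known and X̄ is the mean of a random sample of size n,
Then, the (1 – α)100% confidence interval for μ is given by
$$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$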
ETM 620 - 09U19 ETM 620 - 09U19
Example: mean with known variance
A random sample of size 25 is taken from a normal distribution with unknown mean and known variance of 4 (i.e., N(μ,4)). The sample mean, x̄, is determined to be 13.2. What is the 90% confidence interval for the mean?
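One way to check the arithmetic for this example is the sketch below, which applies x̄ ± z_{α/2}·σ/√n with the z-value taken from `statistics.NormalDist`. The function name `z_confidence_interval` is invented for this illustration.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, upper-tail critical value
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# n = 25, known variance 4 (sigma = 2), sample mean 13.2, 90% confidence
ci_lower, ci_upper = z_confidence_interval(13.2, math.sqrt(4), 25, 0.90)
print(f"{ci_lower:.3f} < mu < {ci_upper:.3f}")
```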
ETM 620 - 09U20
What does this mean?
Measure of the precision of the estimate
Length of the interval is a function of
  confidence level
  variance
  sample size
Can vary n to decrease the length of the interval for the same confidence level.
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$
For our example, suppose we want an error of 0.25 or less. Then,
n = ___________________________________________
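The sample-size calculation can be sketched as follows, assuming the same setting as the known-variance example (σ = 2, 90% confidence) and rounding up to the next whole observation. The function name `sample_size_for_mean` is invented for this illustration.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, error, confidence):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sigma / error) ** 2)

# sigma = 2, desired error E = 0.25, 90% confidence
n_required = sample_size_for_mean(2.0, 0.25, 0.90)
print(f"n = {n_required}")
```

Rounding up rather than to the nearest integer keeps the error at or below the target.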
ETM 620 - 09U21
What if σ² is unknown?
If n is sufficiently large (> _______), then the large sample confidence interval is:
$$\bar{X} \pm z_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
Otherwise, must use the t-statistic …
ETM 620 - 09U22
Single sample estimate of the mean (σ unknown, n not large)
Given:
  σ is unknown and X̄ is the mean of a random sample of size n (where n is not large),
Then, the (1 – α)100% confidence interval for μ is given by
$$\bar{X} - t_{\alpha/2,\,n-1}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2,\,n-1}\left(\frac{s}{\sqrt{n}}\right)$$
ETM 620 - 09U23
Example
A traffic engineer is concerned about the delays at an intersection near a local school. The intersection is equipped with a fully actuated (“demand”) traffic light and there have been complaints that traffic on the main street is subject to unacceptable delays.
To develop a benchmark, the traffic engineer randomly samples 25 stop times (in seconds) on a weekend day. The average of these times is found to be 13.2 seconds, and the sample variance, s², is found to be 4 seconds².
Based on this data, what is the 95% confidence interval (C.I.) around the mean stop time during a weekend day?
ETM 620 - 09U24
Example (cont.)
X̄ = ______________ s = _______________
α = ________________ α/2 = _____________
t0.025,24 = _____________
__________________ < μ < ___________________
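The traffic example can be worked through with the sketch below. The standard library has no t-distribution, so the critical value t0.025,24 = 2.064 is entered by hand from a standard t table; that constant and the function name `t_confidence_interval` are assumptions of this illustration.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Two-sided confidence interval for the mean when sigma is estimated by s."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

T_0025_24 = 2.064  # t_{0.025, 24} from a standard t table (entered by hand)

# n = 25 stop times, sample mean 13.2 s, sample variance 4 s^2 (s = 2 s)
t_lower, t_upper = t_confidence_interval(13.2, math.sqrt(4), 25, T_0025_24)
print(f"{t_lower:.3f} < mu < {t_upper:.3f}")
```

The interval is slightly wider than a z-based interval at the same confidence level because t0.025,24 exceeds z0.025.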
ETM 620 - 09U25
C.I. on the variance
Given that
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
is ~ χ² with n − 1 degrees of freedom, then
$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$$
gives the 100(1 − α)% two-sided confidence interval on the variance.
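To illustrate the variance interval, the sketch below reuses the traffic numbers (n = 25, s² = 4) at 95% confidence. This pairing is chosen only for illustration and is not an example from the slides. The critical values χ²0.025,24 = 39.364 and χ²0.975,24 = 12.401 are entered by hand from a standard chi-square table.

```python
def variance_confidence_interval(s2, n, chi2_upper, chi2_lower):
    """Two-sided CI on sigma^2: (n-1)s^2 / chi2_{alpha/2} to (n-1)s^2 / chi2_{1-alpha/2}."""
    numerator = (n - 1) * s2
    return numerator / chi2_upper, numerator / chi2_lower

CHI2_0025_24 = 39.364  # chi-square value with area 0.025 to its right, 24 d.f. (table value)
CHI2_0975_24 = 12.401  # chi-square value with area 0.975 to its right, 24 d.f. (table value)

var_lower, var_upper = variance_confidence_interval(4.0, 25, CHI2_0025_24, CHI2_0975_24)
print(f"{var_lower:.3f} <= sigma^2 <= {var_upper:.3f}")
```

The larger critical value goes in the lower limit, which is why the interval is not symmetric about s².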
ETM 620 - 09U26
Confidence interval on a proportion
The proportion, P, in a binomial experiment may be estimated by
$$\hat{P} = \frac{X}{n}$$
where X is the number of successes in n trials.
For a sample, the point estimate of the parameter is
$$\hat{p} = \frac{x}{n}$$
The mean of the sample proportion is $\mu_{\hat{p}} = p$ and its variance is $\sigma^2_{\hat{p}} = \frac{pq}{n}$, where q = 1 − p.
ETM 620 - 09U27
C.I. for proportions
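An approximate (1 − α)100% confidence interval for p is:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
Large-sample C.I. for p1 – p2 is:
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$
Interpretation: _______________________________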
ETM 620 - 09U28
Example 10.17 (pg. 240)
n = 75 x = 12
z0.025= ________
Picture:
C.I.:
Interpretation: ____________________________________
p̂ = ____________
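The numbers in this example (n = 75, x = 12, with z0.025 implying 95% confidence) can be run through the large-sample interval p̂ ± z_{α/2}·√(p̂q̂/n). The sketch below does that; the function name `proportion_confidence_interval` is invented here.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence):
    """Approximate large-sample confidence interval for a binomial proportion."""
    p_hat = x / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p_hat * q_hat / n)
    return p_hat, p_hat - half_width, p_hat + half_width

p_hat, p_lower, p_upper = proportion_confidence_interval(12, 75, 0.95)
print(f"p-hat = {p_hat:.2f}, {p_lower:.4f} < p < {p_upper:.4f}")
```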
ETM 620 - 09U29
Setting the sample size …
If the estimate for p from the initial sample seems pretty reliable, then
$$n = \frac{z_{\alpha/2}^2}{E^2}\,\hat{p}(1 - \hat{p})$$
e.g., for our example if we want to be 95% confident that the error in our estimate is less than 0.05, then
n = __________________
If we’re not at all sure how to estimate p, then assume p = 0.5 and use
$$n = \frac{z_{\alpha/2}^2}{E^2}(0.25)$$
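Both sample-size formulas can be compared with a short sketch. It assumes the earlier proportion example (p̂ = 12/75 = 0.16), a target error of 0.05 at 95% confidence, and rounding up to a whole observation; the function name `sample_size_for_proportion` is invented here.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(error, confidence, p=0.5):
    """Smallest n with z_{alpha/2} * sqrt(p(1-p)/n) <= error; p = 0.5 is the conservative choice."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)

n_with_estimate = sample_size_for_proportion(0.05, 0.95, p=12 / 75)
n_conservative = sample_size_for_proportion(0.05, 0.95)  # assumes p = 0.5
print(f"using p-hat = 0.16: n = {n_with_estimate}")
print(f"using p = 0.5:      n = {n_conservative}")
```

Because p(1 − p) is largest at p = 0.5, the conservative formula never gives a smaller n than the one based on an estimate.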
ETM 620 - 09U30
Example: comparing 2 proportions
Look at example 10-23, pg. 250
1. C.I. = (-0.07, 0.15), therefore no reason to believe there is a significant decrease in the proportion of defectives using the new process.
2. What if the interval were (+0.07, 0.15)?
3. What if the interval were (-0.9, -0.7)?
ETM 620 - 09U31 ETM 620 - 09U31
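Difference in 2 means, both σ² known
Given two independent random samples, a point estimate of the difference between μ1 and μ2 is given by the statistic
$$\bar{x}_1 - \bar{x}_2$$
We can build a confidence interval for μ1 − μ2 (given σ1² and σ2² known) as follows:
$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$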
ETM 620 - 09U32
An example
A farm equipment manufacturer wants to compare the average daily downtime of two sheet-metal stamping machines located in two different factories. Investigation of company records for 100 randomly selected days on each of the two machines gave the following results:
  x̄1 = 12 minutes  x̄2 = 10 minutes
  σ1² = 12  σ2² = 8
  n1 = n2 = 100
Construct a 95% C.I. for μ1 – μ2
ETM 620 - 09U33
Solution
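$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
α/2 = _____________
z_____ = ____________
__________________ < μ1 – μ2 < _________________
Interpretation:
Picture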
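The downtime example (x̄1 = 12, x̄2 = 10, σ1² = 12, σ2² = 8, n1 = n2 = 100, 95% confidence) can be checked numerically with the sketch below. The function name `two_mean_z_interval` is invented for this illustration.

```python
import math
from statistics import NormalDist

def two_mean_z_interval(xbar1, xbar2, var1, var2, n1, n2, confidence):
    """CI for mu1 - mu2 when both population variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(var1 / n1 + var2 / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width

diff_lower, diff_upper = two_mean_z_interval(12, 10, 12, 8, 100, 100, 0.95)
print(f"{diff_lower:.4f} < mu1 - mu2 < {diff_upper:.4f}")
```

When reading the result, check whether the interval contains 0, since that determines whether the data suggest any difference in mean downtime.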
ETM 620 - 09U34 ETM 620 - 09U34
Differences in 2 means, σ2 unknown
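Case 1: σ1² and σ2² unknown but equal
$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Where,
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$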
ETM 620 - 09U35 ETM 620 - 09U35
Differences in 2 means, σ2 unknown
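Case 2: σ1² and σ2² unknown and not equal
$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Where,
$$\nu = \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$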
ETM 620 - 09U36 ETM 620 - 09U36
Example, σ2 unknown
Suppose the farm equipment manufacturer was unable to gather data for 100 days. Using the data they were able to gather, they would still like to compare the downtime for the two machines. The data they gathered is as follows:
  x̄1 = 12 minutes  x̄2 = 10 minutes
  s1² = 12  s2² = 8
  n1 = 18  n2 = 14
Construct a 95% C.I. for μ1 – μ2 assuming:
1. σ1² and σ2² unknown but equal
2. σ1² and σ2² unknown and not equal
ETM 620 - 09U37 ETM 620 - 09U37
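Solution: Case 1
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
Sp² = _____________________
x̄1 − x̄2 = _____________
t____ , ________= ____________
$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
__________________ < μ1 – μ2 < _________________
Interpretation:
Picture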
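A sketch of the Case 1 arithmetic, using the pooled variance and the small-sample data (s1² = 12, s2² = 8, n1 = 18, n2 = 14). The critical value t0.025,30 = 2.042 is entered by hand from a standard t table, and the function name `pooled_t_interval` is invented for this illustration.

```python
import math

def pooled_t_interval(xbar1, xbar2, s1_sq, s2_sq, n1, n2, t_crit):
    """CI for mu1 - mu2 assuming equal but unknown variances (pooled estimate)."""
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    half_width = t_crit * math.sqrt(sp_sq) * math.sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return sp_sq, diff - half_width, diff + half_width

T_0025_30 = 2.042  # t_{0.025, 30} from a standard t table (entered by hand)

sp_sq, pooled_lower, pooled_upper = pooled_t_interval(12, 10, 12, 8, 18, 14, T_0025_30)
print(f"Sp^2 = {sp_sq:.4f}")
print(f"{pooled_lower:.4f} < mu1 - mu2 < {pooled_upper:.4f}")
```

With the smaller samples the interval is much wider than in the 100-day version of the problem, so it is worth checking again whether 0 falls inside it.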
ETM 620 - 09U38
Your turn …
Solve Case 2 (assuming variances are not equal)
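For checking a hand solution afterwards, here is one possible sketch of the Case 2 calculation. It assumes the degrees of freedom ν are rounded down to a whole number and that t0.025,29 = 2.045 is read from a standard t table; both choices, and the function names, are assumptions of this illustration.

```python
import math

def welch_degrees_of_freedom(s1_sq, s2_sq, n1, n2):
    """Approximate degrees of freedom when the two variances are not assumed equal."""
    v1, v2 = s1_sq / n1, s2_sq / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

def unequal_variance_interval(xbar1, xbar2, s1_sq, s2_sq, n1, n2, t_crit):
    """CI for mu1 - mu2 without pooling the sample variances."""
    half_width = t_crit * math.sqrt(s1_sq / n1 + s2_sq / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width

nu = welch_degrees_of_freedom(12, 8, 18, 14)
nu_rounded = math.floor(nu)  # rounding down is the conservative convention assumed here

T_0025_29 = 2.045  # t_{0.025, 29} from a standard t table (entered by hand)
welch_lower, welch_upper = unequal_variance_interval(12, 10, 12, 8, 18, 14, T_0025_29)

print(f"nu = {nu:.2f} (use {nu_rounded})")
print(f"{welch_lower:.4f} < mu1 - mu2 < {welch_upper:.4f}")
```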
ETM 620 - 09U39
Paired Observations
Suppose we are evaluating observations that are not independent …
  For example, suppose a teacher wants to compare results of a pretest and posttest administered to the same group of students.
Paired-observation or Paired-sample test …
  Example: murder rates in two consecutive years for several US cities (see attached). Construct a 90% confidence interval around the difference in consecutive years.
ETM 620 - 09U40
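Solution
d̄ = ____________
$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}}$$
s_d = _________
tα/2, n-1 = _____________
a (1-α)100% CI for μD is:
$$\bar{d} - t_{\alpha/2,\,n-1}\left(\frac{s_d}{\sqrt{n}}\right) < \mu_D < \bar{d} + t_{\alpha/2,\,n-1}\left(\frac{s_d}{\sqrt{n}}\right)$$
__________________ < μ1 – μ2 < _________________
Interpretation:
Picture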
ETM 620 - 09U41
C. I. for the ratio of two variances
If X1 and X2 are independent normal random variables with unknown and unequal means and variances, then the confidence interval on the ratio σ1²/σ2² is given by:
$$\frac{S_1^2}{S_2^2}\, F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2}\, F_{\alpha/2,\,n_2-1,\,n_1-1}$$
Note: for F-values not given in table V, recall that
$$F_{1-\alpha/2,\,n_2-1,\,n_1-1} = \frac{1}{F_{\alpha/2,\,n_1-1,\,n_2-1}}$$
or use =FINV(probability,degrees_freedom1,degrees_freedom2)
ETM 620 - 09U42 ETM 620 - 09U42
Example 10-22
n1 = 12, s1 = 0.85
n2 = 15, s2 = 0.98
$$\frac{S_1^2}{S_2^2}\, F_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2}\, F_{\alpha/2,\,n_2-1,\,n_1-1}$$
F____ , ____ , ____= ____________
F____ , ____ , ____= ____________
__________________ < σ1²/σ2² < _________________
Interpretation:
Picture