Random samples and estimation


Transcript of Random samples and estimation

Page 1: Random samples and estimation

ETM 620 - 09U

Random samples and estimation

Chapter 9: Random samples & sampling distributions
    Samples and populations
    χ², t, and F distributions

Chapter 10: Parameter estimation
    Point estimation
    Standard error of a statistic
    Method of maximum likelihood
    Method of moments
    One-sample and two-sample confidence interval estimation

Foundation for understanding the next few chapters

Page 2: Random samples and estimation

Ch. 9: Populations and samples

Population: “a group of individual persons, objects, or items from which samples are taken for statistical measurement”

Sample: “a finite part of a statistical population whose properties are studied to gain information about the whole”

(Merriam-Webster Online Dictionary, http://www.m-w.com/, October 5, 2004)

Page 3: Random samples and estimation

Examples

Population

Students pursuing graduate engineering degrees

Cars capable of speeds in excess of 160 mph.

Potato chips produced at the Frito-Lay plant in Kathleen

Freshwater lakes and rivers

Samples

In general, (x1, x2, x3, …, xn) is a random sample of size n if:

the x’s are independent random variables

every observation has the same probability distribution (each is drawn from the same population)

Page 4: Random samples and estimation

Sampling distributions

If we conduct the same experiment several times with the same sample size, the probability distribution of the resulting statistic is called a sampling distribution.

Sampling distribution of the mean: if n observations are taken from a normal population with mean μ and variance σ², then:

$$\mu_{\bar{x}} = \frac{\mu + \mu + \dots + \mu}{n} = \mu$$

$$\sigma^2_{\bar{x}} = \frac{\sigma^2 + \sigma^2 + \dots + \sigma^2}{n^2} = \frac{\sigma^2}{n}$$

Page 5: Random samples and estimation

An important consideration …

$\bar{x}$ will be different for every sample.

For example, suppose the time to complete a typical homework problem, in minutes, is known to be uniformly distributed between 5 and 25. Four people are asked to record the time it takes them to complete each of 31 different problems.
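The homework-time setup can be checked numerically. The sketch below simulates 31 problems timed by four people, with times uniform on 5 to 25 minutes, and compares the spread of the individual times with the spread of the 31 sample means. The function name `simulate`, the seed, and the printed layout are illustrative assumptions, not part of the original slides.

```python
import random
import statistics

def simulate(n_problems=31, n_people=4, low=5.0, high=25.0, seed=620):
    """Return (individual times, per-problem means) for one simulated data set."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rows = [[rng.uniform(low, high) for _ in range(n_people)]
            for _ in range(n_problems)]
    individual = [t for row in rows for t in row]
    means = [statistics.mean(row) for row in rows]
    return individual, means

# Theoretical values for a uniform(5, 25) population
mu = (5 + 25) / 2            # population mean
var = (25 - 5) ** 2 / 12     # population variance, (b - a)^2 / 12
var_of_mean = var / 4        # sigma^2 / n with n = 4 people per problem

individual, means = simulate()
print("individual times: mean %.2f, variance %.2f"
      % (statistics.mean(individual), statistics.variance(individual)))
print("sample means:     mean %.2f, variance %.2f"
      % (statistics.mean(means), statistics.variance(means)))
print("theory:           mu %.2f, sigma^2 %.2f, sigma^2/n %.2f"
      % (mu, var, var_of_mean))
```

With only 31 simulated problems the observed variances will wander around the theoretical values; the point is that the sample means cluster around μ with a variance close to σ²/n rather than σ².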

Page 6: Random samples and estimation


Individual data points

μ = __________________

σ2 = _________________

σ = __________________

Problem #   Person 1   Person 2   Person 3   Person 4
 1          12.64       7.01      16.93      22.98
 2          22.69      24.17       5.29      13.15
 3          22.26       7.77       9.90       5.91
 4           5.65       8.28       9.39       5.34
 5          10.70      11.86      16.07      12.15
 6          12.44      12.11      23.21      14.32
 7          13.52      11.08      24.51      21.13
 8          24.82      10.13      24.03       6.07
 9          19.10      21.33      24.45      14.33
10          11.00      20.00      12.03      20.51
11           6.49       8.97       6.28      12.17
12          14.74      15.22      12.47      24.72
13           5.81       9.61       5.10      23.52
14           7.01      10.13      20.51      18.59
15          21.18      19.49       6.70       7.65
16          20.12      17.53       8.47      13.10
17          16.05      19.23      16.10       8.62
18          24.41      18.74      15.58      20.93
19          21.11      10.24       8.56      22.34
20           7.30       6.19      20.23      19.77
21          24.73      23.51      23.08      15.90
22          15.02      18.50      14.80       7.92
23           5.76      20.93      18.43      19.63
24          16.69       8.04      22.84      12.56
25           9.01       9.12      11.68      11.50
26          11.00      21.04      18.92      10.43
27          23.08       5.78      19.18      14.07
28          15.33      10.13      10.83      21.04
29          20.78      18.52      20.11      23.97
30          17.39      19.44      24.36      12.37
31          22.01      16.14      22.46      13.82

Page 7: Random samples and estimation


Sample means

$\mu_{\bar{x}}$ = __________________

$\sigma^2_{\bar{x}}$ = _________________

$\sigma_{\bar{x}}$ = __________________

Problem # 1 2 3 4 average

1 12.64 7.01 16.93 22.98 14.89

2 22.69 24.17 5.29 13.15 16.32

3 22.26 7.77 9.90 5.91 11.46

4 5.65 8.28 9.39 5.34 7.17

5 10.70 11.86 16.07 12.15 12.70

6 12.44 12.11 23.21 14.32 15.52

7 13.52 11.08 24.51 21.13 17.56

8 24.82 10.13 24.03 6.07 16.26

9 19.10 21.33 24.45 14.33 19.80

10 11.00 20.00 12.03 20.51 15.89

11 6.49 8.97 6.28 12.17 8.48

12 14.74 15.22 12.47 24.72 16.79

13 5.81 9.61 5.10 23.52 11.01

14 7.01 10.13 20.51 18.59 14.06

15 21.18 19.49 6.70 7.65 13.75

16 20.12 17.53 8.47 13.10 14.81

17 16.05 19.23 16.10 8.62 15.00

18 24.41 18.74 15.58 20.93 19.91

19 21.11 10.24 8.56 22.34 15.56

20 7.30 6.19 20.23 19.77 13.37

21 24.73 23.51 23.08 15.90 21.80

22 15.02 18.50 14.80 7.92 14.06

23 5.76 20.93 18.43 19.63 16.19

24 16.69 8.04 22.84 12.56 15.03

25 9.01 9.12 11.68 11.50 10.33

26 11.00 21.04 18.92 10.43 15.35

27 23.08 5.78 19.18 14.07 15.53

28 15.33 10.13 10.83 21.04 14.33

29 20.78 18.52 20.11 23.97 20.84

30 17.39 19.44 24.36 12.37 18.39

31 22.01 16.14 22.46 13.82 18.61


Page 8: Random samples and estimation

Central Limit Theorem

Given:

$\bar{X}$: the mean of a random sample of size n taken from a population with mean μ and finite variance σ²,

Then, the limiting form of the distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}, \quad n \to \infty,$$

is _________________________

Page 9: Random samples and estimation

Central Limit Theorem

If the population is known to be normal, the sampling distribution of $\bar{X}$ will follow a normal distribution.

Even when the distribution of the population is not normal, the sampling distribution of $\bar{X}$ is approximately normal when n is large.

NOTE: when n is not large and the population is not normal, we cannot assume the distribution of $\bar{X}$ is normal.

Page 10: Random samples and estimation

Sampling distribution of S²: χ²

Given:

Z1, Z2, …, Zk independent, normally distributed random variables, each with mean μ = 0 and standard deviation σ = 1.

Then,

$$\chi^2 = Z_1^2 + Z_2^2 + \dots + Z_k^2$$

follows a χ² distribution with k degrees of freedom and probability density function,

$$f(u) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, u^{(k/2)-1} e^{-u/2}, \quad u > 0$$

(eq. 9-15, pg. 208)

μ = k

σ² = 2k

Page 11: Random samples and estimation

χ² Distribution

$\chi^2_\alpha$ represents the χ² value above which we find an area of α, that is, for which $P(\chi^2 > \chi^2_\alpha) = \alpha$.

In Excel, =CHIDIST(x,degrees_freedom)

χ² is additive, so if $Y = \sum \chi^2_i$ (a sum of independent χ² variables), then Y is χ² with $k_Y = \sum k_i$ degrees of freedom.

Sample variance,

$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$$

Page 12: Random samples and estimation

Student’s t Distribution

If Z ~ N(0,1) and V is a chi-square random variable with k degrees of freedom, then

$$T = \frac{Z}{\sqrt{V/k}}$$

follows a t-distribution with k degrees of freedom. The probability density function is,

$$f(t) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot \frac{1}{[(t^2/k) + 1]^{(k+1)/2}}, \quad -\infty < t < \infty$$

Page 13: Random samples and estimation

t-Distribution

Example 9-7 shows that

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

follows a t distribution. In other words, the standardized sample mean follows a t distribution with n − 1 degrees of freedom when σ is not known but is estimated by s.

In Excel, =TDIST(x,degrees_freedom,tails) gives the probability associated with getting a value above x (tails = 1) or outside ±x (tails = 2). =TINV(probability,degrees_freedom) gives the t-value associated with a desired two-tailed probability, α.

Page 14: Random samples and estimation

F-Distribution

Given:

$S_1^2$ and $S_2^2$, the variances of independent random samples of size n1 and n2 taken from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$, respectively,

Then,

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2}$$

follows an F-distribution with ν1 = n1 − 1 and ν2 = n2 − 1 degrees of freedom.

Table V, pp 605-609 gives F-values associated with given α values.

In Excel, =FDIST(x,degrees_freedom1,degrees_freedom2) gives the probability associated with a given x-value, while =FINV(probability,degrees_freedom1,degrees_freedom2) gives the F-value associated with a given α.

Page 15: Random samples and estimation

Ch. 10: Parameter estimation

Example: Say we have 5 numbers from a random sample, as follows:

19, 58, 31, 44, 43

$\bar{x}$ = ____________________ is an estimate of μ

s² = _____________________ is an estimate of σ²

We want to use “good” estimators (unbiased, minimum error)

Unbiased, i.e. $E(\hat{\theta}) = \theta$ (e.g., $E(\bar{X})$ = ___, and $E(S^2)$ = __)

Minimum error: for an unbiased estimator,

$$MSE(\hat{\theta}) = E(\hat{\theta} - \theta)^2 = Var(\hat{\theta})$$
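As a check on hand calculations for the five-number example, the point estimates can be computed with Python's standard library. This is a minimal sketch; the variable names are illustrative, and the standard error line assumes the usual estimate $s/\sqrt{n}$ for the standard error of the sample mean.

```python
import statistics
from math import sqrt

data = [19, 58, 31, 44, 43]          # the five observations from the example

xbar = statistics.mean(data)         # point estimate of mu
s2 = statistics.variance(data)       # point estimate of sigma^2 (divides by n - 1)
se = sqrt(s2 / len(data))            # estimated standard error of the mean, s / sqrt(n)

print(f"xbar = {xbar}, s^2 = {s2}, standard error = {se:.2f}")
```

Note that `statistics.variance` uses the n − 1 divisor, which is what makes S² an unbiased estimator of σ²; `statistics.pvariance` would divide by n instead.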

Page 16: Random samples and estimation

Finding good estimators

Method of maximum likelihood

take a random sample of size n (x1, x2, x3, …, xn) from a distribution with density function f(x, θ)

Likelihood function, L(θ) = f(x1,θ) ∙ f(x2,θ) ∙ f(x3,θ) ∙ ∙ ∙ f(xn,θ)

Take the derivative with respect to θ and set it to 0.

See example 10-4, pg. 222

The result is not always unbiased, but it can be modified to make it so.

Method of moments

The first k moments about the origin of any function are

$$\mu'_t = E(X^t) = \int_{-\infty}^{\infty} x^t f(x; \theta_1, \theta_2, \dots, \theta_k)\,dx, \quad t = 1, 2, \dots, k$$

Equate these to the corresponding sample moments and solve for the unknown parameters.

Can produce good estimators, but sometimes not as good as MLE (for example).
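To illustrate both methods on one case (this is not the textbook's example 10-4, whose details are not reproduced in these slides), assume the sample comes from an exponential density $f(x;\lambda) = \lambda e^{-\lambda x}$, $x > 0$. The choice of distribution is an assumption made only for this sketch. Maximizing $\ln L$ instead of $L$ is a convenience; both have the same maximizer.

```latex
\begin{align*}
L(\lambda) &= \prod_{i=1}^{n} \lambda e^{-\lambda x_i}
            = \lambda^{n} e^{-\lambda \sum_{i=1}^{n} x_i} \\
\ln L(\lambda) &= n \ln \lambda - \lambda \sum_{i=1}^{n} x_i \\
\frac{d \ln L(\lambda)}{d\lambda} &= \frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0
  \quad\Longrightarrow\quad
  \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}} \\
\text{Method of moments: } \mu'_1 &= E(X) = \frac{1}{\lambda} = \bar{x}
  \quad\Longrightarrow\quad
  \hat{\lambda} = \frac{1}{\bar{x}}
\end{align*}
```

For this particular density the two methods agree; in general they need not.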

Page 17: Random samples and estimation

Interval estimation

(1 – α)100% confidence interval for the unknown parameter

For some parameter, θ (e.g., μ), we are looking for L and U such that

P{L < θ < U} = 1 – α

or _______________

or ________________

Page 18: Random samples and estimation

Single sample: Estimating the mean

Given:

σ is known and $\bar{X}$ is the mean of a random sample of size n,

Then, the (1 – α)100% confidence interval for μ is given by

$$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$

Page 19: Random samples and estimation

Example: mean with known variance

A random sample of size 25 is taken from a normal distribution with unknown mean and known variance of 4 (i.e., N(μ,4)). The sample mean, $\bar{X}$, is determined to be 13.2. What is the 90% confidence interval around the mean?
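The known-variance interval for this example can be evaluated in a few lines. The sketch below uses `statistics.NormalDist` from the Python standard library to obtain $z_{\alpha/2}$ instead of a table; the function name `z_interval` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Values from the example: n = 25, variance 4 (so sigma = 2), xbar = 13.2
low, high = z_interval(xbar=13.2, sigma=2.0, n=25, conf=0.90)
print(f"{low:.3f} < mu < {high:.3f}")
```

The interval comes out at roughly 12.54 to 13.86, which can be compared with a hand calculation using $z_{0.05} \approx 1.645$.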

Page 20: Random samples and estimation

What does this mean?

Measure of the precision of the estimate

Length of the interval is a function of
    confidence level
    variance
    sample size

Can vary n to decrease the length of the interval for the same confidence level.

For our example, suppose we want an error of 0.25 or less. Then,

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$

n = ___________________________________________
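The sample-size formula is easy to get wrong by forgetting to round up. A minimal sketch, assuming the same example values as before (σ = 2, 90% confidence) and the requested error of 0.25; `sample_size_for_mean` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, error, conf):
    """Smallest n for which the half-width z*sigma/sqrt(n) is at most `error`."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    return ceil((z * sigma / error) ** 2)     # always round up to a whole observation

n = sample_size_for_mean(sigma=2.0, error=0.25, conf=0.90)
print(n)
```

Rounding up matters here: the raw value is a little over 173, so 174 observations are needed to guarantee the error bound.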

Page 21: Random samples and estimation

What if σ² is unknown?

If n is sufficiently large (> _______), then the large sample confidence interval is:

$$\bar{X} \pm z_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$

Otherwise, must use the t-statistic …

Page 22: Random samples and estimation

Single sample estimate of the mean (σ unknown, n not large)

Given:

σ is unknown and $\bar{X}$ is the mean of a random sample of size n (where n is not large),

Then, the (1 – α)100% confidence interval for μ is given by

$$\bar{X} - t_{\alpha/2,\,n-1}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2,\,n-1}\left(\frac{s}{\sqrt{n}}\right)$$

[Figure: distribution plot; horizontal axis from -5 to 5]

Page 23: Random samples and estimation

Example

A traffic engineer is concerned about the delays at an intersection near a local school. The intersection is equipped with a fully actuated (“demand”) traffic light and there have been complaints that traffic on the main street is subject to unacceptable delays.

To develop a benchmark, the traffic engineer randomly samples 25 stop times (in seconds) on a weekend day. The average of these times is found to be 13.2 seconds, and the sample variance, s², is found to be 4 seconds².

Based on this data, what is the 95% confidence interval (C.I.) around the mean stop time during a weekend day?

Page 24: Random samples and estimation

Example (cont.)

$\bar{X}$ = ______________ s = _______________

α = ________________ α/2 = _____________

$t_{0.025,24}$ = _____________

__________________ < μ < ___________________
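One way to check the traffic example is sketched below. Python's standard library has no t quantile function, so the critical value is entered by hand: 2.064 is the usual table value for $t_{0.025,24}$ and is treated here as an assumed input. The helper name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Two-sided CI for the mean when sigma is unknown; t_crit = t_{alpha/2, n-1}."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

T_0025_24 = 2.064   # t_{0.025,24} read from a standard t table (assumed input)

# Traffic example: n = 25 stop times, xbar = 13.2 s, s^2 = 4 s^2 (so s = 2 s)
low, high = t_interval(xbar=13.2, s=sqrt(4), n=25, t_crit=T_0025_24)
print(f"{low:.2f} < mu < {high:.2f}")
```

Compared with a z-based interval on the same numbers, the t interval is slightly wider, which reflects the extra uncertainty from estimating σ with s.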

Page 25: Random samples and estimation

C.I. on the variance

Given that

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$

is ~ χ² with n − 1 degrees of freedom, then,

$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$$

gives the 100(1-α)% two-sided confidence interval on the variance.
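The slides give no worked example for the variance interval, so the sketch below borrows the traffic data (n = 25, s² = 4) purely as an assumed illustration. The χ² critical values for 24 degrees of freedom are standard table values entered by hand, and `variance_interval` is a hypothetical helper name.

```python
def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Two-sided CI on sigma^2.

    chi2_upper = chi^2_{alpha/2, n-1}   (the larger table value)
    chi2_lower = chi^2_{1-alpha/2, n-1} (the smaller table value)
    """
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# Table values for 24 degrees of freedom at 95% confidence (assumed inputs)
CHI2_0025_24 = 39.364
CHI2_0975_24 = 12.401

low, high = variance_interval(s2=4.0, n=25,
                              chi2_upper=CHI2_0025_24, chi2_lower=CHI2_0975_24)
print(f"{low:.2f} < sigma^2 < {high:.2f}")
```

Note that the larger critical value produces the lower limit, and that the interval is not symmetric about s² because the χ² distribution is skewed.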

Page 26: Random samples and estimation

Confidence interval on a proportion

The proportion, P, in a binomial experiment may be estimated by

$$\hat{P} = \frac{X}{n}$$

where X is the number of successes in n trials.

For a sample, the point estimate of the parameter is

$$\hat{p} = \frac{x}{n}$$

The mean for the sample proportion is $\mu_{\hat{p}} = p$ and the variance is

$$\sigma^2_{\hat{p}} = \frac{pq}{n}, \quad q = 1 - p$$

Page 27: Random samples and estimation

C.I. for proportions

An approximate (1-α)100% confidence interval for p is:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

Large-sample C.I. for p1 – p2 is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

Interpretation: _______________________________

Page 28: Random samples and estimation

Example 10.17 (pg. 240)

n = 75 x = 12

$\hat{p}$ = ____________

$z_{0.025}$ = ________

Picture:

C.I.:

Interpretation: ____________________________________
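The approximate interval for a proportion can be checked with the values given for this example (n = 75, x = 12) at 95% confidence, which is the level implied by $z_{0.025}$. A minimal sketch using only the standard library; `proportion_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, conf):
    """Approximate (large-sample) CI for a binomial proportion."""
    p_hat = x / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat, p_hat - half_width, p_hat + half_width

p_hat, low, high = proportion_interval(x=12, n=75, conf=0.95)
print(f"p_hat = {p_hat:.2f}, {low:.3f} < p < {high:.3f}")
```

The interval runs from about 0.077 to about 0.243 around the point estimate of 0.16.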

Page 29: Random samples and estimation

Setting the sample size …

If the estimate for p from an initial sample seems pretty reliable, then

$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 \hat{p}(1 - \hat{p})$$

e.g., for our example if we want to be 95% confident that the error in our estimate is less than 0.05, then

n = __________________

If we’re not at all sure how to estimate p, then assume p = 0.5 and use

$$n = 0.25\left(\frac{z_{\alpha/2}}{E}\right)^2$$
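Both sample-size rules can be compared side by side. The sketch below assumes that "our example" refers to the preceding proportion example, so it uses $\hat{p} = 12/75$ as the initial estimate; that linkage and the helper name `sample_size_for_proportion` are assumptions.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(error, conf, p_hat=0.5):
    """Sample size so the half-width is at most `error`.

    p_hat defaults to 0.5, the conservative choice that maximizes p(1 - p).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    return ceil((z / error) ** 2 * p_hat * (1 - p_hat))

n_with_estimate = sample_size_for_proportion(error=0.05, conf=0.95, p_hat=12 / 75)
n_conservative = sample_size_for_proportion(error=0.05, conf=0.95)
print(n_with_estimate, n_conservative)
```

The conservative rule asks for nearly twice as many observations (385 versus 207 here), which is the price of not trusting an initial estimate of p.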

Page 30: Random samples and estimation

Example: comparing 2 proportions

Look at example 10-23, pg. 250

1. C.I. = (-0.07, 0.15). The interval contains 0, therefore there is no reason to believe there is a significant decrease in the proportion of defectives using the new process.

2. What if the interval were (+0.07, 0.15)?

3. What if the interval were (-0.9, -0.7)?

Page 31: Random samples and estimation

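Difference in 2 means, both σ² known

Given two independent random samples, a point estimate of the difference between μ1 and μ2 is given by the statistic

$$\bar{x}_1 - \bar{x}_2$$

We can build a confidence interval for μ1 - μ2 (given $\sigma_1^2$ and $\sigma_2^2$ known) as follows:

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$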

Page 32: Random samples and estimation

An example

A farm equipment manufacturer wants to compare the average daily downtime of two sheet-metal stamping machines located in two different factories. Investigation of company records for 100 randomly selected days on each of the two machines gave the following results:

$\bar{x}_1$ = 12 minutes, $\bar{x}_2$ = 10 minutes

$\sigma_1^2$ = 12, $\sigma_2^2$ = 8

n1 = n2 = 100

Construct a 95% C.I. for μ1 – μ2

Page 33: Random samples and estimation

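Solution

Picture

α/2 = _____________

z_____ = ____________

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

__________________ < μ1 – μ2 < _________________

Interpretation: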
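The stamping-machine example (means 12 and 10 minutes, known variances 12 and 8, 100 days each) can be checked with a short script. This is a sketch under the slide's assumption that both variances are known; `two_mean_z_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def two_mean_z_interval(x1, x2, var1, var2, n1, n2, conf):
    """CI for mu1 - mu2 when both population variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)    # z_{alpha/2}
    half_width = z * sqrt(var1 / n1 + var2 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width

low, high = two_mean_z_interval(x1=12, x2=10, var1=12, var2=8,
                                n1=100, n2=100, conf=0.95)
print(f"{low:.3f} < mu1 - mu2 < {high:.3f}")
```

Because the whole interval lies above 0 (roughly 1.12 to 2.88 minutes), the data suggest machine 1 has the larger mean downtime at this confidence level.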

Page 34: Random samples and estimation

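Differences in 2 means, σ² unknown

Case 1: $\sigma_1^2$ and $\sigma_2^2$ unknown but equal

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Where,

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$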

Page 35: Random samples and estimation

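Differences in 2 means, σ² unknown

Case 2: $\sigma_1^2$ and $\sigma_2^2$ unknown and not equal

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Where,

$$\nu = \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$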
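The degrees-of-freedom expression for the unequal-variance case is the part most prone to arithmetic slips, so a small helper is sketched here. It assumes the n − 1 form of the approximation (often called Welch-Satterthwaite); the input values are hypothetical and chosen only for illustration.

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom when the two variances are not assumed equal."""
    a = s1_sq / n1          # S1^2 / n1
    b = s2_sq / n2          # S2^2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Hypothetical inputs, not taken from the slides
nu = welch_df(s1_sq=4.0, n1=10, s2_sq=9.0, n2=12)
print(f"nu = {nu:.2f}")
```

The result is usually not a whole number; a common convention when reading a t table is to round it down to the next integer, which keeps the interval slightly conservative.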

Page 36: Random samples and estimation

Example, σ² unknown

Suppose the farm equipment manufacturer was unable to gather data for 100 days. Using the data they were able to gather, they would still like to compare the downtime for the two machines. The data they gathered is as follows:

$\bar{x}_1$ = 12 minutes, $\bar{x}_2$ = 10 minutes

$s_1^2$ = 12, $s_2^2$ = 8

n1 = 18, n2 = 14

Construct a 95% C.I. for μ1 – μ2 assuming:

1. $\sigma_1^2$ and $\sigma_2^2$ unknown but equal

2. $\sigma_1^2$ and $\sigma_2^2$ unknown and not equal

Page 37: Random samples and estimation

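Solution: Case 1

Picture

$\bar{x}_1 - \bar{x}_2$ = _____________

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$ = _____________________

t____ , ________= ____________

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

__________________ < μ1 – μ2 < _________________

Interpretation: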
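A hand solution for Case 1 can be checked with the sketch below, which uses the small-sample data (means 12 and 10, sample variances 12 and 8, n1 = 18, n2 = 14). The critical value 2.042 is the usual table entry for $t_{0.025,30}$ and is entered by hand as an assumed input; the function names are illustrative.

```python
from math import sqrt

def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pooled estimate S_p^2 of the common variance."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def pooled_t_interval(x1, x2, s1_sq, s2_sq, n1, n2, t_crit):
    """CI for mu1 - mu2 with equal but unknown variances; t_crit = t_{alpha/2, n1+n2-2}."""
    sp = sqrt(pooled_variance(s1_sq, n1, s2_sq, n2))
    half_width = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width

T_0025_30 = 2.042   # t_{0.025,30} from a standard t table (assumed input)

sp_sq = pooled_variance(12, 18, 8, 14)
low, high = pooled_t_interval(12, 10, 12, 8, 18, 14, T_0025_30)
print(f"Sp^2 = {sp_sq:.3f}; {low:.3f} < mu1 - mu2 < {high:.3f}")
```

Unlike the 100-day version of this example, the interval now includes 0, so with the smaller samples the data no longer show a clear difference in mean downtime.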

Page 38: Random samples and estimation

Your turn …

Solve Case 2 (assuming variances are not equal)

Page 39: Random samples and estimation

Paired Observations

Suppose we are evaluating observations that are not independent …

For example, suppose a teacher wants to compare results of a pretest and posttest administered to the same group of students.

Paired-observation or Paired-sample test …

Example: murder rates in two consecutive years for several US cities (see attached). Construct a 90% confidence interval around the difference in consecutive years.

Page 40: Random samples and estimation

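Solution

Picture

$\bar{d}$ = ____________

$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}}$$ = _________

$t_{\alpha/2,\,n-1}$ = _____________

a (1-α)100% CI for $\mu_D$ is:

$$\bar{d} - t_{\alpha/2,\,n-1}\left(\frac{s_d}{\sqrt{n}}\right) < \mu_D < \bar{d} + t_{\alpha/2,\,n-1}\left(\frac{s_d}{\sqrt{n}}\right)$$

__________________ < μ1 – μ2 < _________________

Interpretation: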

Page 41: Random samples and estimation

C. I. for the ratio of two variances

If X1 and X2 are independent normal random variables with unknown and unequal means and variances, then the confidence interval on the ratio $\sigma_1^2/\sigma_2^2$ is given by:

$$\frac{S_1^2}{S_2^2}\,F_{1-\alpha/2,\,n_2-1,\,n_1-1} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{S_1^2}{S_2^2}\,F_{\alpha/2,\,n_2-1,\,n_1-1}$$

Note: for F-values not given in table V, recall that

$$F_{1-\alpha/2,\,n_2-1,\,n_1-1} = \frac{1}{F_{\alpha/2,\,n_1-1,\,n_2-1}}$$

or use =FINV(probability,degrees_freedom1,degrees_freedom2)

Page 42: Random samples and estimation

Example 10-22

Picture

n1 = 12, s1 = 0.85

n2 = 15, s2 = 0.98

F____ , ____ , ____= ____________

F____ , ____ , ____= ____________

$$\frac{S_1^2}{S_2^2}\,F_{1-\alpha/2,\,n_2-1,\,n_1-1} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{S_1^2}{S_2^2}\,F_{\alpha/2,\,n_2-1,\,n_1-1}$$

__________________ < $\sigma_1^2/\sigma_2^2$ < _________________

Interpretation: