Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture...

17
Section 6.2 Standard error of ¯ Y Sections 6.1, 6.2 and 6.3 Elementary Statistics for the Biological and Life Sciences (Stat 205) 1 / 17

Transcript of Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture...

Page 1: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Sections 6.1, 6.2 and 6.3

Elementary Statistics for the Biological and Life Sciences

(Stat 205)

1 / 17

Page 2: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Example 6.1.1 Butterfly wings

n = 14 male Monarch butterflies were measured for wing area(Oceano Dunes State Park, California).

y = 32.81 cm2 and s = 2.48 cm2 estimate µ and σ, the mean andstandard deviation of all Monarch butterfly wing areas fromOceano Dunes.

How good are these estimates?

2 / 17

Page 3: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

6.2 Standard error of Y

Recall σY = σ√n

.

We will usually not know σ (if we don’t know µ, how can weknow σ?)

Simply plug in s for σ.

The standard error of the mean is

SEY =s√n.

For the butterfly wings, SEY = s√n

= 2.48√14

= 0.66 cm2.

The standard error SEY gives the variability of Y ; thestandard deviation s gives the variability in the data itself.

3 / 17

Page 4: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Example 6.2.2

Geneticist weighs n = 28 female Rambouillet lambs at birth, allborn in April, all single births.

y = 5.17 kg estimates µ, the population mean.

s = 0.65 kg estimates the spread in the sample.

SEY = s√n

= 0.65√28

= 0.12 kg estimates how variable y is, i.e.

how “close” we can expect y to be to µ.

4 / 17

Page 5: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Increasing n sampling from lamb birthweight population

5 / 17

Page 6: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Confidence interval, known σ, formal derivation

Say we know σ (for now) and the data are normal. Then

Y ∼ N (µ, σY ) = N

(µ,

σ√n

).

The (1− α)100% confidence interval for µ is:[Y − zα

2

σ√n,Y + zα

2

σ√n

]

We can standardize Y to get

Z =Y − µσ/√n.

6 / 17

Page 7: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Confidence interval, known σ, formal derivation

We can show Pr{−zα2≤ Z ≤ zα

2} = 1− α. Then

1− α = Pr{−zα2≤ Z ≤ zα

2}

= Pr

{−zα

2≤

Y − µσ/√n≤ zα

2

}= Pr

{−zα

2

σ√n≤ Y − µ ≤ zα

2

σ√n

}= Pr

{Y − zα

2

σ√n≤ µ ≤ Y + zα

2

σ√n

}

7 / 17

Page 8: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Confidence interval

Y ± 1.645 σ√n

is a 90% probability interval for µ.

Y ± 1.96 σ√n

is a 95% probability interval for µ.

Y ± 2.575 σ√n

is a 99% probability interval for µ.

We don’t always know σY = σ√n

, but we do know SEY = s√n

.

What Y−µSEY

is distributed as?

8 / 17

Page 9: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Estimating σ by s gives a t distribution

If Y has normal distribution, then Y−µSEY

has a Student’s t

distribution with n − 1 degrees of freedom.

The students t distribution looks like a standard normal, buthas fatter tails to account for extra variability in estimatingσY = σ√

nby SEY = s√

n.

Using R, command t.test(data) gives a 95% CI for µ.

For small sample sizes (n < 30, say), data need to beapproximately normal, otherwise the central limit theoremkicks in.

9 / 17

Page 10: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Student’s t curves (df=1,2,3,5,10 & 30), and normal curve

10 / 17

Page 11: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Definition of critical value tα2 ,n−1

We replace “zα2

” (from a normal) by the equivalent t distributionvalue, denoted tα

2,n−1. Table of these on back inside cover.

11 / 17

Page 12: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Example 6.3.1 butterfly data

Wing area of n = 14 male Monarch butterly wings at OceanoDunes in California.

This is a small sample size (n < 30). We need to check if the dataare normal to trust the confidence interval; the histogram looksroughly bell-shaped and the normal probability plot looksreasonably straight.

Page 13: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Confidence interval in R using t.test

> butterfly=c(33.9,33.0,30.6,36.6,36.5,34.0,36.1,32.0,28.0,32.0,32.2,32.3,32.3,30.0)

> par(mfrow=c(1,2))

> hist(butterfly)

> qqnorm(butterfly)

> t.test(butterfly)

One Sample t-test

data: butterfly

t = 49.6405, df = 13, p-value = 3.292e-16

alternative hypothesis: true mean is not equal to 0

95 percent confidence interval:

31.39303 34.24983

sample estimates:

mean of x

32.82143

The part we care about right now is just

95 percent confidence interval:

31.39303 34.24983

We are 95% confident that the true population mean wing area isbetween 31.4 and 34.2 cm2.

13 / 17

Page 14: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Other confidence levels

Sometimes people want a 90% CI or a 99% CI. As confidencegoes up, the interval must become wider. To be moreconfident that the mean is in the interval, we need to includemore plausible values.

The corresponding multipliers are t0.05, t0.025, and t0.005 for90%, 95%, and 99% CI’s, respectively. These are in the tableon the inside cover of the back of your book if you construct aCI by hand.

In R, use t.test(data,conf.level=0.90) for a 90% test CIt.test(data,conf.level=0.99) for 99% CI.

> t.test(butterfly,conf.level=0.9)

90 percent confidence interval:

31.65052 33.99234

> t.test(butterfly)

95 percent confidence interval:

31.39303 34.24983

> t.test(butterfly,conf.level=0.99)

99 percent confidence interval:

30.82976 34.81309

14 / 17

Page 15: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Interpretation of CI

If we did a meta-experiment and collected samples of size nrepeatedly and formed 95% CI’s, approximately 95 in 100would cover µ.

Increasing n only makes the intervals smaller; still 95% of theCI’s would cover µ.

However, we only get to see one of these intervals, because weonly conduct one study.

Interpretation is important. “With 95% confidence the truemean of population characterstic is between a and b

units .”

15 / 17

Page 16: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Blue Jay bill length n = 5

Meta-experiment for Blue Jay bill length where µ = 25.4 mm &σ = 0.08 mm.

16 / 17

Page 17: Sections 6.1, 6.2 and 6 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L11-Sec6_2to6_3.pdfdistribution with n 1 degrees of freedom. The students t distribution

Section 6.2 Standard error of Y

Eggshell thickness n = 20

Meta-experiment for Blue Jay bill length where µ = 25.4 mm &σ = 0.08 mm.

17 / 17