Economics 105: Statistics

• Any questions?
• Review #1
• GH 9 and GH 10 due on Wednesday

Confidence Intervals

• Population Mean
– σ Known
– σ Unknown

• Population Proportion

Confidence Interval for μ (σ Unknown)

• If the population standard deviation σ is unknown, we can substitute the sample standard deviation, sx

• This introduces extra uncertainty, since sx varies from sample to sample

• Use the t distribution instead of the normal distribution

Student’s t distribution

• William Sealy Gosset was an English statistician who worked for the Guinness Brewery in Dublin in the early 1900s. He was interested in the effects of various ingredients and temperature on beer, but had only a few batches of each “formula” to analyze. Thus, he needed a way to treat SMALL SAMPLES correctly in statistical analysis.

• He was not supposed to be publishing, so he used the pseudonym “Student”

Student’s t distribution

• The t is a family of distributions

• The shape depends on degrees of freedom (d.f.)
– Number of observations that are free to vary after the sample mean has been calculated

d.f. = n - 1

Student’s t distribution

[Figure: density curves centered at 0 for the standard normal (t with df = ∞), t (df = 13), and t (df = 5)]

• t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal

Note: t → Z as n increases
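The "fatter tails" claim can be checked numerically. The sketch below is only an illustration: it evaluates the standard t density formula at a few points and compares it with the standard normal density. The function names `t_pdf` and `normal_pdf` are made up for this example and are not part of the course materials.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Compare the curves at the center (x = 0) and out in the tail (x = 3).
for x in (0.0, 3.0):
    print(
        f"x = {x}: t(df=5) = {t_pdf(x, 5):.4f}, "
        f"t(df=13) = {t_pdf(x, 13):.4f}, normal = {normal_pdf(x):.4f}"
    )
```

At x = 3 the t densities are larger than the normal density (fatter tails), while at x = 0 they are smaller, and the gap narrows as df grows.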

Confidence Interval for μ (σ Unknown) (continued)

• Assumptions
– Population standard deviation is unknown
– Population is normally distributed

• Use Student’s t distribution

• Confidence Interval Estimate:

$\bar{X} \pm t_{n-1} \frac{s_x}{\sqrt{n}}$

(where $t_{n-1}$ is the critical value of the t distribution with n - 1 degrees of freedom and an area of α/2 in each tail)

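Student’s t Table

        Upper Tail Area
df      .25      .10      .05
1      1.000    3.078    6.314
2      0.816    1.886    2.920
3      0.765    1.638    2.353

The body of the table contains t values, not probabilities

Let: n = 3, df = n - 1 = 2, α = 0.10, α/2 = 0.05, so t = 2.920

[Figure: t density centered at 0 with an upper-tail area of α/2 = 0.05 to the right of t = 2.920]

Excel check:
=TINV(0.1,2) returns 2.91998558
=TDIST(2.92,2,1) returns 0.049999578
=TDIST(A3,2,1) returns 0.05 (cell A3 presumably holds the TINV result)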
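The df = 2 row can be reproduced without a table, because the t distribution with 2 degrees of freedom happens to have a closed-form CDF, F(t) = 1/2 + t / (2√(t² + 2)). The sketch below relies on that special case only; it is not a general replacement for TINV or TDIST, and the function names are hypothetical.

```python
import math


def t2_upper_tail(t: float) -> float:
    """P(T > t) for Student's t with 2 degrees of freedom (closed form)."""
    return 0.5 - t / (2 * math.sqrt(t * t + 2))


def t2_critical(upper_tail_area: float) -> float:
    """Value t with P(T > t) = upper_tail_area, for 2 degrees of freedom."""
    p = 1 - upper_tail_area  # cumulative probability to the left of t
    return (2 * p - 1) / math.sqrt(2 * p * (1 - p))


# Upper-tail areas from the table header.
for area in (0.25, 0.10, 0.05):
    print(f"upper tail {area}: t = {t2_critical(area):.4f}")

# Counterpart of =TDIST(2.92,2,1): upper-tail area beyond t = 2.92.
print(f"P(T > 2.92) = {t2_upper_tail(2.92):.9f}")
```

For other degrees of freedom no such simple formula exists, which is why tables or spreadsheet functions are used in practice.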

t distribution values

With comparison to the Z value

Confidence    t            t            t            Z
Level         (10 d.f.)    (20 d.f.)    (30 d.f.)

0.80          1.372        1.325        1.310        1.28
0.90          1.812        1.725        1.697        1.645
0.95          2.228        2.086        2.042        1.96
0.99          3.169        2.845        2.750        2.58

Note: t → Z as n increases

Confidence Intervals for μ

• A manufacturer produces bags of flour whose weights are normally distributed. A random sample of 25 bags was taken and their mean weight was 19.8 ounces with a sample standard deviation of 1.2 ounces.

• Find and interpret a 99% confidence interval for the true average weight for all bags of flour produced by the company.
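One way to work the flour example is sketched below. It applies the interval X̄ ± t · s/√n with n - 1 = 24 degrees of freedom. The critical value 2.7969 (upper-tail area 0.005, 24 d.f.) is not given on the slides; it is taken here from a standard t table and should be treated as an assumption of this sketch, as should the function name.

```python
import math


def t_confidence_interval(mean: float, s: float, n: int, t_crit: float):
    """Return (low, high) for mean +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


# Assumed table value: t with 24 d.f. and 0.005 in each tail (99% confidence).
T_CRIT_24_DF_99 = 2.7969

low, high = t_confidence_interval(mean=19.8, s=1.2, n=25, t_crit=T_CRIT_24_DF_99)
print(f"99% CI for the mean weight: {low:.3f} to {high:.3f} ounces")
```

Under these assumptions the interval runs from about 19.13 to 20.47 ounces, so we would be 99% confident that the true average bag weight lies in that range.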

Confidence Intervals

• Population Mean
– σ Known
– σ Unknown

• Population Proportion

Confidence Intervals for π

• A random sample of 100 people shows that 25 are left-handed.

• Form a 95% confidence interval for the true proportion of left-handers.

Confidence Intervals for π (continued)

$p = 25/100 = 0.25$

$p \pm Z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}} = 0.25 \pm 1.96 \sqrt{\frac{0.25(0.75)}{100}} = 0.25 \pm 0.0849$

That is, from 0.1651 to 0.3349.

Interpretation

• We are 95% confident that the true percentage of left-handers in the population is between 16.51% and 33.49%.

• Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed in repeated samples of size 100 in this manner are expected to contain the true proportion.
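The repeated-sampling interpretation can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that the true proportion is 0.25; it draws many samples of size 100, forms the interval p ± 1.96·√(p(1 - p)/n) each time, and counts how often the interval contains the true value. All names and the seed are hypothetical choices.

```python
import math
import random


def proportion_ci(successes: int, n: int, z: float = 1.96):
    """Interval p +/- z * sqrt(p(1-p)/n) for a sample proportion."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


def coverage(true_pi: float, n: int, reps: int, seed: int = 0) -> float:
    """Fraction of simulated intervals that contain true_pi."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # One sample of size n: count the "left-handers".
        successes = sum(rng.random() < true_pi for _ in range(n))
        low, high = proportion_ci(successes, n)
        if low <= true_pi <= high:
            hits += 1
    return hits / reps


print(f"Estimated coverage: {coverage(true_pi=0.25, n=100, reps=2000):.3f}")
```

The estimated coverage should land near 0.95, though not exactly on it, since the interval is based on a normal approximation and the simulation itself has sampling noise.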

Confidence Intervals for π (continued)

• A random sample of 1000 people shows that 250 are left-handed. Form a 95% confidence interval for the true proportion of left-handers.
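A minimal sketch of this calculation, next to the n = 100 case for comparison, is shown below. It uses the same interval p ± Z·√(p(1 - p)/n) with Z = 1.96; the function name is hypothetical.

```python
import math


def proportion_interval(successes: int, n: int, z: float = 1.96):
    """Return (low, high, margin) for p +/- z * sqrt(p(1-p)/n)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin, margin


# Same sample proportion (0.25), two different sample sizes.
small = proportion_interval(25, 100)
large = proportion_interval(250, 1000)

print(f"n = 100:  {small[0]:.4f} to {small[1]:.4f} (margin {small[2]:.4f})")
print(f"n = 1000: {large[0]:.4f} to {large[1]:.4f} (margin {large[2]:.4f})")
```

With ten times the sample size the margin of error shrinks by a factor of √10, giving an interval of roughly 0.2232 to 0.2768 instead of 0.1651 to 0.3349.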

Determining Sample Size

• For the Mean

• For the Proportion

Sampling Error

• The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence (1 - α)

• The margin of error is also called sampling error
– the amount of imprecision in the estimate of the population parameter
– the amount added and subtracted to the point estimate to form the confidence interval


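Determining Sample Size

For the Mean

$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Sampling error (margin of error):

$e = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Now solve for n to get

$n = \frac{Z_{\alpha/2}^{2} \sigma^{2}}{e^{2}}$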


• To determine the required sample size for the mean, you must know:

– The desired level of confidence (1 - α), which determines the critical Z value

– The acceptable sampling error, e

– The standard deviation, σ

Required Sample Size Example

If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?

$n = \frac{Z_{\alpha/2}^{2} \sigma^{2}}{e^{2}} = \frac{(1.645)^{2}(45)^{2}}{5^{2}} = 219.19$

So the required sample size is n = 220 (always round up)
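The same calculation can be sketched in code. The version below takes the critical Z value from Python's standard-library `statistics.NormalDist` rather than from a table, so Z is 1.6449 instead of the rounded 1.645; the function name is a hypothetical choice.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma: float, e: float, confidence: float) -> int:
    """Smallest n with Z * sigma / sqrt(n) <= e, i.e. n = (Z * sigma / e)^2 rounded up."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical Z with alpha/2 in each tail
    return math.ceil((z * sigma / e) ** 2)


print(sample_size_for_mean(sigma=45, e=5, confidence=0.90))  # 220
```

Rounding up rather than to the nearest integer guarantees that the margin of error does not exceed e.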

If σ is unknown

• If unknown, σ can be estimated when using the required sample size formula

–Use a value for σ that is expected to be at least as large as the true σ

–Select a pilot sample and estimate σ with the sample standard deviation, s



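Determining Sample Size

For the Proportion

$e = Z_{\alpha/2} \sqrt{\frac{\pi(1-\pi)}{n}}$

Now solve for n to get

$n = \frac{Z_{\alpha/2}^{2}\, \pi(1-\pi)}{e^{2}}$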

Determining Sample Size

• To determine the required sample size for the proportion, you must know:

– The desired level of confidence (1 - α), which determines the critical Z value

– The acceptable sampling error, e

– The true proportion of “successes”, π

• π can be estimated with a pilot sample, if necessary (or conservatively use π = 0.5)


How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?

(Assume a pilot sample yields p = 0.12)


Solution:

For 95% confidence, use Z = 1.96

e = 0.03

p = 0.12, so use this to estimate π

$n = \frac{Z_{\alpha/2}^{2}\, \pi(1-\pi)}{e^{2}} = \frac{(1.96)^{2}(0.12)(1-0.12)}{(0.03)^{2}} = 450.75$

So use n = 451
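A short sketch of the proportion version follows, again taking Z from the standard library's `statistics.NormalDist` instead of a table. It also shows the conservative choice π = 0.5 mentioned earlier. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(pi: float, e: float, confidence: float) -> int:
    """n = Z^2 * pi * (1 - pi) / e^2, rounded up."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical Z with alpha/2 in each tail
    return math.ceil(z ** 2 * pi * (1 - pi) / e ** 2)


# Pilot estimate p = 0.12 used in place of the unknown pi.
print(sample_size_for_proportion(pi=0.12, e=0.03, confidence=0.95))  # 451

# Conservative choice when no pilot sample is available.
print(sample_size_for_proportion(pi=0.5, e=0.03, confidence=0.95))   # 1068
```

Because π(1 - π) is largest at π = 0.5, the conservative choice never understates the required sample size, at the cost of sampling more than may be needed.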

What is a Hypothesis?

• A hypothesis is a claim (assumption) about a population parameter:

Example: The mean monthly cell phone bill of this city is μ = $42

Example: The proportion of adults in this city with cell phones is π = 0.68

The Null Hypothesis, H0

• States the claim or assertion to be tested

Example: The average number of TV sets in U.S. homes is equal to three ( H0: μ = 3 )

• Is always about a population parameter, not about a sample statistic

The Null Hypothesis, H0 (continued)

• Begin with the assumption that the null hypothesis is true
– Similar to the notion of innocent until proven guilty

• Refers to the status quo

• Always contains “=”, “≤” or “≥” sign

• May or may not be rejected

The Alternative Hypothesis, H1

• Is the opposite of the null hypothesis
– e.g., The average number of TV sets in U.S. homes is not equal to 3 ( H1: μ ≠ 3 )

• Challenges the status quo

• Never contains the “=”, “≤” or “≥” sign

• May or may not be proven

• Is generally the hypothesis that the researcher is trying to prove

Hypothesis Testing Process

• Claim: the population mean age is 50 (Null Hypothesis: H0: μ = 50)

• Now select a random sample from the population

• Suppose the sample mean age is 20: X̄ = 20

• Is X̄ = 20 likely if μ = 50?

• If not likely, REJECT the null hypothesis

Reason for Rejecting H0

[Figure: sampling distribution of X̄, centered at μ = 50 if H0 is true, with the observed sample mean X̄ = 20 far out in the left tail]

• If it is unlikely that we would get a sample mean of this value (X̄ = 20) if in fact μ = 50 were the population mean, then we reject the null hypothesis that μ = 50.
