Handout 10 Summer2009

90
Introduction to Probability and Statistics Probability & Statistics for Engineers & Scientists, 8th Ed. 2007 Handout #10 Instructor: Kuo-Jung Lee TA: Brian Shea The pdf file for this class is available on the class web page. http://www.stat.umn.edu/~kjlee/STAT3021_Summer2009.html 1

description

It is a summer Handout

Transcript of Handout 10 Summer2009

Page 1: Handout 10 Summer2009

Introduction to Probability and Statistics

Probability & Statistics for Engineers & Scientists, 8th Ed.

2007

Handout #10

Instructor: Kuo-Jung Lee

TA: Brian Shea

The pdf file for this class is available on the class web page.

http://www.stat.umn.edu/~kjlee/STAT3021_Summer2009.html

1

Page 2: Handout 10 Summer2009

Tests of hypotheses about one mean, variance σ2 knownH0 H1 Critical Region

µ = µ0 µ > µ0 x > µ0 + zασ√n

µ = µ0 µ < µ0 x < µ0 − zασ√n

µ = µ0 µ 6= µ0 |x− µ0| > zα/2σ√n

Tests of hypotheses about one mean, variance σ2 unknownH0 H1 Critical Region

µ = µ0 µ > µ0 x > µ0 + t(α,ν=n−1) ·s√n

µ = µ0 µ < µ0 x < µ0 − t(α,ν=n−1) ·s√n

µ = µ0 µ 6= µ0 |x− µ0| > t(α/2,ν=n−1) ·s√n

2

Page 3: Handout 10 Summer2009

Two-Sample T -Test

Tests of hypotheses for equality of two means;

equal variances, σ21 = σ2

2H0 H1 Critical Region

µ1 = µ2 µ1 > µ2 x− y > t(α,ν=n1+n2−2) · sp ·√

1/n1 + 1/n2

µ1 = µ2 µ1 < µ2 x− y < −t(α,ν=n1+n2−2) · sp ·√

1/n1 + 1/n2

µ1 = µ2 µ1 6= µ2 |x− y| > t(α/2,ν=n1+n2−2) · sp ·√

1/n1 + 1/n2

Paired T -Test

Tests of hypotheses about difference of two means,

variance unknownH0 H1 Critical Region

µ1 − µ2 = d µ1 − µ2 > d x− y > d + t(α,ν=n−1) ·sd√n

µ1 − µ2 = d µ1 − µ2 < d x− y < d− t(α,ν=n−1) ·sd√n

µ1 − µ2 = d µ1 − µ2 6= d |x− y − d| > t(α/2,ν=n−1) ·sd√n

3

Page 4: Handout 10 Summer2009

Tests of hypotheses about variance; χ2 = (n−1)s2

σ20

.

H0 H1 Critical Regionσ2 = σ2

0 σ2 > σ20 χ2 > χ2

(α,ν=n−1)σ2 = σ2

0 σ2 < σ20 χ2 < χ2

(1−α,ν=n−1)σ2 = σ2

0 σ2 6= σ20 χ2 > χ2

(α/2,ν=n−1) or χ2 < χ2(1−α/2,ν=n−1)

Tests of hypotheses about the equality of variancesH0 H1 Critical Region

σ21 = σ2

2 σ21 > σ2

2s21s22

> f(α,ν1=n1−1,ν2=n2−1)

σ21 = σ2

2 σ21 < σ2

2s21s22

< f(1−α,ν1=n1−1,ν2=n2−1)

σ21 = σ2

2 σ21 6= σ2

2s21s22

> f(α/2,ν1=n1−1,ν2=n2−1) or

s21s22

< f(1−α/2,ν1=n1−1,ν2=n2−1)

4

Page 5: Handout 10 Summer2009

Tests of hypotheses for one proportionH0 H1 Critical Region

p = p0 p > p0 z = x/n−p0√p0(1−p0)/n

> zα

p = p0 p < p0 z = x/n−p0√p0(1−p0)/n

< zα

p = p0 p 6= p0 z = |x/n−p0|√p0(1−p0)/n

> zα/2

Tests of hypotheses for two proportions; p = x1+x2n1+n2

H0 H1 Critical Region

p1 = p2 p1 > p2 z = p1−p2√p(1−p)( 1

n1+ 1

n2)

> zα

p1 = p2 p1 < p2 z = p1−p2√p(1−p)( 1

n1+ 1

n2)

< zα

p1 = p2 p1 6= p2 z = |p1−p2|√p(1−p)( 1

n1+ 1

n2)

> zα/2

5

Page 6: Handout 10 Summer2009

χ2-tests

H0 H1 Critical RegionExpected Distribution Not H0 χ2 > χ2

(α,ν=k−1)Independence Dependence χ2 > χ2

(α,ν=(r−1)×(c−1))Homogeneity Inhomogeneity χ2 > χ2

(α,ν=(r−1)×(c−1))p0 = p2 = · · · = pk Not all equal χ2 > χ2

(α,ν=k−1)

6

Page 7: Handout 10 Summer2009

Chapter 10

One- and Two-Sample Tests of Hypotheses

Statistical Hypotheses: General Concepts

7

Page 8: Handout 10 Summer2009

Instead of making an estimate about a population parameter,

you’ll learn how to test a claim about a parameter.

Example 1

Suppose that you work for Gallup and are asked to test a claim

that the proportion of eligible American voters who support

Barack Obama is p = 0.47. To test the claim, you take a ran-

dom sample of n = 1200 eligible voters and find 594 of them

support Barack Obama. Your sample statistic is p = 0.495.

Is the sample statistic identical enough to your claim (p = 0.47)

to decide that the claim is true, or different enough from the

claim (p = 0.47) to decide that the claim is false?

8

Page 9: Handout 10 Summer2009

Example 2

Suppose that you work for Gallup and are asked to test a claim

that 68 percent of Americans approve of Obama’s performance

as the nation’s chief executive. To test the claim, you take a

random sample of n = 1200 eligible voters and find 780 of them

support Barack Obama. Your sample statistic is p = 0.65.

Is the sample statistic identical enough to your claim (p = 0.68)

to decide that the claim is true, or different enough from the

claim (p = 0.68) to decide that the claim is false?

9

Page 10: Handout 10 Summer2009

Example 3

• A medical researcher may decide on the basis of experimental

evidence whether coffee drinking increases the risk of

cancer in humans.

• An engineer might have to decide on the basis of sample data

whether there is a difference between the accuracy of

two kinds of gauges.

• A sociologist might wish to collect appropriate data to enable

him or her to decide whether a person’s blood type and

eye color are independent variables.

10

Page 11: Handout 10 Summer2009

In each of these cases the scientist or engineer postulates or

conjectures something about a system. In addition, each must

involve the use of experimental data and decision-making that is

based on the data.

Statistical Hypothesis

A statistical hypothesis is an assertion or conjecture con-

cerning one or more populations.

We take a random sample from the population of interest and

use the data contained in this sample to provide evidence that

either supports or does not support the the hypothesis. Evidence

from the sample that is inconsistent with the stated hypothesis

leads to a rejection of the hypothesis.

11

Page 12: Handout 10 Summer2009

The rejection of a hypothesis implies that the sample evidence

refutes it. Put another way, rejection means that there is a

small probability of obtaining the sample information ob-

served when, in fact, the hypothesis is true.

Example 4

Suppose that the hypothesis postulated by the engineer is that

the fraction defective in a certain process is p = 0.10. A sam-

ple of 100 revealing 20 defective items is certainly evidence of

rejection. Why?

If, indeed, p = 0.10, the probability of obtaining 20 or more

defectives is approximately 0.002. With the resulting small risk

of a wrong conclusion it would seem safe to reject the hypothesis

that p = 0.10.

12

Page 13: Handout 10 Summer2009

The formal statement of a hypothesis is often influenced by the

structure of the probability of a wrong conclusion. If the scientist

is interested in strongly supporting a contention, he or she

hopes to arrive at the contention in the form of rejection a

hypothesis.

Example 5

If the medical researcher wishes to show strong evidence in favor

of the contention that coffee drinking increases the risk of cancer,

the hypothesis tested should be of the form ”there is no increase

in cancer risk produced by drinking coffee.” As a result, the

contention is reached via a rejection.

13

Page 14: Handout 10 Summer2009

Example 6

Similarly, to support the claim that one kind of gauge is more

accurate than another, the engineer tests the hypothesis that

there is no difference in the accuracy of the two kinds of gauges.

The foregoing implies that when the data analyst formalizes ex-

perimental evidence on the basis of hypothesis testing, the formal

statement of hypothesis is very important.

14

Page 15: Handout 10 Summer2009

Null Hypothesis: H0 and Alternative Hypothesis: H1

The structure of hypothesis testing will be formulated with the

use of the term null hypothesis which we wish to test and is

denoted by H0.

The alternative hypothesis H1 usually represents the question

to be answered, the theory to be tested and thus its specification

is crucial.

Reject H0: in favor of H1 because of sufficient evidence in the

data.

Fail to reject H0: because of insufficient evidence in the data.

15

Page 16: Handout 10 Summer2009

Stating a Hypothesis

A null hypothesis, H0, is statistical hypothesis that contains a

statement of equality, such as ”≤,=,≥”.

The alternative hypothesis, H1, is the complement of the null

hypothesis. It is a statement that must be true if H0 is false and

it contains a statement of inequality, such as ”>, 6=, <”.

16

Page 17: Handout 10 Summer2009

One-Tailed Tests

H0 : θ = ( or ≤ ) θ0 versus H1 : θ > θ0

H0 : θ = ( or ≥ ) θ0 versus H1 : θ < θ0

Two-Tailed Test

H0 : θ = θ0 versus H1 : θ 6= θ0

17

Page 18: Handout 10 Summer2009

How Are the Null and Alternative Hypotheses

Chosen?

18

Page 19: Handout 10 Summer2009

Example 7

Write the claim as a mathematical sentence. State the null and

alternative hypotheses.

1. A university claims that the proportion of its students who

graduate in four years is 82%.

2. A television manufacturer claims that the variance of the life

of a certain type of television is less than or equal to 3.5.

3. A cereal company claims that the mean weight of the con-

tents of its 20-ounce cereal boxes is more than 20 ounces.

19

Page 20: Handout 10 Summer2009

Solution:

1. The claim ”the proportion . . . is 82%” can be written as

p = 0.82. Its complement is p 6= 0.82. Because contains the

statement of equality, it becomes the null hypothesis.

H0 : p = 0.82 versus H1 : p 6= 0.82.

2. The claim ”the mean . . . is less than or equal to 3.5” can be

written as σ2 ≤ 3.5. Its complement is σ2 > 3.5. Because

σ2 ≤ 3.5 contains the statement of equality, it becomes the

null hypothesis.

H0 : σ2 ≤ 3.5 versus H1 : σ2 > 3.5.

20

Page 21: Handout 10 Summer2009

3. The claim ”the mean . . . is more than 20 ounces” can be

written as µ > 20. Its complement is µ ≤ 20. Because

µ ≤ 20 contains the statement of equality, ti becomes the

null hypothesis.

H0 : µ ≤ 20 versus H1 : µ > 20.

Page 22: Handout 10 Summer2009

Example 8

A manufacturer of a certain brand of rice cereal claims that the

average saturated fat content does not exceed 1.5 grams. State

the null and alternative hypotheses to be used in testing this

claim and determine where the critical regions is located.

Solution:

The manufacture’s claim should be rejected only if µ is greater

than 1.5 milligrams and should not be rejected if µ is less than

or equal to 1.5 milligrams. We test

H0 : µ = 15 versue H1 : µ > 15,

the non-rejection of H0 does not rule out values less than 1.5

milligrams. Since we have a one-tailed test, the greater than

symbol indicates that the critical region lies entirely in the right

tail of the distribution of our test statistic X.21

Page 23: Handout 10 Summer2009

Testing a Statistical Hypothesis

22

Page 24: Handout 10 Summer2009

A certain type of cold vaccine is known to be only 25% effec-

tive after 2 years. To determine if a new and somewhat more

expensive vaccine is superior in providing protection against the

same virus for a longer period of time.

Suppose that 20 people are chosen at random and inoculated.

In an actual study of this type of participants receiving the new

vaccine might number several thousand. The number 20 is being

used here only to demonstrate the basic steps in carrying out a

statistical test. If more than 8 of those receiving the new vaccine

surpass the 2-year period without contracting the virus, the new

vaccine will be considered superior to the one presently in use.

The requirement that the number exceed 8 is somewhat arbitrary

but appears reasonable in that it represents a modest gain over

the 5 people that could be expected to receive protection if

the 20 people had been inoculated with the vaccine already in

23

Page 25: Handout 10 Summer2009

use. We are essentially testing the null hypothesis that the new

vaccine is equally effective after a period of 2 years as the one

now commonly used. The alternative hypothesis is that the new

vaccine is in fact superior. This is equivalent to testing the

hypothesis that the binomial parameter for the probability of a

success on a given trial is p = 1/4 against the alternative that

p > 1/4. This is usually written as follows:

H0 : p = 0.25 versus H1 : p > 0.25.

Decision criterion for test p = 0.25 versus p > 0.25.

Page 26: Handout 10 Summer2009

Remember, the only way to be certain of whether H0 is true or

false is to test the entire population. Because your decision (to

reject H0 or fail to reject H0) is based on a sample, you must

accept the fact that your decision might be incorrect.

Type I Error

Rejection of the null hypothesis when it is true is called a type

I error.

α = P (Reject H0|H0 is Ture)

The probability of committing a type I error, also called the level

of significance.

24

Page 27: Handout 10 Summer2009

Type II Error

Nonrejection of the null hypothesis when it is false is called a

type II error.

β = P (Do Not Reject H0|H1 is Ture)

Type I and Type II Errors

H0 is true H0 is falseDo not reject H0 Correct decision Type II error (β)

Reject H0 Type I error (α) Correct decision

25

Page 28: Handout 10 Summer2009

Example 9

The proportion of adults living in a small town who are college

graduates is estimated to be p = 0.6. To test this hypothesis,

a random sample of 15 adults is selected is anywhere from 6 to

12, we will not reject the null hypothesis that p = 0.6; otherwise,

we shall conclude that p 6= 0.6.

1. Evaluate type I Error, α.

2. Evaluate type II Error, β for the alternative p = 0.5.

26

Page 29: Handout 10 Summer2009

Solution:

H0 : p = 0.6 versus H1 : p 6= 0.6.

1.

α = P (Type I Error)

= P (X < 6 or X > 12|p = 0.6)

=5∑

x=0

(15x

)(0.6)x(0.4)15−x +

15∑x=13

(15x

)(0.6)x(0.4)15−x

≈ 0.0609

27

Page 30: Handout 10 Summer2009

2.

β = P (Type II Error)

= P (6 ≤ X ≤ 13|p = 0.5)

=12∑

x=6

(15x

)(0.5)x(0.5)15−x ≈ 0.8454

Ideally, we like to use a test procedure for which the type I and

type II error probabilities are both small.

Page 31: Handout 10 Summer2009

Example 10

A fabric manufacturer believes that the proportion of orders for

raw material arriving late is p = 0.6. If a random sample of

10 orders show that 3 or fewer arrived late, the hypothesis that

p = 0.6 should be rejected in favor of the alternative p < 0.6.

Using the binomial distribution.

1. Find the probability of committing a type I error if the true

proportion is p = 0.6.

2. Find the probability of committing a type II error for the

alternative p = 0.3.

28

Page 32: Handout 10 Summer2009

Solution:

29

Page 33: Handout 10 Summer2009

The Role of α, β, and Sample Size

Let us assume the the director of the testing program is unwilling

to commit a type II error when the alternative hypothesis p = 0.5

is true even though we have found the probability of such an

error to be β = 0.8454. A reduction in β is always possible by

increasing the size of the critical region.

By adopting a new decision procedure, we have reduced the

probability of committing a type II error at the the expense of

increasing the probability of committing a type I error.

30

Page 34: Handout 10 Summer2009

Example 11

Consider what happens to the values of α and β when we changeour rejection region to be anywhere greater than 10 or less than7 that Now, in testing p = 0.6 against the alternative hypothesisthat p = 0.5, we find that

α = P (Type I Error) = P (X < 7 or X > 10|p = 0.6)

≈ 0.2402

β = P (Type II Error) = P (7 ≤ X ≤ 10|p = 0.5)

≈ 0.6876

For a fixed sample size, a decrease in the probability of oneerror will usually result in an increase in the probability of theother error.

Fortunately, the probability of committing both types of er-ror can be reduced by increasing the sample size.

31

Page 35: Handout 10 Summer2009

Example 12

Consider the null hypothesis that the average weight of male

students in a certain college is 68 kilograms against alternative

hypothesis that it is unequal to 68. That is, we wish to test

H0 : µ = 68 versus H1 : µ 6= 68.

The alternative hypothesis allows for the possibility that µ < 68

or µ > 68. The probability of committing a type I error, or the

level of significance of our test, is equal to the sum of the areas

that have been shaded in each tail of the distribution in Figure.

Assume the standard deviation of the population of weights to

be σ = 3.6. Our decision statistic, based on a random sample of

size n = 36, will be X. From C.L.T., we know that the sampling

32

Page 36: Handout 10 Summer2009

distribution of X is approximately normal with standard deviation

σX = 3.6/6 = 0.6.

α = P (X < 67|µ = 68) + P (X > 69|µ = 68) = 0.0950.

Thus, 9.5% of all samples of size 36 would lead us to reject

µ = 68 kilograms when, in fact, it is true.

Critical region for testing µ = 68 versus µ 6= 68.

Page 37: Handout 10 Summer2009

The reduction in α is not sufficient by itself to guarantee a good

testing procedure. We must also evaluate β for various alterna-

tive hypotheses.

β = P (67 ≤ X ≤ 69|µ = 70) = 0.0132.

If the true value of µ is the alternative µ = 70, the value of β

will be 0.0132.

Probability of type II error for testing µ = 68 versus µ 6= 70.

Page 38: Handout 10 Summer2009

β = P (67 ≤ X ≤ 69|µ = 68.5) = 0.8661.

Probability of type II error for testing µ = 68 versus µ 6= 68.5.

Page 39: Handout 10 Summer2009

Power

The power of a test is the probability of rejecting H0 given that

a specific alternative is true.

1− β = P (Reject H0|H1 is True).

In a sense, the power is a more succinct measure of how sensitive

the test is for ”detecting differences” between a mean of 68 and

68.5. In this case, if µ is truly 68.5, the test described will

properly reject H0 only 13.39% of the time.

33

Page 40: Handout 10 Summer2009

After stating the null and alternative hypotheses and specifying

the level of significance (α), the next step in a hypothesis test

is to obtain a random sample from the population and calculate

sample statistics such as the mean and the standard deviation.

The statistic that is compared to the parameter in the null hy-

pothesis is called the test statistic. The type of test used and

sampling distribution is based on the test statistic.

34

Page 41: Handout 10 Summer2009

Test statistic

A test statistic T is a quantity calculated from our sampleof data. Its value is used to decide whether or not the nullhypothesis should be rejected in our hypothesis test. The choiceof a test statistic will depend on the assumed probability modeland the hypotheses under question.

Critical Region and Critical Value

The critical region C of a statistical test refers to the set ofvalues for the corresponding test statistic which tend to supportthe alternative hypothesis rather than the null hypothesis.

Reject H0 ⇔ T ∈ C

The critical value(s) for a hypothesis test is a threshold to whichthe value of the test statistic in a sample is compared to deter-mine whether or not the null hypothesis is rejected.

35

Page 42: Handout 10 Summer2009

Important Properties of a Test of Hypothesis

1. The type I error and type II error are related.

2. The size of the critical region can always be reduced by ad-

justing the critical value.

3. An increase in the sample size n will reduce α and β simul-

taneously.

4. The greater the distance between the true value and the

hypothesized value, the smaller β will be.

36

Page 43: Handout 10 Summer2009

P -Value

Each statistical test has an associated null hypothesis, the p-value is the probability that your sample could have been

drawn from the population(s) being tested (or that a moreimprobable sample could be drawn) given the assumption thatthe null hypothesis is true. A p-value of .05, for example, in-dicates that you would have only a 5% chance of drawing thesample being tested if the null hypothesis was actually true.

This is a way of assessing the ”believability” of the null hypoth-esis, given the evidence provided by a random sample.

Definition

A p-value is the lowest level (of significance) at which the ob-served value of the test statistic is significant.

37

Page 44: Handout 10 Summer2009

Interpreting a P -Value

Could random variation alone account for the difference between

the null hypothesis and observations from a random sample?

• A small p-value implies that random variation due to the sam-

pling process alone is not likely to account for the observed

difference.

• With a small p-value we reject H0. The true property of the

population is significantly different from what was stated in

H0.

Thus, small p-values are strong evidence AGAINST H0.

38

Page 45: Handout 10 Summer2009

Example 13

H0 : µ = 68 versus H1 : µ 6= 68.

Suppose a value of z = 1.87 is observed. In fact, in a two-tailed

scenario one can quantify this risk as

p-value = 2P (Z > 1.87|µ = 10) = 0.0614

The risk of committing a type I error if one rejects H0 in this

could hardly be considered severe. Strictly speaking, with α =

0.05 the value is not significant.

Although this evidence against H0 is not as strong as that which

would result from a rejection at an α = 0.05 level, it is important

information to the user. The approach is designed to give the

user an alternative (in terms of a probability) to a mere

”reject” or ”do not reject” conclusion.

39

Page 46: Handout 10 Summer2009

For example, if z is 2.73, it is informative for the user to observe

that

p-value = 2P (Z > 2.73|µ = 10) = 0.0064

It is important to know that under the condition of H0, a value

of z = 2.73 is an extremely rare event. Namely, a value at least

that large in magnitude would only occur 64 times in 10,000

experiments.

Page 47: Handout 10 Summer2009

Approach to Hypothesis Testing

1. State the null and alternative hypotheses.

2. Choose a fixed significance level α.

3. Choose an appropriate test statistic and establish the critical

region based on α.

4. From the computed test statistic, reject H0 if the test statis-

tic is in the critical region. Otherwise, do not reject.

5. Draw scientific or engineering conclusion.

40

Page 48: Handout 10 Summer2009

Significance Testing (P -Value Approach)

1. State null and alternative hypotheses.

2. Choose an appropriate test statistic.

3. Compute p-value based on computed value of test statistic.

4. Use judgment based on p-value and knowledge of scientific

system.

41

Page 49: Handout 10 Summer2009

Single Sample: Tests Concerning a Single Mean

(Variance Known)

42

Page 50: Handout 10 Summer2009

Test for Single Mean (Variance Known)

z =x− µ0

σ/√

n> zα/2 or z =

x− µ0

σ/√

n< −zα/2

If −zα/2 < z < zα/2, do not reject H0. Rejections of H0, of

course, implies acceptance of the alternative hypothesis H1 :

µ 6= µ0. With this definition of critical region it should be clear

that there will be probability α of rejecting H0 (failing into the

critical region) when, indeed, µ = µ0. For H1 : µ > µ0, rejection

results when z > zα. For H1 : µ < µ0, rejection results when

z < −zα.

43

Page 51: Handout 10 Summer2009

Example 14

A random sample of 100 recorded deaths in the United States

during the past year showed an average life span of 71.8 years.

Assuming a population standard deviation of 8.9 years, does this

seem to indicate that the mean life span today is greater than

70 years? Use a 0.05 level of significance.

1. H0 : µ = 70 years.

2. H1 : µ > 70 years.

3. α = 0.05.

44

Page 52: Handout 10 Summer2009

4. Critical region: z > 1.645, where z = x−µ0σ/√

n.

5. Computations: x = 71.8 years, σ = 8.9 years, and z =71.8−708.9/

√100

= 2.02.

6. Decision: Reject H0 and conclude that the mean life span

today is greater than 70 years.

The P -value corresponding to z = 2.02 is given by the area of

shaded region in Figure. P (Z > 2.02) = 0.0217. As a result,

the evidence in favor of H1 is even stronger than suggested by

a 0.05 level of significance.

Page 53: Handout 10 Summer2009

Decision criterion for test µ = 70 versus µ > 70.

45

Page 54: Handout 10 Summer2009

Example 15

A manufacturer of sports equipment has developed a new syn-

thetic fishing line that he claims has a mean breaking strength

of 8 kilograms with a standard deviation of 0.5 kilogram. Test

the hypothesis that µ = 8 kilograms against the alternative that

µ 6= 8 kilograms if a random sample of 50 lines is tested and

found to have a mean breaking strength of 7.8 kilograms. Use

a 0.01 level of significance.

1. H0 : µ = 8 kilograms.

2. H1 : µ 6= 8 kilograms.

3. α = 0.01.46

Page 55: Handout 10 Summer2009

4. Critical region: z < −2.575 and z > 2.575, where z = x−µ0σ/√

n.

5. Computations: x = 7.8 years, σ = 0.5 years, and z =7.8−8

0.5/√

100= −2.83.

6. Decision: Reject H0 and conclude that the mean life span

today is greater than 70 years.

The P -value corresponding to z = −2.83 is given by twice the

area of shaded region in Figure.. P (|Z| > 2.83) = 0.0046. As a

result, the evidence in favor of H1 is even stronger than suggested

by a 0.01 level of significance.

Page 56: Handout 10 Summer2009

Decision criterion for test p = 0.25 versus p > 0.25.

47

Page 57: Handout 10 Summer2009

Example 16

The average height of females in the freshman class of a certain

college has been 162.5 centimeters with a standard deviation of

6.9 centimeters. Is there reason to believe that there has been a

change in the average height at the 0.05 level of significance if a

random sample of 50 females in the present freshman class has

an average height of 165.2 centimeters? Assume the standard

deviation remains the same.

48

Page 58: Handout 10 Summer2009

Solution:

1. H0 : µ = 162.5 centimeters.

2. H1 : µ 6= 162.5 centimeters.

3. α = 0.05.

4. Critical region: |z| < 1.96 , where z = x−µ0σ/√

n.

5. Computations: x = 162.5 kilowatt hours, σ = 6.9 kilowatthours, and n = 50. Hence

z =42− 46

11.9/√

12= −1.16 p− value = P (T < −1.16) ≈ 0.135

49

Page 59: Handout 10 Summer2009

6. Do not reject H0 and conclude that the average number of

kilowatt hours expended annually by vacuum cleaners is not

significantly less than 46.

Page 60: Handout 10 Summer2009

Relationship to Confidence Interval Estimation

50

Page 61: Handout 10 Summer2009

Confidence interval estimation involves computation of bounds

for which it is ”reasonable” that the parameter in question is

inside the bounds. For the case of a single population mean

µ with σ2 known, the structure of both hypothesis testing and

confidence interval estimation is based on the random variable

Z =X − µ

σ/√

n.

It turns out that the testing of H0 : µ = µ0 against H1 : µ 6= µ0 at

a significance level α is equivalent to computing a 100(1− α)%

confidence interval on µ and rejecting H0 if µ0 is outside the

confidence interval. If µ0 is inside the confidence interval, the

hypothesis is not rejected.

51

Page 62: Handout 10 Summer2009

Single Sample: Tests Concerning a Single Mean

(Variance Unknown)

52

Page 63: Handout 10 Summer2009

Test for Single Mean (Variance Unknown)

t =x− µ0

s/√

n> tα/2(n− 1) or t =

x− µ0

s/√

n< −tα/2(n− 1)

If −tα/2(n−1) < t < tα/2(n−1), do not reject H0 for H1 : µ 6= µ0.

Rejections of H0, of course, implies acceptance of the alternative

hypothesis µ 6= µ0. With this definition of critical region it should

be clear that there will be probability α of rejecting H0 (failing

into the critical region) when, indeed, µ = µ0. For H1 : µ > µ0,

rejection results when t > tα(n − 1). For H1 : µ < µ0, rejection

results when t < −tα(n− 1).

53

Page 64: Handout 10 Summer2009

Example 17

The Edison Electric Institute has published figures on the annualnumber of kilowatt hours expended by various home appliances.It is claimed that a vacuum cleaner expends an a 46 kilowatthours per year. If a random sample of 12 homes included in aplanned study indicates that vacuum cleaners expend an averageof 42 kilowatt hours per year with a standard deviation of 11.9kilowatt hours, does this suggest at the 0.05 level of significancethat vacuum cleaners expend, on the average, less than 46 kilo-watt hours annually? Assume the population of kilowatt hoursto be normal.

1. H0 : µ = 46 kilowatt hours.

2. H1 : µ < 46 kilowatt hours.

54

Page 65: Handout 10 Summer2009

3. α = 0.05.

4. Critical region: t < −1.796 , where t = x−µ0s/√

n.

5. Computations: x = 42 kilowatt hours, s = 11.9 kilowatt

hours, and n = 12. Hence

t =42− 46

11.9/√

12= −1.16 p− value = P (T < −1.16) ≈ 0.135

6. Do not reject H0 and conclude that the average number of

kilowatt hours expended annually by vacuum cleaners is not

significantly less than 46.

Page 66: Handout 10 Summer2009

Example 18

An industrial company claims that the mean pH level of the

water in a nearby river is 6.8. You randomly select 19 water

samples and measure the pH of each. The sample mean and

standard deviation are 6.7 and 0.24, respectively. Is there enough

evidence to reject the company’s claim at α = 0.05. Assume the

population is normally distributed.

55

Page 67: Handout 10 Summer2009

Solution:

56

Page 68: Handout 10 Summer2009

Two Samples: Tests on Two Means

57

Page 69: Handout 10 Summer2009

Two-Sample T -Test

T =X1 − X2 − (µ1 − µ2)√

S2p

(1n1

+ 1n2

)where

S2p =

(n1 − 1)S21 + (n2 − 1)S2

2

n1 + n2 − 2

The t-distribution is involved and the two-sided hypothesis is not

rejected when

−tα/2(n1 + n2 − 2) < t < tα/2(n1 + n2 − 2)

For H1 : µ1− µ2 > d0, rejection results when t > tα(n1 + n2− 2).

For H1 : µ1− µ2 < d0, rejection results when t < tα(n1 + n2− 2).

58

Page 70: Handout 10 Summer2009

Paired T -Test

T =D − µD

Sd/√

n

where D and Sd are random variables representing the sample

mean and standard deviations of the differences of the obser-

vations in the experimental units. The t-distribution is involved

and the two-sided hypothesis is not rejected when

−tα/2(n− 1) < t < tα/2(n− 1)

For H1 : µD > d0, rejection results when t > tα(n− 1).

For H1 : µD < d0, rejection results when t < tα(n− 1).

59

Page 71: Handout 10 Summer2009

Example 19

An experiment was performed to compare the abrasive wear of

two different laminated materials. Twelve pieces of material 1

were tested by exposing each piece to a machine measuring wear.

Ten pieces of material 2 were similarly tested. In each case, the

depth of wear was observed. The samples of material 1 gave

an average (coded) wear of 85 units with a sample standard

deviation of 4, while the samples of material 2 gave an average

of 81 and a sample standard deviation of 5. Can we conclude at

the 0.05 level of significance that the abrasive wear of material

1 exceeds that of material 2 by more than 2 units? Assume the

populations to be approximately normal with equal variances.

60

Page 72: Handout 10 Summer2009

Solution:

1. H0 : µ1 − µ2 = 2.

2. H1 : µ1 − µ2 > 2.

3. α = 0.05.

4. Critical region: t > 1.725 , where T = x1−x2−(µ1−µ2)√s2p

(1

n1+ 1

n2

) with

ν = 20 degrees of freedom.

5. Computations:

x1 = 85, s1 = 4, n1 = 12,x2 = 81, s2 = 5, n1 = 10.

61

Page 73: Handout 10 Summer2009

sp =

√11 · 16 + 9 · 25

12 + 10− 2= 4.478,

t =(85− 81)− 2

4.478√

1/12 + 1/10= 1.04,

P =P (T > 1.04) ≈ 0.16.

6. Decision: Do not reject H0. We are unable to conclude that

the abrasive wear of material 1 exceeds that of material 2 by

more than 2 units.

Page 74: Handout 10 Summer2009

Example 20

A vendor of milk product produces and seel low-fat dry milk to

a company that use it to produce baby formula. In order to de-

termine the fat content of the milk, both the company and the

vendor take a sample from each lot and test it for fat content in

percent. Ten sets of paired test results are as follows

62

Page 75: Handout 10 Summer2009

Company Test Vendor TestLot Number Results (X) Results (Y ) Di = Xi − Yi

1 0.50 0.79 -0.292 0.58 0.71 -0.133 0.90 0.82 0.084 1.17 0.82 0.355 1.14 0.73 0.416 1.25 0.77 0.487 0.75 0.72 0.038 1.22 0.79 0.439 0.74 0.72 0.0210 0.80 0.91 -0.11

Page 76: Handout 10 Summer2009

Let µD denote the mean of the differences. Test H0 : µD = 0

against H1 : µD > 0 using a paired t-test with differences. Let

α = 0.05.

Page 77: Handout 10 Summer2009

Solution:

1. H0 : µD = 0.

2. H1 : µD > 0.

3. α = 0.05.

4. Critical region: t > 1.833 , where T = d−µDsd/

√n

with ν = 9degrees of freedom.

5. Computations: The sample mean and standard deviation forthe d′is are

d1 = 0.127, sd = 0.272, t = 0.127−00.272/

√10

= 1.477.

63

Page 78: Handout 10 Summer2009

6. Decision: Though the t-statistic is not significant at the 0.05

level. Do not reject H0. However,

P =P (T > 1.477) ≈ 0.087.

As a result, there is some evidence that there is a difference

in the fat content in percent.

Page 79: Handout 10 Summer2009

Example 21

A taxi company is trying to decide whether to purchase brand A

or brand B tires for its fleet of taxis. To estimate the difference

in the two brands, an experiments is conducted using 16 of each

brand. The tires are run until they wear out. The results are

Brand A: xA = 36,000 kilometers; sA = 5,000 kilometers.

Brand B: xB = 38,000 kilometers; sB = 5,200 kilometers.

Assuming the populations to be approximately normally distributed

with equal variances.

64

Page 80: Handout 10 Summer2009

1. Compute a 95% confidence interval for µA − µB.

2. Test the hypothesis that there is no difference in the average

wear of 2 brands of tires. Use a significance level of 0.05.

Page 81: Handout 10 Summer2009

Solution:

65

Page 82: Handout 10 Summer2009

Example 22

Referring to the previous problem, suppose a tire from eachcompany is assigned at random to the rear wheels of 8 taxis andthe following distance, in kilometers, recorded:

Taxi Brand A Brand B1 34,000 36,0002 45,000 46,0003 36,000 37,0004 32,000 31,0005 48,000 47,0006 32,000 36,0007 38,000 38,0008 30,000 31,000

Assume that the differences of the distances are approximatelynormally distributed.

66

Page 83: Handout 10 Summer2009

1. Compute a 95% confidence interval for µA − µB.

2. Test the hypothesis that there is no difference in the average

wear of 2 brands of tires. Use a significance level of 0.05.

Page 84: Handout 10 Summer2009

Solution:

67

Page 85: Handout 10 Summer2009

Choice of Sample Size for Testing Means

68

Page 86: Handout 10 Summer2009

In most practical circumstances the experiment should be planned

with a choice of sample size made prior to the data-taking pro-

cess if possible. The sample size is usually made to achieve good

power for a fixed α and fixed specified alternative.

Suppose that we wish to test the hypothesis

H0 : µ = µ0 versus H1 : µ > µ0.

with a significance level α when the variance σ2 is known. Sup-

pose the rejection region is R = {X > a}.For a specific alterna-

tive, say µ = µ0 + δ, the power of our test is

1− β = P(X > a|µ = µ0 + δ( H1 is ture)

)

69

Page 87: Handout 10 Summer2009

Testing µ = µ0 versus µ = µ0 + δ.

Page 88: Handout 10 Summer2009

Therefore,

β = P(X < a|µ = µ0 + δ (H1 is ture)

)= P

[X − (µ0 + δ)

σ/√

n<

a− (µ0 + δ)

σ/√

n|µ = µ0 + δ

]

= P

[Z <

a− µ0

σ/√

n−

δ

σ/√

n

]

= P

[Z < zα −

δ

σ/√

n

]from which we conclude that

−zβ = zα −δ

σ/√

n

and hence the choice of sample size

n ≈(zα + zβ)

2σ2

δ2

a result that is also true when the alternative hypothesis is H1 :

Page 89: Handout 10 Summer2009

µ < µ0. In a case of a two-tailed test we obtain the power 1− β

for a specified alternative when

n ≈(zα/2 + zβ)

2σ2

δ2

Page 90: Handout 10 Summer2009

Example 23

Suppose that we wish to test the hypothesis

H0 : µ = 68 kilograms versus H1 : µ > 68 kilograms

for the weights of male students at a certain college using anα = 0.05 level of significance when it is known that σ = 5. Findthe sample size required if the power of our test is to be 0.95when the true mean is 69 kilograms.Solution:Since α = β = 0.05, we have zα = zβ = 1.645. For the alterna-tive µ = 69 = µ0 + 1, we take δ = 1 and then

n =(1.645 + 1.645)2 · 25

1= 270.6

Therefore, 271 observations are required if the test is to rejectthe null hypothesis 95% of the time when, in fact, µ is as largeas 69 kilograms.

70