ANOVA Concept

Posted on 28-Jul-2015

Analysis of Variance for Standard Designs

Chapter 15

Completely Randomized Design with a Single Factor

Table 15.3: A Completely Randomized Design

Treatment   Observations                                Mean
1           $y_{11}$   $y_{12}$   ...   $y_{1n_1}$      $\bar{y}_{1.}$
2           $y_{21}$   $y_{22}$   ...   $y_{2n_2}$      $\bar{y}_{2.}$
...
t           $y_{t1}$   $y_{t2}$   ...   $y_{tn_t}$      $\bar{y}_{t.}$

The Model for a Completely Randomized Design

$$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \qquad i = 1, \ldots, t; \quad j = 1, \ldots, n_i$$

where

$y_{ij}$: Observation on the jth experimental unit receiving treatment i

$\mu$: Overall treatment mean, an unknown constant

The Model for a Completely Randomized Design, cont.

$\alpha_i$: An effect due to treatment i, an unknown constant

$\varepsilon_{ij}$: A random error associated with the response from the jth experimental unit receiving treatment i. We require that the $\varepsilon_{ij}$s have a normal distribution with mean 0 and common variance $\sigma_\varepsilon^2$. In addition, the errors must be independent.

Sums of Squares

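Total Sum of Squares (TSS)

$$\text{TSS} = \sum_{i,j} (y_{ij} - \bar{y}_{..})^2$$

Partition of TSS

$$\sum_{i,j} (y_{ij} - \bar{y}_{..})^2 = \sum_i n_i (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i,j} (y_{ij} - \bar{y}_{i.})^2$$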

Sums of Squares, cont.

Between-treatment sum of squares (SST)

$$\text{SST} = \sum_i n_i (\bar{y}_{i.} - \bar{y}_{..})^2$$

Sum of squares for error (SSE)

$$\text{SSE} = \sum_{i,j} (y_{ij} - \bar{y}_{i.})^2$$

Table 15.4: Analysis of Variance Table for a Completely Randomized Design

Source       SS    df      MS                    F
Treatments   SST   t - 1   MST = SST/(t - 1)     MST/MSE
Error        SSE   N - t   MSE = SSE/(N - t)
Total        TSS   N - 1
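The entries of the AOV table for a completely randomized design can be computed directly from the definitions of SST, SSE, and TSS. The following is a minimal sketch in plain Python; the data, the treatment labels `T1` to `T3`, and the function name `crd_anova` are hypothetical and chosen only for illustration.

```python
# Hypothetical responses for t = 3 treatments with unequal sample sizes.
groups = {
    "T1": [10.0, 12.0, 11.0],
    "T2": [14.0, 15.0, 13.0, 14.0],
    "T3": [9.0, 8.0, 10.0],
}


def crd_anova(groups):
    """Return the AOV table entries for a completely randomized design."""
    all_obs = [y for ys in groups.values() for y in ys]
    N, t = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / N
    means = {name: sum(ys) / len(ys) for name, ys in groups.items()}

    # SST: squared deviations of treatment means from the grand mean, weighted by n_i.
    sst = sum(len(ys) * (means[name] - grand_mean) ** 2 for name, ys in groups.items())
    # SSE: squared deviations of observations from their own treatment mean.
    sse = sum((y - means[name]) ** 2 for name, ys in groups.items() for y in ys)
    # TSS is computed directly so the partition TSS = SST + SSE can be checked.
    tss = sum((y - grand_mean) ** 2 for y in all_obs)

    mst = sst / (t - 1)
    mse = sse / (N - t)
    return {"SST": sst, "SSE": sse, "TSS": tss,
            "df_treatments": t - 1, "df_error": N - t,
            "MST": mst, "MSE": mse, "F": mst / mse}


table = crd_anova(groups)
for key, value in table.items():
    print(f"{key}: {value:.4f}")
```

The F ratio would then be compared with a tabled F critical value on t - 1 and N - t degrees of freedom; that lookup is left out of the sketch.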

Unbiased Estimates

When $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$ is true, both MST and MSE are unbiased estimates of $\sigma_\varepsilon^2$, the variance of the experimental error. That is, under $H_0$, both have a mean value in repeated sampling, called the expected mean squares, equal to $\sigma_\varepsilon^2$.

Expected Mean Squares

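Under $H_a$,

$$E(\text{MSE}) = \sigma_\varepsilon^2$$

$$E(\text{MST}) = \sigma_\varepsilon^2 + n\theta_T$$

where n is the common sample size and $\theta_T \ge 0$ measures the variation among the treatment effects, so that $\theta_T = 0$ when $H_0$ is true.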
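A small simulation can make the expected mean squares concrete. The sketch below repeatedly generates a balanced completely randomized design and averages MST and MSE over the replications; the treatment effects, sample size, error standard deviation, seed, and function names are all assumptions made for illustration. The overall mean is set to 0 because it does not affect either mean square.

```python
import random


def mean_squares(effects, n, sigma, rng):
    """Simulate one balanced CRD and return (MST, MSE)."""
    t = len(effects)
    groups = [[a + rng.gauss(0.0, sigma) for _ in range(n)] for a in effects]
    means = [sum(g) / n for g in groups]
    grand = sum(means) / t  # equals the overall mean because every n_i = n
    mst = n * sum((m - grand) ** 2 for m in means) / (t - 1)
    mse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (t * (n - 1))
    return mst, mse


def average_mean_squares(effects, n=5, sigma=2.0, reps=2000, seed=1):
    """Average MST and MSE over many simulated experiments."""
    rng = random.Random(seed)
    results = [mean_squares(effects, n, sigma, rng) for _ in range(reps)]
    avg_mst = sum(r[0] for r in results) / reps
    avg_mse = sum(r[1] for r in results) / reps
    return avg_mst, avg_mse


# No treatment effects: both averages should be near sigma**2 = 4.
null_mst, null_mse = average_mean_squares([0.0, 0.0, 0.0])
# Unequal treatment effects: MSE stays near 4 while MST is inflated.
alt_mst, alt_mse = average_mean_squares([-1.0, 0.0, 1.0])

print(f"no effects:   MST {null_mst:.2f}, MSE {null_mse:.2f}")
print(f"with effects: MST {alt_mst:.2f}, MSE {alt_mse:.2f}")
```

The inflation of MST relative to MSE when the effects differ is what the F ratio MST/MSE is designed to detect.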

Advantages of the Completely Randomized Design

1. The design is extremely easy to construct.

2. The design is easy to analyze even though the sample sizes might not be the same for each treatment.

3. The design can be used for any number of treatments.

Disadvantages of the Completely Randomized Design

1. Although the completely randomized design can be used for any number of treatments, it is best suited for situations in which there are relatively few treatments.

2. The experimental units to which treatments are applied must be as homogeneous as possible. Any extraneous sources of variability will tend to inflate the error term, making it more difficult to detect differences among the treatment means.

Randomized Complete Block Design

Table 15.6: Random assignment of 4 paints to 16 sections of roadway

      Location
1    2    3    4
P1   P2   P3   P4
P1   P2   P3   P4
P1   P2   P3   P4
P1   P2   P3   P4

Confounded factors: location & type of paint

Randomized Complete Block Design, cont.

Table 15.7: Randomized complete block assignment of 4 paints to 16 sections of roadway

      Location
1    2    3    4
P2   P2   P1   P1
P1   P4   P3   P2
P3   P1   P4   P4
P4   P3   P2   P3

Definition

A randomized complete block design is an experimental design for comparing t treatments in b blocks. Each block consists of t homogeneous experimental units. Treatments are randomly assigned to experimental units within a block, with each treatment appearing exactly once in every block.
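The randomization described in the definition amounts to drawing an independent random ordering of the t treatments inside every block. A minimal sketch follows, using the four paints and four locations of the roadway example as input; the function name `rcbd_layout` and the seed are hypothetical.

```python
import random


def rcbd_layout(treatments, n_blocks, seed=None):
    """Return one random ordering of all treatments for each block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # independent randomization within each block
        layout.append(order)
    return layout


paints = ["P1", "P2", "P3", "P4"]
layout = rcbd_layout(paints, n_blocks=4, seed=0)
for location, order in enumerate(layout, start=1):
    print(f"Location {location}: {order}")
```

Because each block receives a permutation of the full treatment list, every treatment appears exactly once per block by construction.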

Advantages of the Randomized Complete Block Design

1. The design is useful for comparing t treatment means in the presence of a single extraneous source of variability.

2. The statistical analysis is simple.

3. The design is easy to construct.

4. It can be used to accommodate any number of treatments in any number of blocks.

Disadvantages of the Randomized Complete Block Design

1. Because the experimental units within a block must be homogeneous, the design is best suited for a relatively small number of treatments.

2. This design controls for only one extraneous source of variability (due to blocks). Additional extraneous sources of variability tend to increase the error term, making it more difficult to detect treatment differences.

3. The effect of each treatment on the response must be approximately the same from block to block.

Figure 15.1: Treatment Means in a Randomized Block Design

[Plot of Treatment Mean by Treatment: vertical axis $\mu_{ij}$ (40 to 100), horizontal axis Treatment (1 to 3), with points labeled 1 to 4 at each treatment.]

The Hypotheses for Testing Treatment Mean Differences

$H_0: \mu_1 = \mu_2 = \cdots = \mu_t$

$H_a$: At least one $\mu_i$ differs from the rest

The null hypothesis of no difference among the treatment means is tested against the research hypothesis that the treatment means differ.

Total Sum of Squares

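$$\text{TSS} = \sum_{i,j} (y_{ij} - \bar{y}_{..})^2$$

Partition of TSS:

$$\text{SST} = b \sum_i (\bar{y}_{i.} - \bar{y}_{..})^2 = b \sum_i \hat{\alpha}_i^2$$

$$\text{SSB} = t \sum_j (\bar{y}_{.j} - \bar{y}_{..})^2 = t \sum_j \hat{\beta}_j^2$$

$$\text{SSE} = \sum_{i,j} \hat{e}_{ij}^2 = \sum_{i,j} (y_{ij} - \hat{\mu} - \hat{\alpha}_i - \hat{\beta}_j)^2 = \text{TSS} - \text{SST} - \text{SSB}$$

Here $\hat{\mu} = \bar{y}_{..}$, $\hat{\alpha}_i = \bar{y}_{i.} - \bar{y}_{..}$, and $\hat{\beta}_j = \bar{y}_{.j} - \bar{y}_{..}$ are the estimated overall mean, treatment effects, and block effects.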

Table 15.11: Analysis of Variance Table for a Randomized Complete Block Design

Source       SS    df               MS                            F
Treatments   SST   t - 1            MST = SST/(t - 1)             MST/MSE
Blocks       SSB   b - 1            MSB = SSB/(b - 1)             MSB/MSE
Error        SSE   (b - 1)(t - 1)   MSE = SSE/[(b - 1)(t - 1)]
Total        TSS   bt - 1
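The randomized complete block AOV table can be filled in from a t x b array of responses with one observation per treatment-block cell. The sketch below assumes that layout; the data and the function name `rcbd_anova` are hypothetical.

```python
def rcbd_anova(y):
    """y[i][j] is the response for treatment i in block j (one value per cell)."""
    t, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (t * b)
    trt_means = [sum(row) / b for row in y]
    blk_means = [sum(y[i][j] for i in range(t)) / t for j in range(b)]

    sst = b * sum((m - grand) ** 2 for m in trt_means)
    ssb = t * sum((m - grand) ** 2 for m in blk_means)
    # Residual: observation minus the fitted value grand + treatment effect + block effect.
    sse = sum((y[i][j] - trt_means[i] - blk_means[j] + grand) ** 2
              for i in range(t) for j in range(b))
    tss = sum((v - grand) ** 2 for row in y for v in row)

    mst = sst / (t - 1)
    msb = ssb / (b - 1)
    mse = sse / ((b - 1) * (t - 1))
    return {"SST": sst, "SSB": ssb, "SSE": sse, "TSS": tss,
            "MST": mst, "MSB": msb, "MSE": mse,
            "F_treatments": mst / mse, "F_blocks": msb / mse}


# Hypothetical data: rows are 3 treatments, columns are 3 blocks.
responses = [
    [10.0, 12.0, 14.0],
    [13.0, 15.0, 17.0],
    [8.0, 11.0, 11.0],
]
rcb_table = rcbd_anova(responses)
for key, value in rcb_table.items():
    print(f"{key}: {value:.4f}")
```

Computing SSE from the residuals rather than by subtraction gives an independent check that TSS = SST + SSB + SSE.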

Expected Mean Squares

When $H_0: \mu_1 = \mu_2 = \cdots = \mu_t$ is true, both MST and MSE are unbiased estimates of $\sigma_\varepsilon^2$, the variance of the experimental error.

$$E(\text{MST}) = E(\text{MSE}) = \sigma_\varepsilon^2$$

Relative Efficiency

RE(RCB, CR): the relative efficiency of the randomized complete block design compared to a completely randomized design

Did blocking increase our precision for comparing treatment means in a given experiment?
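The slide names RE(RCB, CR) without showing a formula. One commonly quoted estimate, given here as an assumption rather than as the slide's own definition, is RE(RCB, CR) = [(b - 1)MSB + b(t - 1)MSE] / [(bt - 1)MSE], where MSB and MSE come from the randomized complete block AOV table. A minimal sketch with hypothetical mean squares:

```python
def re_rcb_vs_cr(msb, mse, t, b):
    """Estimated efficiency of an RCB design relative to a CR design.

    Assumed formula: [(b - 1) * MSB + b * (t - 1) * MSE] / [(b * t - 1) * MSE].
    """
    return ((b - 1) * msb + b * (t - 1) * mse) / ((b * t - 1) * mse)


# Hypothetical mean squares from an experiment with t = 3 treatments in b = 3 blocks.
re_rcb = re_rcb_vs_cr(msb=31.0 / 3.0, mse=1.0 / 3.0, t=3, b=3)
print(f"RE(RCB, CR) = {re_rcb:.2f}")
```

Under this estimate, a value well above 1 suggests that blocking increased precision, since a completely randomized design would need roughly that many times as many observations per treatment to match it.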

Latin Square Design

Table 15.14: A Randomized Complete Block Design for the Spreadsheet Study

            Secretary
Problem   1   2   3   4
I         A   A   C   A
II        B   D   A   D
III       D   B   D   B
IV        C   C   B   C

Table 15.15: A Latin Square Design for the Spreadsheet Study

            Secretary
Problem   1   2   3   4
I         A   B   C   D
II        B   C   D   A
III       C   D   A   B
IV        D   A   B   C

Advantages of the Latin Square Design

1. The design is particularly appropriate for comparing t treatment means in the presence of two sources of extraneous variation.

2. The analysis is quite simple.

Disadvantages of the Latin Square Design

1. Although a Latin square can be constructed for any value of t, it is best suited for comparing t treatments when $5 \le t \le 10$.

2. Any additional extraneous sources of variability tend to inflate the error term, making it more difficult to detect differences among the treatment means.

3. The effect of each treatment on the response must be approximately the same across rows and columns.

Definition 15.2

A t x t Latin square design contains t rows and t columns. The t treatments are randomly assigned to experimental units within the rows and columns so that each treatment appears exactly once in every row and exactly once in every column.
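One simple way to produce a layout satisfying this definition is to start from the cyclic square, in which each row is the previous row shifted by one position, and then randomly permute its rows and columns. The sketch below takes that approach; it is an illustrative construction, not necessarily the randomization scheme the slides have in mind, and it does not sample uniformly from all possible Latin squares. The function name `latin_square` and the seed are hypothetical.

```python
import random


def latin_square(treatments, seed=None):
    """Return a t x t Latin square as a list of rows."""
    rng = random.Random(seed)
    t = len(treatments)
    # Cyclic starting square: row i is the treatment list shifted left by i.
    square = [[treatments[(i + j) % t] for j in range(t)] for i in range(t)]
    rng.shuffle(square)  # random permutation of the rows
    cols = list(range(t))
    rng.shuffle(cols)    # random permutation of the columns
    return [[row[c] for c in cols] for row in square]


square = latin_square(["A", "B", "C", "D"], seed=3)
for row in square:
    print(" ".join(row))
```

Permuting whole rows and whole columns preserves the Latin property, so every treatment still appears once per row and once per column.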

Test for Treatment Effects

We can test specific hypotheses concerning the parameters in our model. In particular, we may wish to test for differences among the t treatment means.

$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$

$H_a$: At least one of the $\alpha_i$s is not equal to zero

Table 15.17: Analysis of Variance Table for a t x t Latin Square Design

Source       SS    df               MS                            F
Treatments   SST   t - 1            MST = SST/(t - 1)             MST/MSE
Rows         SSR   t - 1            MSR = SSR/(t - 1)             MSR/MSE
Columns      SSC   t - 1            MSC = SSC/(t - 1)             MSC/MSE
Error        SSE   (t - 1)(t - 2)   MSE = SSE/[(t - 1)(t - 2)]
Total        TSS   t^2 - 1
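The Latin square AOV table can be computed from the layout of treatment labels together with the matching array of responses. The sketch below obtains SSE by subtraction, following the degrees-of-freedom breakdown in the table; the 3 x 3 design, the data, and the function name `latin_square_anova` are hypothetical.

```python
def latin_square_anova(design, y):
    """design[i][j] is the treatment label in row i, column j; y[i][j] is its response."""
    t = len(y)
    grand = sum(sum(row) for row in y) / t ** 2
    row_means = [sum(row) / t for row in y]
    col_means = [sum(y[i][j] for i in range(t)) / t for j in range(t)]

    # Average the t responses observed under each treatment label.
    totals = {}
    for i in range(t):
        for j in range(t):
            totals[design[i][j]] = totals.get(design[i][j], 0.0) + y[i][j]
    trt_means = [total / t for total in totals.values()]

    sst = t * sum((m - grand) ** 2 for m in trt_means)
    ssr = t * sum((m - grand) ** 2 for m in row_means)
    ssc = t * sum((m - grand) ** 2 for m in col_means)
    tss = sum((v - grand) ** 2 for row in y for v in row)
    sse = tss - sst - ssr - ssc  # what remains after treatments, rows, and columns

    df_effect, df_error = t - 1, (t - 1) * (t - 2)
    mst, msr, msc = sst / df_effect, ssr / df_effect, ssc / df_effect
    mse = sse / df_error
    return {"SST": sst, "SSR": ssr, "SSC": ssc, "SSE": sse, "TSS": tss,
            "MST": mst, "MSR": msr, "MSC": msc, "MSE": mse,
            "F_treatments": mst / mse}


# Hypothetical 3 x 3 Latin square and responses.
design = [["A", "B", "C"],
          ["B", "C", "A"],
          ["C", "A", "B"]]
y = [[10.0, 14.0, 9.0],
     [13.0, 7.0, 12.0],
     [8.0, 11.0, 15.0]]
ls_table = latin_square_anova(design, y)
for key, value in ls_table.items():
    print(f"{key}: {value:.4f}")
```

With t = 3 the error term has only (t - 1)(t - 2) = 2 degrees of freedom, which illustrates why very small squares give little information about the error variance.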

Relative Efficiency

RE(LS, CR): the relative efficiency of the Latin square design compared to a completely randomized design

Did accounting for row/column sources of variability increase our precision in estimating the treatment means?
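As with the block design, the slide does not show a formula for RE(LS, CR). A commonly quoted estimate, stated here as an assumption, is RE(LS, CR) = [MSR + MSC + (t - 1)MSE] / [(t + 1)MSE], using the mean squares from the Latin square AOV table. A minimal sketch with hypothetical values:

```python
def re_ls_vs_cr(msr, msc, mse, t):
    """Estimated efficiency of a t x t Latin square relative to a CR design.

    Assumed formula: [MSR + MSC + (t - 1) * MSE] / [(t + 1) * MSE].
    """
    return (msr + msc + (t - 1) * mse) / ((t + 1) * mse)


# Hypothetical mean squares from a 3 x 3 Latin square.
re_ls = re_ls_vs_cr(msr=1.0 / 3.0, msc=7.0 / 3.0, mse=1.0 / 3.0, t=3)
print(f"RE(LS, CR) = {re_ls:.2f}")
```

When the row and column mean squares are both equal to MSE, this estimate equals 1, meaning that accounting for rows and columns gained nothing over complete randomization.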

Factorial Treatment Structure in a Completely Randomized Design

A factorial experiment is an experiment in which the response y is observed at all factor-level combinations of the independent variables.

Figure 15.6a: Illustration of the Absence of Interaction in a 2 x 2 Factorial Experiment

[Profile plot of mean response at Level 1 and Level 2, with one line for Level 1 of factor B and one line for Level 2 of factor B. Factors A and B do not interact.]

Figure 15.6b,c: Illustration of the Presence of Interaction in a 2 x 2 Factorial Experiment

[Two profile plots of mean response at Level 1 and Level 2, each with one line for Level 1 of factor B and one line for Level 2 of factor B. Factors A and B interact.]

Table 15.25: Expected Values for a 2 x 2 Factorial Experiment

                        Factor B
Factor A     Level 1                      Level 2
Level 1      $\mu + \alpha_1 + \beta_1$   $\mu + \alpha_1 + \beta_2$
Level 2      $\mu + \alpha_2 + \beta_1$   $\mu + \alpha_2 + \beta_2$

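Table 15.26: Expected Values for a 2 x 2 Factorial Experiment, with Replications

                        Factor B
Factor A     Level 1                                           Level 2
Level 1      $\mu + \alpha_1 + \beta_1 + \alpha\beta_{11}$     $\mu + \alpha_1 + \beta_2 + \alpha\beta_{12}$
Level 2      $\mu + \alpha_2 + \beta_1 + \alpha\beta_{21}$     $\mu + \alpha_2 + \beta_2 + \alpha\beta_{22}$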

Definition 15.4

Two factors A and B are said to interact if the difference in mean responses for two levels of one factor is not constant across levels of the second factor.
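For a 2 x 2 table of mean responses, the definition reduces to a single number: the difference between the levels of factor A at level 1 of factor B, minus the same difference at level 2 of factor B. The sketch below computes this contrast for two hypothetical tables of means; the function name `interaction_contrast` and all values are illustrative.

```python
def interaction_contrast(means):
    """means[i][j] is the mean response at level i of factor A and level j of factor B."""
    diff_at_b1 = means[0][0] - means[1][0]  # A level 1 minus A level 2, at B level 1
    diff_at_b2 = means[0][1] - means[1][1]  # the same difference at B level 2
    return diff_at_b1 - diff_at_b2


# Hypothetical mean responses.
additive = [[50.0, 60.0],
            [70.0, 80.0]]      # the A difference is -20 at both levels of B
interacting = [[50.0, 60.0],
               [70.0, 95.0]]   # the A difference changes from -20 to -35

no_interaction = interaction_contrast(additive)
with_interaction = interaction_contrast(interacting)
print(no_interaction, with_interaction)
```

A contrast of zero corresponds to no interaction in the population means; with sample means, whether a nonzero value is more than noise is judged by the F test for interaction in the AOV table.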

Profile Plot

See Figure 15.6

Used to illustrate the notion of interaction: when no interaction is present, the difference in the mean response between two levels of one factor is the same at every level of the other factor, so the profiles are parallel.

Table 15.27: AOV Table for a Completely Randomized Two-Factor Factorial Experiment

Source        SS     df               MS                              F
Main Effect
  A           SSA    a - 1            MSA = SSA/(a - 1)               MSA/MSE
  B           SSB    b - 1            MSB = SSB/(b - 1)               MSB/MSE
Interaction
  AB          SSAB   (a - 1)(b - 1)   MSAB = SSAB/[(a - 1)(b - 1)]    MSAB/MSE
Error         SSE    ab(n - 1)        MSE = SSE/[ab(n - 1)]
Total         TSS    abn - 1
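The two-factor AOV table can be computed from an a x b array of cells, each holding n replicate responses. The sketch below assumes a balanced design with the same n in every cell; the data and the function name `two_factor_anova` are hypothetical.

```python
def two_factor_anova(y):
    """y[i][j] is the list of n replicates at level i of factor A and level j of factor B."""
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[i][j]) / n for j in range(b)] for i in range(a)]
    a_means = [sum(cell[i]) / b for i in range(a)]
    b_means = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(a_means) / a  # valid because the design is balanced

    ssa = b * n * sum((m - grand) ** 2 for m in a_means)
    ssb = a * n * sum((m - grand) ** 2 for m in b_means)
    # Interaction: the part of each cell mean not explained by the two main effects.
    ssab = n * sum((cell[i][j] - a_means[i] - b_means[j] + grand) ** 2
                   for i in range(a) for j in range(b))
    # Error: variation of the replicates about their own cell mean.
    sse = sum((v - cell[i][j]) ** 2
              for i in range(a) for j in range(b) for v in y[i][j])
    tss = sum((v - grand) ** 2
              for i in range(a) for j in range(b) for v in y[i][j])

    mse = sse / (a * b * (n - 1))
    return {"SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse, "TSS": tss,
            "MSE": mse,
            "F_A": ssa / (a - 1) / mse,
            "F_B": ssb / (b - 1) / mse,
            "F_AB": ssab / ((a - 1) * (b - 1)) / mse}


# Hypothetical 2 x 2 factorial with n = 2 replicates per cell.
cells = [[[48.0, 52.0], [58.0, 62.0]],
         [[68.0, 72.0], [93.0, 97.0]]]
fact_table = two_factor_anova(cells)
for key, value in fact_table.items():
    print(f"{key}: {value:.4f}")
```

The four sums of squares add up to TSS, mirroring the degrees-of-freedom breakdown (a - 1) + (b - 1) + (a - 1)(b - 1) + ab(n - 1) = abn - 1.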

Illustration of Significant, Orderly Interaction

[Profile plot: mean response (40 to 100) against Factor A (Level 1 to Level 4), with one profile for each of Levels 1, 2, and 3 of factor B.]

Figure 15.8: Profile plot in which interactions are present, but interactions are orderly

Illustration of Significant, Disorderly Interaction

[Profile plot: mean response (40 to 100) against Factor A (Level 1 to Level 4), with one profile for each of Levels 1, 2, and 3 of factor B.]

Figure 15.9: Profile plot in which interactions are present, and interactions are disorderly

Factorial Treatment Structure in a Randomized Complete Block Design

Estimation of Treatment Differences and Comparisons of Treatment Means

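100(1 - $\alpha$)% Confidence Interval for the Difference in Treatment Means

$$(\bar{y}_{i.} - \bar{y}_{i'.}) \pm t_{\alpha/2}\, s \sqrt{\frac{2}{n_t}}$$

where s is the square root of MSE in the AOV table, $n_t$ is the number of observations in each treatment mean, and $t_{\alpha/2}$ can be obtained from Table 2 in the Appendix for $a = \alpha/2$ and the degrees of freedom for MSE.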
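The interval can be evaluated in a few lines once MSE and the tabled t value are in hand. In the sketch below the treatment means, MSE, and sample size are hypothetical, and the critical value is passed in as a number looked up from a t table (2.262 is the upper 0.025 point for 9 degrees of freedom) rather than computed.

```python
import math


def diff_ci(mean_i, mean_k, mse, n, t_crit):
    """Confidence interval for the difference of two treatment means.

    n is the number of observations behind each treatment mean, and t_crit is
    the tabled t value for the chosen alpha/2 and the degrees of freedom of MSE.
    """
    s = math.sqrt(mse)
    half_width = t_crit * s * math.sqrt(2.0 / n)
    diff = mean_i - mean_k
    return diff - half_width, diff + half_width


# Hypothetical example: 3 treatments with 4 observations each, so MSE has 12 - 3 = 9 df.
lower, upper = diff_ci(mean_i=14.0, mean_k=11.0, mse=2.0, n=4, t_crit=2.262)
print(f"95% CI for the difference: ({lower:.3f}, {upper:.3f})")
```

An interval that excludes zero indicates that the two treatment means differ at the chosen level, for that single comparison considered on its own.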

Multiple Comparison Procedures