Analysis of Variance and Experimental Design  Frontier...

12/29/02 ANOVA_EXAMPLE 1
Analysis of Variance and Experimental Design
An Introduction to Analysis of Variance
Analysis of Variance: Testing for the Equality of k Population Means
Multiple Comparison Procedures
An Introduction to Experimental Design
Completely Randomized Designs
Randomized Block Design

12/29/02 ANOVA_EXAMPLE 2
An Introduction to Analysis of Variance
Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means using data obtained from observational or experimental studies.
We want to use the sample results to test the following hypotheses.
H0: μ1 = μ2 = μ3 = . . . = μk
Ha: Not all population means are equal
If H0 is rejected, we cannot conclude that all population means are different.
Rejecting H0 means that at least two population means have different values.

12/29/02 ANOVA_EXAMPLE 3
Assumptions for Analysis of Variance
For each population, the response variable is normally distributed.
The variance of the response variable, denoted σ², is the same for all of the populations.
The observations must be independent.

12/29/02 ANOVA_EXAMPLE 4
Analysis of Variance: Testing for the Equality of k Population Means
Between-Samples Estimate of Population Variance
Within-Samples Estimate of Population Variance
Comparing the Variance Estimates: The F Test
The ANOVA Table

12/29/02 ANOVA_EXAMPLE 5
Between-Samples Estimate of Population Variance
A between-samples estimate of σ² is called the mean square between (MSB).
The numerator of MSB is called the sum of squares between (SSB).
The denominator of MSB represents the degrees of freedom associated with SSB.
\[ \mathrm{MSB} = \frac{\sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{\bar{x}}\right)^2}{k - 1} \]
Here n_j is the size of sample j, \bar{x}_j is its sample mean, and \bar{\bar{x}} is the overall sample mean.

12/29/02 ANOVA_EXAMPLE 6
Logic behind ANOVA
There are two independent estimates of the common variance σ²:
One estimate is based on the variability among the sample means themselves, and the other estimate of σ² is based on the variability of the data within each sample.
By comparing these two estimates, we will determine whether the population means are equal.

12/29/02 ANOVA_EXAMPLE 7
Within-Samples Estimate of Population Variance
The estimate of σ² based on the variation of the sample observations within each sample is called the mean square within (MSW).
The numerator of MSW is called the sum of squares within (SSW).
The denominator of MSW represents the degrees of freedom associated with SSW.
\[ \mathrm{MSW} = \frac{\sum_{j=1}^{k} (n_j - 1)\, s_j^2}{n_T - k} \]
Here s_j^2 is the variance of sample j and n_T is the total number of observations.

12/29/02 ANOVA_EXAMPLE 8
Comparing the Variance Estimates: The F Test
If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSB/MSW is an F distribution with MSB d.f. equal to k − 1 and MSW d.f. equal to nT − k.
If the means of the k populations are not equal, the value of MSB/MSW will be inflated because MSB overestimates σ².
Hence, we will reject H0 if the resulting value of MSB/MSW appears to be too large to have been selected at random from the appropriate F distribution.

12/29/02 ANOVA_EXAMPLE 9
Test for the Equality of k Population Means
Hypotheses
H0: μ1 = μ2 = μ3 = . . . = μk
Ha: Not all population means are equal
Test Statistic
F = MSB/MSW
Rejection Rule
Reject H0 if F > Fα
where the value of Fα is based on an F distribution with k − 1 numerator degrees of freedom and nT − k denominator degrees of freedom.

12/29/02 ANOVA_EXAMPLE 10
Sampling Distribution of MSTR/MSE
The figure below shows the rejection region associated with a level of significance equal to α, where Fα denotes the critical value.
[Figure: sampling distribution of MSTR/MSE. Horizontal axis: MSTR/MSE. The "Do Not Reject H0" region lies to the left of the critical value Fα; the "Reject H0" region is the upper tail to its right.]

12/29/02 ANOVA_EXAMPLE 11
The ANOVA Table
Source of    Sum of    Degrees of   Mean
Variation    Squares   Freedom      Squares   F
Treatment    SSTR      k − 1        MSTR      MSTR/MSE
Error        SSE       nT − k       MSE
Total        SST       nT − 1

SST divided by its degrees of freedom nT − 1 is simply the overall sample variance that would be obtained if we treated the entire nT observations as one data set.
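
The two facts behind the table, that the total sum of squares splits into a treatment part and an error part, and that SST/(nT − 1) is the ordinary sample variance of the pooled data, can be checked numerically. The sketch below uses three small made-up samples of unequal size (hypothetical numbers, not from the slides) and only the Python standard library.

```python
from statistics import mean, variance

# Hypothetical data: three samples of unequal size, made up for illustration.
samples = [[4.0, 6.0, 8.0], [7.0, 9.0, 11.0, 13.0], [1.0, 3.0]]

pooled = [x for s in samples for x in s]   # all observations as one data set
n_t = len(pooled)                          # total number of observations
k = len(samples)                           # number of populations
grand_mean = mean(pooled)

# Between (treatment) and within (error) sums of squares.
sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
sse = sum((len(s) - 1) * variance(s) for s in samples)

# Total sum of squares computed directly from the pooled data.
sst = sum((x - grand_mean) ** 2 for x in pooled)

print("SSTR + SSE =", round(sstr + sse, 6), " SST =", round(sst, 6))
print("SST/(nT - 1) =", round(sst / (n_t - 1), 6),
      " pooled sample variance =", round(variance(pooled), 6))
```

Both printed pairs should agree up to floating-point rounding, whatever data are substituted.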

12/29/02 ANOVA_EXAMPLE 12
Example: Reed Manufacturing
Analysis of Variance
J. R. Reed would like to know if the mean number of hours worked per week is the same for the department managers at her three manufacturing plants (Buffalo, Pittsburgh, and Detroit).
A simple random sample of 5 managers from each of the three plants was taken and the number of hours worked by each manager for the previous week is shown on the next slide.

12/29/02 ANOVA_EXAMPLE 13
Example: Reed Manufacturing
Analysis of Variance
                  Plant 1   Plant 2      Plant 3
Observation       Buffalo   Pittsburgh   Detroit
1                 48        73           51
2                 54        63           63
3                 57        66           61
4                 54        64           54
5                 62        74           56
Sample Mean       55        68           57
Sample Variance   26.0      26.5         24.5

12/29/02 ANOVA_EXAMPLE 14
Example: Reed Manufacturing
Analysis of Variance
Hypotheses
H0: μ1 = μ2 = μ3
Ha: Not all the means are equal
where:
μ1 = mean number of hours worked per week by the managers at Plant 1
μ2 = mean number of hours worked per week by the managers at Plant 2
μ3 = mean number of hours worked per week by the managers at Plant 3

12/29/02 ANOVA_EXAMPLE 15
Example: Reed Manufacturing
Analysis of Variance
Mean Square Between
Since the sample sizes are all equal, the overall sample mean is the average of the three sample means:
x̿ = (55 + 68 + 57)/3 = 60
SSB = 5(55 − 60)² + 5(68 − 60)² + 5(57 − 60)² = 490
MSB = 490/(3 − 1) = 245
Mean Square Within
SSW = 4(26.0) + 4(26.5) + 4(24.5) = 308
MSW = 308/(15 − 3) = 25.667
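
The arithmetic on this slide needs only the three sample means, the three sample variances, and the common sample size. A minimal sketch of the same calculation follows; the variable names are illustrative, and averaging the sample means to get the overall mean is valid here only because all samples have the same size.

```python
# Summary statistics for the Reed Manufacturing samples (from the data slide).
means = [55.0, 68.0, 57.0]        # Buffalo, Pittsburgh, Detroit
variances = [26.0, 26.5, 24.5]
n = 5                             # managers sampled per plant
k = len(means)                    # number of plants
n_t = n * k                       # total number of observations

# With equal sample sizes the overall mean is the average of the sample means.
grand_mean = sum(means) / k

ssb = sum(n * (m - grand_mean) ** 2 for m in means)   # sum of squares between
msb = ssb / (k - 1)                                   # mean square between
ssw = sum((n - 1) * v for v in variances)             # sum of squares within
msw = ssw / (n_t - k)                                 # mean square within

print(ssb, msb, ssw, round(msw, 3))   # 490.0 245.0 308.0 25.667
```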

12/29/02 ANOVA_EXAMPLE 16
Alternative way for calculations
Analysis of Variance
Mean Square Between
SSB = (1/5)[(5 × 55)² + (5 × 68)² + (5 × 57)²] − (1/15)(15 × 60)²
    = 54490 − 54000 = 490
MSB = 490/(3 − 1) = 245
Mean Square Within
SSW = (48² + 54² + . . . + 56²) − 54490
    = 54798 − 54490 = 308
MSW = 308/(15 − 3) = 25.667
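
The shortcut above works from totals rather than deviations: the squared sample totals divided by the sample sizes, the squared grand total divided by nT, and the raw sum of squared observations. As a rough check under that reading of the slide, the sketch below recomputes the three quantities from the raw Reed Manufacturing data; the dictionary layout and names are illustrative choices.

```python
# Raw hours worked, as listed on the Reed Manufacturing data slide.
plants = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}

n_t = sum(len(v) for v in plants.values())
grand_total = sum(sum(v) for v in plants.values())

# (grand total)^2 / nT, i.e. (1/15)(15 x 60)^2 on the slide.
correction = grand_total ** 2 / n_t
# Sum over plants of (plant total)^2 / n_j, i.e. (1/5)[(5 x 55)^2 + ...].
between_term = sum(sum(v) ** 2 / len(v) for v in plants.values())
# Sum of the squared individual observations, 48^2 + 54^2 + ... + 56^2.
raw_sum_of_squares = sum(x * x for v in plants.values() for x in v)

ssb = between_term - correction
ssw = raw_sum_of_squares - between_term

print(correction, between_term, raw_sum_of_squares)   # 54000.0 54490.0 54798
print(ssb, ssw)                                       # 490.0 308.0
```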

12/29/02 ANOVA_EXAMPLE 17
Example: Reed Manufacturing
Analysis of Variance
F Test
If H0 is true, the ratio MSB/MSW should be near 1 since both MSB and MSW are estimating σ². If Ha is true, the ratio should be significantly larger than 1 since MSB tends to overestimate σ².
Rejection Rule
Assuming α = .05, F.05 = 3.89 (2 d.f. numerator, 12 d.f. denominator). Reject H0 if F > 3.89.

12/29/02 ANOVA_EXAMPLE 18
Example: Reed Manufacturing
Analysis of Variance
Test Statistic
F = MSB/MSW = 245/25.667 = 9.55
Conclusion
F = 9.55 > F.05 = 3.89, so we reject H0. The mean number of hours worked per week by department managers is not the same at each plant.
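
In this example the critical value can be reproduced without an F table, because an F distribution with exactly 2 numerator degrees of freedom has a closed-form upper tail: P(F > f) = (1 + 2f/ν)^(−ν/2), where ν is the denominator d.f. The sketch below relies on that special case only (it does not apply to other numerator d.f.), and the function names are illustrative, not part of the slides.

```python
def f_upper_tail_2df(f_value, denom_df):
    """P(F > f_value) for an F distribution with 2 numerator d.f. only."""
    return (1.0 + 2.0 * f_value / denom_df) ** (-denom_df / 2.0)


def f_critical_2df(alpha, denom_df):
    """Upper-alpha critical value, again valid only for 2 numerator d.f."""
    return (denom_df / 2.0) * (alpha ** (-2.0 / denom_df) - 1.0)


msb = 245.0
msw = 308.0 / 12.0
f_stat = msb / msw

critical = f_critical_2df(0.05, 12)      # compare with F.05 = 3.89 on the slide
p_value = f_upper_tail_2df(f_stat, 12)   # tail area beyond the observed F

print(round(f_stat, 2), round(critical, 2), p_value < 0.05)
```

The p-value falls well below .05, which agrees with the rejection of H0 above.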

12/29/02 ANOVA_EXAMPLE 19
Example: Reed Manufacturing
Analysis of Variance
ANOVA Table
Source of    Sum of    Degrees of   Mean
Variation    Squares   Freedom      Square    F
Treatments   490       2            245       9.55
Error        308       12           25.667
Total        798       14

12/29/02 ANOVA_EXAMPLE 20
Multiple Comparison Procedures
Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means. Fisher's least significant difference (LSD) procedure can be used to determine where the differences occur.

12/29/02 ANOVA_EXAMPLE 21
Fisher's LSD Procedure
Hypotheses
H0: μi = μj
Ha: μi ≠ μj
Test Statistic
\[ t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{\mathrm{MSW}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}} \]
Rejection Rule
Reject H0 if t < −tα/2 or t > tα/2
where the value of tα/2 is based on a t distribution with nT − k degrees of freedom.

12/29/02 ANOVA_EXAMPLE 22
Fisher's LSD Procedure Based on the Test Statistic x̄i − x̄j
Hypotheses
H0: μi = μj
Ha: μi ≠ μj
Test Statistic
x̄i − x̄j
Rejection Rule
Reject H0 if |x̄i − x̄j| > LSD
where
\[ \mathrm{LSD} = t_{\alpha/2} \sqrt{\mathrm{MSW}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} \]

12/29/02 ANOVA_EXAMPLE 25
Example: Reed Manufacturing
Fisher's LSD
Assuming α = .05,
\[ \mathrm{LSD} = 2.179 \sqrt{25.667\left(\frac{1}{5} + \frac{1}{5}\right)} = 6.98 \]
Hypotheses (A)
H0: μ1 = μ2
Ha: μ1 ≠ μ2
Test Statistic
|x̄1 − x̄2| = |55 − 68| = 13
Conclusion
Since 13 > 6.98, we reject H0. The mean number of hours worked at Plant 1 is not equal to the mean number worked at Plant 2.

12/29/02 ANOVA_EXAMPLE 26
Example: Reed Manufacturing
Fisher's LSD
Hypotheses (B)
H0: μ1 = μ3
Ha: μ1 ≠ μ3
Test Statistic
|x̄1 − x̄3| = |55 − 57| = 2
Conclusion
Since 2 < 6.98, we cannot reject H0. There is no significant difference between the mean number of hours worked at Plant 1 and the mean number of hours worked at Plant 3.

12/29/02 ANOVA_EXAMPLE 27
Example: Reed Manufacturing
Fisher's LSD
Hypotheses (C)
H0: μ2 = μ3
Ha: μ2 ≠ μ3
Test Statistic
|x̄2 − x̄3| = |68 − 57| = 11
Conclusion
Since 11 > 6.98, we reject H0. The mean number of hours worked at Plant 2 is not equal to the mean number worked at Plant 3.
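
The three pairwise comparisons follow one pattern: compute the LSD once, then compare each absolute difference of sample means against it. A short sketch of that loop follows; the t value 2.179 (α/2 = .025, 12 d.f.) is taken from the slides rather than computed, and the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Sample means for the three plants and the common sample size.
means = {"Plant 1": 55.0, "Plant 2": 68.0, "Plant 3": 57.0}
n = 5
msw = 308.0 / 12.0   # mean square within from the ANOVA table
t_crit = 2.179       # t value for alpha/2 = .025 with 12 d.f., as on the slide

# Least significant difference; identical for every pair since all n are equal.
lsd = t_crit * sqrt(msw * (1.0 / n + 1.0 / n))

# True means "reject H0: the two population means differ".
reject = {}
for (name_i, mean_i), (name_j, mean_j) in combinations(means.items(), 2):
    difference = abs(mean_i - mean_j)
    reject[(name_i, name_j)] = difference > lsd
    print(name_i, "vs", name_j, "|diff| =", difference, "reject H0:", difference > lsd)

print("LSD =", round(lsd, 2))   # LSD = 6.98
```

With unequal sample sizes the LSD would have to be recomputed for each pair using that pair's n values.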

12/29/02 ANOVA_EXAMPLE 28
An Introduction to Experimental Design
Statistical studies can be classified as being either experimental or observational.
In an experimental study, one or more factors are controlled so that data can be obtained about how the factors influence the variables of interest.
In an observational study, no attempt is made to control the factors.
Cause-and-effect relationships are easier to establish in experimental studies than in observational studies.

12/29/02 ANOVA_EXAMPLE 29
An Introduction to Experimental Design
A factor is a variable that the experimenter has selected for investigation.
A treatment is a level of a factor.
Experimental units are the objects of interest in the experiment.
A completely randomized design is an experimental design in which the treatments are randomly assigned to the experimental units.
If the experimental units are heterogeneous, blocking can be used to form homogeneous groups, resulting in a randomized block design.

12/29/02 ANOVA_EXAMPLE 30
Completely Randomized Designs
Between-Treatments Estimate of Population Variance
Within-Treatments Estimate of Population Variance
Comparing the Variance Estimates: The F Test
The ANOVA Table
Pairwise Comparisons

12/29/02 ANOVA_EXAMPLE 31
Between-Treatments Estimate of Population Variance
In the context of experimental design, the between-samples estimate of σ² is referred to as the mean square due to treatments (MSTR).
It is the same as what we previously called mean square between (MSB).
The formula for MSTR is
\[ \mathrm{MSTR} = \frac{\sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{\bar{x}}\right)^2}{k - 1} \]
The numerator is called the sum of squares due to treatments (SSTR).
The denominator k − 1 represents the degrees of freedom associated with SSTR.

12/29/02 ANOVA_EXAMPLE 32
Within-Treatments Estimate of Population Variance
The second estimate of σ², the within-samples estimate, is referred to as the mean square due to error (MSE).
It is the same as what we previously called mean square within (MSW).
The formula for MSE is
\[ \mathrm{MSE} = \frac{\sum_{j=1}^{k} (n_j - 1)\, s_j^2}{n_T - k} \]
The numerator is called the sum of squares due to error (SSE).
The denominator nT − k represents the degrees of freedom associated with SSE.

12/29/02 ANOVA_EXAMPLE 33
ANOVA Table for a Completely Randomized Design
Source of    Sum of    Degrees of   Mean
Variation    Squares   Freedom      Squares                F
Treatments   SSTR      k − 1        MSTR = SSTR/(k − 1)    MSTR/MSE
Error        SSE       nT − k       MSE = SSE/(nT − k)
Total        SST       nT − 1

12/29/02 ANOVA_EXAMPLE 34
Example: Home Products, Inc.
Home Products, Inc. is considering marketing a long-lasting car wax. Three different waxes (Type 1, Type 2, and Type 3) have been developed.
In order to test the durability of these waxes, 5 new cars were waxed with Type 1, 5 with Type 2, and 5 with Type 3. Each car was then repeatedly run through an automatic car wash until the wax coating showed signs of deterioration. The number of times each car went through the car wash is shown on the next slide.
Home Products, Inc. must decide which wax to market. Are the three waxes equally effective?

12/29/02 ANOVA_EXAMPLE 35
Example: Home Products, Inc.
                  Wax      Wax      Wax
Observation       Type 1   Type 2   Type 3
1                 48       73       51
2                 54       63       63
3                 57       66       61
4                 54       64       54
5                 62       74       56
Sample Mean       55       68       57
Sample Variance   26.0     26.5     24.5

12/29/02 ANOVA_EXAMPLE 36
Example: Home Products, Inc.
Completely Randomized Design
Hypotheses
H0: μ1 = μ2 = μ3
Ha: Not all the means are equal
where:
μ1 = mean number of washes for Type 1 wax
μ2 = mean number of washes for Type 2 wax
μ3 = mean number of washes for Type 3 wax

12/29/02 ANOVA_EXAMPLE 37
Example: Home Products, Inc.
Completely Randomized Design
Mean Square Between Treatments
Since the sample sizes are all equal,
x̿ = (x̄1 + x̄2 + x̄3)/3 = (55 + 68 + 57)/3 = 60
SSTR = 5(55 − 60)² + 5(68 − 60)² + 5(57 − 60)² = 490
MSTR = 490/(3 − 1) = 245
Mean Square Error
SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308
MSE = 308/(15 − 3) = 25.667

12/29/02 ANOVA_EXAMPLE 38
Example: Home Products, Inc.
Completely Randomized Design
Rejection Rule
Assuming α = .05, F.05 = 3.89 (2 d.f. numerator and 12 d.f. denominator). Reject H0 if F > 3.89.
Test Statistic
F = MSTR/MSE = 245/25.667 = 9.55
Conclusion
Since F = 9.55 > F.05 = 3.89, we reject H0. The mean number of car washes is not the same for all three waxes.

12/29/02 ANOVA_EXAMPLE 39
Example: Home Products, Inc.
Completely Randomized Design
ANOVA Table
Source of    Sum of    Degrees of   Mean
Variation    Squares   Freedom      Squares   F
Treatments   490       2            245       9.55
Error        308       12           25.667
Total        798       14

12/29/02 ANOVA_EXAMPLE 40
Randomized Block Design
The ANOVA Procedure
Computations and Conclusions

12/29/02 ANOVA_EXAMPLE 41
The ANOVA Procedure
The ANOVA procedure for the randomized block design requires us to partition the sum of squares total (SST) into three groups: sum of squares due to treatments, sum of squares due to blocks, and sum of squares due to error.
The formula for this partitioning is
SST = SSTR + SSBL + SSE
The total degrees of freedom, nT − 1, are partitioned such that k − 1 degrees of freedom go to treatments, b − 1 go to blocks, and (k − 1)(b − 1) go to the error term.

12/29/02 ANOVA_EXAMPLE 42
ANOVA Table for a Randomized Block Design
Source of    Sum of    Degrees of       Mean
Variation    Squares   Freedom          Squares                        F
Treatments   SSTR      k − 1            MSTR = SSTR/(k − 1)            MSTR/MSE
Blocks       SSBL      b − 1            MSBL = SSBL/(b − 1)
Error        SSE       (k − 1)(b − 1)   MSE = SSE/[(k − 1)(b − 1)]
Total        SST       nT − 1

12/29/02 ANOVA_EXAMPLE 43
Example: Eastern Oil Co.
Eastern Oil has developed three new blends of gasoline and must decide which blend or blends to produce and distribute. A study of the miles per gallon ratings of the three blends is being conducted to determine if the mean ratings are the same for the three blends.
Five automobiles have been tested using each of the three gasoline blends and the miles per gallon ratings are shown on the next slide.

12/29/02 ANOVA_EXAMPLE 44
Example: Eastern Oil Co.
Automobile   Type of Gasoline (Treatment)   Block
(Block)      Blend X   Blend Y   Blend Z    Means
1            31        30        30         30.333
2            30        29        29         29.333
3            29        29        28         28.667
4            33        31        29         31.000
5            26        25        26         25.667
Treatment
Means        29.8      28.8      28.4

12/29/02 ANOVA_EXAMPLE 45
Example: Eastern Oil Co.
Randomized Block Design
Mean Square Due to Treatments
The overall sample mean is 29. Thus,
SSTR = 5[(29.8 − 29)² + (28.8 − 29)² + (28.4 − 29)²] = 5.2
MSTR = 5.2/(3 − 1) = 2.6
Mean Square Due to Blocks
SSBL = 3[(30.333 − 29)² + . . . + (25.667 − 29)²] = 51.33
MSBL = 51.33/(5 − 1) = 12.8
Mean Square Due to Error
The total sum of squares, the sum of squared deviations of all 15 observations from the overall mean, is SST = 62. Thus,
SSE = SST − SSTR − SSBL = 62 − 5.2 − 51.33 = 5.47
MSE = 5.47/[(3 − 1)(5 − 1)] = .68

12/29/02 ANOVA_EXAMPLE 46
Example: Eastern Oil Co.
Randomized Block Design
Rejection Rule
Assuming α = .05, F.05 = 4.46 (2 d.f. numerator and 8 d.f. denominator). Reject H0 if F > 4.46.
Test Statistic
F = MSTR/MSE = 2.6/.68 = 3.82
Conclusion
Since 3.82 < 4.46, we cannot reject H0. There is not sufficient evidence to conclude that the miles per gallon ratings differ for the three gasoline blends.
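
The whole randomized block calculation can be reproduced from the raw miles-per-gallon table. The sketch below stores the data as one row per automobile (block) and one column per blend (treatment); the layout and names are illustrative. Carried at full precision, the F ratio comes out near 3.80 rather than 3.82, because the slides round MSE to .68 before dividing; the conclusion is the same either way.

```python
# Miles per gallon: one row per automobile (block), columns are Blend X, Y, Z.
mpg = [
    [31, 30, 30],
    [30, 29, 29],
    [29, 29, 28],
    [33, 31, 29],
    [26, 25, 26],
]

b = len(mpg)       # number of blocks
k = len(mpg[0])    # number of treatments
grand_mean = sum(sum(row) for row in mpg) / (b * k)

treatment_means = [sum(row[j] for row in mpg) / b for j in range(k)]
block_means = [sum(row) / k for row in mpg]

# Partition SST = SSTR + SSBL + SSE.
sst = sum((x - grand_mean) ** 2 for row in mpg for x in row)
sstr = b * sum((m - grand_mean) ** 2 for m in treatment_means)
ssbl = k * sum((m - grand_mean) ** 2 for m in block_means)
sse = sst - sstr - ssbl

mstr = sstr / (k - 1)
mse = sse / ((k - 1) * (b - 1))
f_stat = mstr / mse

f_critical = 4.46  # F.05 with 2 and 8 d.f., taken from the slide
print(round(sst, 2), round(sstr, 2), round(ssbl, 2), round(sse, 2))
print("F =", round(f_stat, 2), "reject H0:", f_stat > f_critical)
```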

12/29/02 ANOVA_EXAMPLE 47
Factorial Experiments
So far we have considered only one factor. If we want to draw conclusions about two or more factors, we use factorial experiments.
The term factorial is used because the experimental conditions include all possible combinations of the factors.
If we have two factors A and B with 3 and 2 levels, the design is called a 3 × 2 factorial design.

12/29/02 ANOVA_EXAMPLE 48
The ANOVA Procedure: Two factors a x b factorial Design
The ANOVA procedure for the factorial design requires us to partition the sum of squares total (SST) into four groups: sum of squares due to Factor A (a levels), sum of squares due to Factor B (b levels), sum of squares due to the interaction of Factors A and B, and sum of squares due to error.
The formula for this partitioning is
SST = SSA + SSB + SSAB + SSE
The total degrees of freedom, nT − 1, are partitioned such that a − 1 degrees of freedom go to Factor A, b − 1 go to Factor B, (a − 1)(b − 1) go to the interaction of A and B, and ab(r − 1) go to the error, where r is the number of replications of each treatment combination.

12/29/02 ANOVA_EXAMPLE 49
ANOVA Table for the twofactor factorial experiment
Source of     Sum of    Degrees of       Mean
Variation     Squares   Freedom          Square                         F
Factor A      SSA       a − 1            MSA = SSA/(a − 1)              MSA/MSE
Factor B      SSB       b − 1            MSB = SSB/(b − 1)              MSB/MSE
Interaction   SSAB      (a − 1)(b − 1)   MSAB = SSAB/[(a − 1)(b − 1)]   MSAB/MSE
Error         SSE       ab(r − 1)        MSE = SSE/[ab(r − 1)]
Total         SST       nT − 1
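
The slides give no data set for the factorial case, so the sketch below uses made-up observations for a 2 × 3 design with r = 2 replications per cell (all numbers hypothetical). It computes the sums of squares from their usual definitions for a balanced design and checks that they, and their degrees of freedom, add up to the totals.

```python
from statistics import mean

# Hypothetical data: data[i][j] holds the r replicates for level i of
# Factor A and level j of Factor B. Made up purely for illustration.
data = [
    [[10.0, 12.0], [14.0, 16.0], [18.0, 20.0]],
    [[11.0, 13.0], [19.0, 21.0], [30.0, 32.0]],
]
a, b, r = len(data), len(data[0]), len(data[0][0])

all_obs = [x for row in data for cell in row for x in cell]
grand = mean(all_obs)
a_means = [mean([x for cell in row for x in cell]) for row in data]
b_means = [mean([x for row in data for x in row[j]]) for j in range(b)]
cell_means = [[mean(cell) for cell in row] for row in data]

sst = sum((x - grand) ** 2 for x in all_obs)
ssa = b * r * sum((m - grand) ** 2 for m in a_means)
ssb = a * r * sum((m - grand) ** 2 for m in b_means)
ssab = r * sum(
    (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
    for i in range(a) for j in range(b)
)
sse = sum(
    (x - cell_means[i][j]) ** 2
    for i in range(a) for j in range(b) for x in data[i][j]
)

df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "error": a * b * (r - 1)}
mse = sse / df["error"]
f_ratios = {"A": ssa / df["A"] / mse, "B": ssb / df["B"] / mse, "AB": ssab / df["AB"] / mse}

print("SSA + SSB + SSAB + SSE =", round(ssa + ssb + ssab + sse, 6), " SST =", round(sst, 6))
print("total d.f. =", sum(df.values()), " nT - 1 =", a * b * r - 1)
```

Each F ratio in `f_ratios` would then be compared with the critical value for its own numerator d.f. and ab(r − 1) denominator d.f.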

12/29/02 ANOVA_EXAMPLE 50
Interaction between the factors
Two factors A and B are said to interact if the difference in mean responses for two levels of one factor is not constant across levels of the second factor.