Analysis of Variance and Experimental Design - Frontier...

  • 12/29/02 ANOVA_EXAMPLE 1

    Analysis of Variance and Experimental Design

    An Introduction to Analysis of Variance
    Analysis of Variance: Testing for the Equality of k Population Means
    Multiple Comparison Procedures
    An Introduction to Experimental Design
    Completely Randomized Designs
    Randomized Block Design

  • 12/29/02 ANOVA_EXAMPLE 2

    An Introduction to Analysis of Variance

    Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means using data obtained from observational or experimental studies. We want to use the sample results to test the following hypotheses.

    H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
    Ha: Not all population means are equal

    If H0 is rejected, we cannot conclude that all population means are different. Rejecting H0 means that at least two population means have different values.

  • 12/29/02 ANOVA_EXAMPLE 3

    Assumptions for Analysis of Variance

    For each population, the response variable is normally distributed.
    The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
    The observations must be independent.

  • 12/29/02 ANOVA_EXAMPLE 4

    Analysis of Variance: Testing for the Equality of k Population Means

    Between-Samples Estimate of Population Variance
    Within-Samples Estimate of Population Variance
    Comparing the Variance Estimates: The F Test
    The ANOVA Table

  • 12/29/02 ANOVA_EXAMPLE 5

    Between-Samples Estimate of Population Variance

    A between-samples estimate of $\sigma^2$ is called the mean square between (MSB).

    $$MSB = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$

    Here $n_j$ and $\bar{x}_j$ are the size and mean of sample j, and $\bar{\bar{x}}$ is the overall sample mean.

    The numerator of MSB is called the sum of squares between (SSB). The denominator of MSB represents the degrees of freedom associated with SSB.

  • 12/29/02 ANOVA_EXAMPLE 6

    Logic behind ANOVA

    There are two independent estimates of the common variance $\sigma^2$:

    One estimate is based on the variability among the sample means themselves, and the other estimate of $\sigma^2$ is based on the variability of the data within each sample. By comparing these two estimates, we will determine whether the population means are equal.

  • 12/29/02 ANOVA_EXAMPLE 7

    Within-Samples Estimate of Population Variance

    The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square within (MSW).

    $$MSW = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$

    Here $s_j^2$ is the variance of sample j and $n_T$ is the total number of observations.

    The numerator of MSW is called the sum of squares within (SSW). The denominator of MSW represents the degrees of freedom associated with SSW.

  • 12/29/02 ANOVA_EXAMPLE 8

    Comparing the Variance Estimates: The F Test

    If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSB/MSW is an F distribution with numerator (MSB) d.f. equal to k - 1 and denominator (MSW) d.f. equal to $n_T - k$. If the means of the k populations are not equal, the value of MSB/MSW will be inflated because MSB overestimates $\sigma^2$. Hence, we will reject H0 if the resulting value of MSB/MSW appears to be too large to have been selected at random from the appropriate F distribution.
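
The quantities defined so far (SSB, SSW, MSB, MSW and the ratio F = MSB/MSW) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `one_way_anova` and the small data set are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from statistics import mean, variance


def one_way_anova(samples):
    """Return the one-way ANOVA quantities for a list of samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total

    # Between-samples sum of squares: n_j * (sample mean - overall mean)^2
    ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-samples sum of squares: (n_j - 1) * sample variance
    ssw = sum((len(s) - 1) * variance(s) for s in samples)

    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return {"SSB": ssb, "SSW": ssw, "MSB": msb, "MSW": msw,
            "df": (k - 1, n_total - k), "F": msb / msw}


# Hypothetical data (not from the slides).
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result["df"], round(result["F"], 3))
```

The critical value for the F test is not computed here; it would still be read from an F table for the degrees of freedom returned in `df`.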

  • 12/29/02 ANOVA_EXAMPLE 9

    Test for the Equality of k Population Means

    Hypotheses

    H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
    Ha: Not all population means are equal

    Test Statistic

    F = MSB/MSW

    Rejection Rule

    Reject H0 if $F > F_\alpha$, where the value of $F_\alpha$ is based on an F distribution with k - 1 numerator degrees of freedom and $n_T - k$ denominator degrees of freedom.

  • 12/29/02 ANOVA_EXAMPLE 10

    Sampling Distribution of MSTR/MSE

    The figure below shows the rejection region associated with a level of significance equal to $\alpha$, where $F_\alpha$ denotes the critical value.

    [Figure: sampling distribution of MSTR/MSE. Values below the critical value $F_\alpha$ fall in the "Do Not Reject H0" region; values above it fall in the "Reject H0" region.]

  • 12/29/02 ANOVA_EXAMPLE 11

    The ANOVA Table

    Source of     Sum of     Degrees of    Mean
    Variation     Squares    Freedom       Squares    F
    Treatment     SSTR       k - 1         MSTR       MSTR/MSE
    Error         SSE        $n_T - k$     MSE
    Total         SST        $n_T - 1$

    SST divided by its degrees of freedom $n_T - 1$ is simply the overall sample variance that would be obtained if we treated the entire set of $n_T$ observations as one data set.

  • 12/29/02 ANOVA_EXAMPLE 12

    Example: Reed Manufacturing

    Analysis of Variance

    J. R. Reed would like to know if the mean number of hours worked per week is the same for the department managers at her three manufacturing plants (Buffalo, Pittsburgh, and Detroit).

    A simple random sample of 5 managers from each of the three plants was taken, and the number of hours worked by each manager for the previous week is shown on the next slide.

  • 12/29/02 ANOVA_EXAMPLE 13

    Example: Reed Manufacturing

    Analysis of Variance

                       Plant 1    Plant 2       Plant 3
    Observation        Buffalo    Pittsburgh    Detroit
    1                  48         73            51
    2                  54         63            63
    3                  57         66            61
    4                  54         64            54
    5                  62         74            56

    Sample Mean        55         68            57
    Sample Variance    26.0       26.5          24.5

  • 12/29/02 ANOVA_EXAMPLE 14

    Example: Reed Manufacturing

    Analysis of Variance

    Hypotheses

    H0: $\mu_1 = \mu_2 = \mu_3$
    Ha: Not all the means are equal

    where:
    $\mu_1$ = mean number of hours worked per week by the managers at Plant 1
    $\mu_2$ = mean number of hours worked per week by the managers at Plant 2
    $\mu_3$ = mean number of hours worked per week by the managers at Plant 3

  • 12/29/02 ANOVA_EXAMPLE 15

    Example: Reed Manufacturing

    Analysis of Variance

    Mean Square Between

    Since the sample sizes are all equal, the overall sample mean is the average of the three sample means:

    $\bar{\bar{x}} = (55 + 68 + 57)/3 = 60$
    $SSB = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
    $MSB = 490/(3 - 1) = 245$

    Mean Square Within

    $SSW = 4(26.0) + 4(26.5) + 4(24.5) = 308$
    $MSW = 308/(15 - 3) = 25.667$
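
The same arithmetic can be checked from the summary statistics alone (sample sizes, means and variances). The sketch below is one way to do that; the helper name `anova_from_summary` is hypothetical, and it weights the overall mean by sample size so that it would also cover unequal sample sizes, which the slide does not need.

```python
def anova_from_summary(sizes, means, variances):
    """Return (SSB, MSB, SSW, MSW) from per-sample sizes, means and variances."""
    k = len(sizes)
    n_total = sum(sizes)
    # Overall mean weighted by sample size; with equal sizes this is the
    # plain average of the sample means, as on the slide.
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ssw = sum((n - 1) * v for n, v in zip(sizes, variances))
    return ssb, ssb / (k - 1), ssw, ssw / (n_total - k)


# Reed Manufacturing summary statistics from the data slide.
ssb, msb, ssw, msw = anova_from_summary([5, 5, 5], [55, 68, 57], [26.0, 26.5, 24.5])
print(ssb, msb, ssw, round(msw, 3))  # 490.0 245.0 308.0 25.667
```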

  • 12/29/02 ANOVA_EXAMPLE 16

    Alternative Way to Do the Calculations

    Analysis of Variance

    Mean Square Between

    $SSB = \frac{1}{5}[(5 \times 55)^2 + (5 \times 68)^2 + (5 \times 57)^2] - \frac{1}{15}(15 \times 60)^2 = 54490 - 54000 = 490$
    $MSB = 490/(3 - 1) = 245$

    Mean Square Within

    $SSW = (48^2 + 54^2 + \dots + 56^2) - 54490 = 54798 - 54490 = 308$
    $MSW = 308/(15 - 3) = 25.667$
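
The shortcut formulas on this slide work from sample totals and squared observations rather than from deviations. A minimal sketch, assuming the raw Reed Manufacturing observations from the data slide; the function name `shortcut_sums_of_squares` is hypothetical.

```python
def shortcut_sums_of_squares(samples):
    """Return (SSB, SSW) using totals and squared observations."""
    n_total = sum(len(s) for s in samples)
    grand_total = sum(sum(s) for s in samples)

    correction = grand_total ** 2 / n_total                    # (15 x 60)^2 / 15 = 54000
    between_term = sum(sum(s) ** 2 / len(s) for s in samples)  # sum of (sample total)^2 / n_j = 54490
    raw_squares = sum(x ** 2 for s in samples for x in s)      # sum of every squared observation = 54798

    return between_term - correction, raw_squares - between_term


reed = [[48, 54, 57, 54, 62], [73, 63, 66, 64, 74], [51, 63, 61, 54, 56]]
ssb, ssw = shortcut_sums_of_squares(reed)
print(ssb, ssw)  # 490.0 308.0
```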

  • 12/29/02 ANOVA_EXAMPLE 17

    Example: Reed Manufacturing

    Analysis of Variance

    F Test

    If H0 is true, the ratio MSB/MSW should be near 1 since both MSB and MSW are estimating $\sigma^2$. If Ha is true, the ratio should be significantly larger than 1 since MSB tends to overestimate $\sigma^2$.

    Rejection Rule

    Assuming $\alpha = .05$, $F_{.05} = 3.89$ (2 d.f. numerator, 12 d.f. denominator). Reject H0 if F > 3.89.

  • 12/29/02 ANOVA_EXAMPLE 18

    Example: Reed Manufacturing

    Analysis of Variance

    Test Statistic

    F = MSB/MSW = 245/25.667 = 9.55

    Conclusion

    $F = 9.55 > F_{.05} = 3.89$, so we reject H0. The mean number of hours worked per week by department managers is not the same at each plant.

  • 12/29/02 ANOVA_EXAMPLE 19

    Example: Reed Manufacturing

    Analysis of Variance

    ANOVA Table

    Source of     Sum of     Degrees of    Mean
    Variation     Squares    Freedom       Square     F
    Treatments    490        2             245        9.55
    Error         308        12            25.667
    Total         798        14

  • 12/29/02 ANOVA_EXAMPLE 20

    Multiple Comparison Procedures

    Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means. Fisher's least significant difference (LSD) procedure can be used to determine where the differences occur.

  • 12/29/02 ANOVA_EXAMPLE 21

    Fisher's LSD Procedure

    Hypotheses

    H0: $\mu_i = \mu_j$
    Ha: $\mu_i \neq \mu_j$

    Test Statistic

    $$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{MSW \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}}$$

    Rejection Rule

    Reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where the value of $t_{\alpha/2}$ is based on a t distribution with $n_T - k$ degrees of freedom.

  • 12/29/02 ANOVA_EXAMPLE 22

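    Fisher's LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$

    Hypotheses

    H0: $\mu_i = \mu_j$
    Ha: $\mu_i \neq \mu_j$

    Test Statistic

    $\bar{x}_i - \bar{x}_j$

    Rejection Rule

    Reject H0 if $|\bar{x}_i - \bar{x}_j| > LSD$, where

    $$LSD = t_{\alpha/2} \sqrt{MSW \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$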

  • 12/29/02 ANOVA_EXAMPLE 23

    Example: Reed Manufacturing

    Analysis of Variance

                       Plant 1    Plant 2       Plant 3
    Observation        Buffalo    Pittsburgh    Detroit
    1                  48         73            51
    2                  54         63            63
    3                  57         66            61
    4                  54         64            54
    5                  62         74            56

    Sample Mean        55         68            57
    Sample Variance    26.0       26.5          24.5

  • 12/29/02 ANOVA_EXAMPLE 24

    Example: Reed Manufacturing

    Analysis of Variance

    ANOVA Table

    Source of     Sum of     Degrees of    Mean
    Variation     Squares    Freedom       Square     F
    Treatments    490        2             245        9.55
    Error         308        12            25.667
    Total         798        14

  • 12/29/02 ANOVA_EXAMPLE 25

    Example: Reed Manufacturing

    Fisher's LSD

    Assuming $\alpha = .05$ (so $t_{.025} = 2.179$ with 12 degrees of freedom),

    $$LSD = 2.179 \sqrt{25.667 \left( \frac{1}{5} + \frac{1}{5} \right)} = 6.98$$

    Hypotheses (A)

    H0: $\mu_1 = \mu_2$
    Ha: $\mu_1 \neq \mu_2$

    Test Statistic

    $|\bar{x}_1 - \bar{x}_2| = |55 - 68| = 13$

    Conclusion

    Since 13 > 6.98, we reject H0. The mean number of hours worked at Plant 1 is not equal to the mean number worked at Plant 2.

  • 12/29/02 ANOVA_EXAMPLE 26

    Example: Reed Manufacturing

    Fisher's LSD

    Hypotheses (B)

    H0: $\mu_1 = \mu_3$
    Ha: $\mu_1 \neq \mu_3$

    Test Statistic

    $|\bar{x}_1 - \bar{x}_3| = |55 - 57| = 2$

    Conclusion

    Since 2 < 6.98, we cannot reject H0. There is no significant difference between the mean number of hours worked at Plant 1 and the mean number of hours worked at Plant 3.

  • 12/29/02 ANOVA_EXAMPLE 27

    Example: Reed Manufacturing

    Fisher's LSD

    Hypotheses (C)

    H0: $\mu_2 = \mu_3$
    Ha: $\mu_2 \neq \mu_3$

    Test Statistic

    $|\bar{x}_2 - \bar{x}_3| = |68 - 57| = 11$

    Conclusion

    Since 11 > 6.98, we reject H0. The mean number of hours worked at Plant 2 is not equal to the mean number worked at Plant 3.
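
The three pairwise comparisons (A), (B) and (C) can be run in one pass. The sketch below hard-codes the critical value 2.179 (the t value for $\alpha/2 = .025$ with 12 degrees of freedom, as used in the LSD calculation) rather than computing it; the function name `fisher_lsd` is hypothetical.

```python
from itertools import combinations
from math import sqrt


def fisher_lsd(t_crit, msw, n_i, n_j):
    """Least significant difference for comparing two sample means."""
    return t_crit * sqrt(msw * (1 / n_i + 1 / n_j))


means = {"Plant 1": 55, "Plant 2": 68, "Plant 3": 57}
lsd = fisher_lsd(t_crit=2.179, msw=308 / 12, n_i=5, n_j=5)

for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    verdict = "reject H0" if diff > lsd else "do not reject H0"
    print(f"{a} vs {b}: |difference| = {diff}, LSD = {lsd:.2f} -> {verdict}")
```

Because all three samples have 5 observations, a single LSD value serves every pair; with unequal sample sizes the LSD would have to be recomputed for each pair.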

  • 12/29/02 ANOVA_EXAMPLE 28

    An Introduction to Experimental Design

    Statistical studies can be classified as being either experimental or observational.
    In an experimental study, one or more factors are controlled so that data can be obtained about how the factors influence the variables of interest.
    In an observational study, no attempt is made to control the factors.
    Cause-and-effect relationships are easier to establish in experimental studies than in observational studies.

  • 12/29/02 ANOVA_EXAMPLE 29

    An Introduction to Experimental Design

    A factor is a variable that the experimenter has selected for investigation.
    A treatment is a level of a factor.
    Experimental units are the objects of interest in the experiment.
    A completely randomized design is an experimental design in which the treatments are randomly assigned to the experimental units.
    If the experimental units are heterogeneous, blocking can be used to form homogeneous groups, resulting in a randomized block design.
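
As an illustration of what "randomly assigned" means in a completely randomized design, the sketch below shuffles the experimental units and deals them out to the treatments in equal numbers. The function name, the unit labels and the treatment labels are all hypothetical; equal replication is an assumption made here for simplicity.

```python
import random


def completely_randomized_assignment(units, treatments, seed=None):
    """Randomly assign treatments to units with equal replication."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    per_treatment = len(units) // len(treatments)
    return {t: shuffled[i * per_treatment:(i + 1) * per_treatment]
            for i, t in enumerate(treatments)}


# Hypothetical labels: 15 experimental units and 3 treatments.
units = [f"unit{i}" for i in range(1, 16)]
assignment = completely_randomized_assignment(units, ["T1", "T2", "T3"], seed=0)
for treatment, assigned in assignment.items():
    print(treatment, assigned)
```

In a randomized block design the same shuffle would instead be done separately inside each block, so that every block receives every treatment.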

  • 12/29/02 ANOVA_EXAMPLE 30

    Completely Randomized Designs

    Between-Treatments Estimate of Population Variance
    Within-Treatments Estimate of Population Variance
    Comparing the Variance Estimates: The F Test
    The ANOVA Table
    Pairwise Comparisons

  • 12/29/02 ANOVA_EXAMPLE 31

    Between-Treatments Estimate of Population Variance

    In the context of experimental design, the between-samples estimate of $\sigma^2$ is referred to as the mean square due to treatments (MSTR). It is the same as what we previously called mean square between (MSB). The formula for MSTR is

    $$MSTR = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$

    The numerator is called the sum of squares due to treatments (SSTR). The denominator k - 1 represents the degrees of freedom associated with SSTR.

  • 12/29/02 ANOVA_EXAMPLE 32

    Within-Treatments Estimate of Population Variance

    The second estimate of $\sigma^2$, the within-samples estimate, is referred to as the mean square due to error (MSE). It is the same as what we previously called mean square within (MSW). The formula for MSE is

    $$MSE = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$

    The numerator is called the sum of squares due to error (SSE). The denominator $n_T - k$ represents the degrees of freedom associated with SSE.

  • 12/29/02 ANOVA_EXAMPLE 33

    ANOVA Table for a Completely Randomized Design

    Source of     Sum of     Degrees of    Mean
    Variation     Squares    Freedom       Squares                     F
    Treatments    SSTR       k - 1         MSTR = SSTR/(k - 1)         MSTR/MSE
    Error         SSE        $n_T - k$     MSE = SSE/($n_T - k$)
    Total         SST        $n_T - 1$

  • 12/29/02 ANOVA_EXAMPLE 34

    Example: Home Products, Inc.

    Home Products, Inc. is considering marketing a long-lasting car wax. Three different waxes (Type 1, Type 2, and Type 3) have been developed.

    In order to test the durability of these waxes, 5 new cars were waxed with Type 1, 5 with Type 2, and 5 with Type 3. Each car was then repeatedly run through an automatic carwash until the wax coating showed signs of deterioration. The number of times each car went through the carwash is shown on the next slide.

    Home Products, Inc. must decide which wax to market. Are the three waxes equally effective?

  • 12/29/02 ANOVA_EXAMPLE 35

    Example: Home Products, Inc.

                       Wax       Wax       Wax
    Observation        Type 1    Type 2    Type 3
    1                  48        73        51
    2                  54        63        63
    3                  57        66        61
    4                  54        64        54
    5                  62        74        56

    Sample Mean        55        68        57
    Sample Variance    26.0      26.5      24.5

  • 12/29/02 ANOVA_EXAMPLE 36

    Example: Home Products, Inc.

    Completely Randomized Design

    Hypotheses

    H0: $\mu_1 = \mu_2 = \mu_3$
    Ha: Not all the means are equal

    where:
    $\mu_1$ = mean number of washes for Type 1 wax
    $\mu_2$ = mean number of washes for Type 2 wax
    $\mu_3$ = mean number of washes for Type 3 wax

  • 12/29/02 ANOVA_EXAMPLE 37

    Example: Home Products, Inc.

    Completely Randomized Design

    Mean Square Between Treatments

    Since the sample sizes are all equal,

    $\bar{\bar{x}} = (\bar{x}_1 + \bar{x}_2 + \bar{x}_3)/3 = (55 + 68 + 57)/3 = 60$
    $SSTR = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
    $MSTR = 490/(3 - 1) = 245$

    Mean Square Error

    $SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308$
    $MSE = 308/(15 - 3) = 25.667$

  • 12/29/02 ANOVA_EXAMPLE 38

    Example: Home Products, Inc.

    Completely Randomized Design

    Rejection Rule

    Assuming $\alpha = .05$, $F_{.05} = 3.89$ (2 d.f. numerator and 12 d.f. denominator). Reject H0 if F > 3.89.

    Test Statistic

    F = MSTR/MSE = 245/25.667 = 9.55

    Conclusion

    Since $F = 9.55 > F_{.05} = 3.89$, we reject H0. The mean number of carwashes is not the same for all three waxes.

  • 12/29/02 ANOVA_EXAMPLE 39

    Example: Home Products, Inc.

    Completely Randomized Design

    ANOVA Table

    Source of     Sum of     Degrees of    Mean
    Variation     Squares    Freedom       Squares    F
    Treatments    490        2             245        9.55
    Error         308        12            25.667
    Total         798        14

  • 12/29/02 ANOVA_EXAMPLE 40

    Randomized Block Design

    The ANOVA Procedure
    Computations and Conclusions

  • 12/29/02 ANOVA_EXAMPLE 41

    The ANOVA Procedure

    The ANOVA procedure for the randomized block design requires us to partition the sum of squares total (SST) into three groups: sum of squares due to treatments, sum of squares due to blocks, and sum of squares due to error. The formula for this partitioning is

    SST = SSTR + SSBL + SSE

    The total degrees of freedom, $n_T - 1$, are partitioned such that k - 1 degrees of freedom go to treatments, b - 1 go to blocks, and (k - 1)(b - 1) go to the error term.

  • 12/29/02 ANOVA_EXAMPLE 42

    ANOVA Table for a Randomized Block Design

    Source of     Sum of     Degrees of        Mean
    Variation     Squares    Freedom           Squares                         F
    Treatments    SSTR       k - 1             MSTR = SSTR/(k - 1)             MSTR/MSE
    Blocks        SSBL       b - 1             MSBL = SSBL/(b - 1)
    Error         SSE        (k - 1)(b - 1)    MSE = SSE/[(k - 1)(b - 1)]
    Total         SST        $n_T - 1$

  • 12/29/02 ANOVA_EXAMPLE 43

    Example: Eastern Oil Co.

    Eastern Oil has developed three new blends of gasoline and must decide which blend or blends to produce and distribute. A study of the miles per gallon ratings of the three blends is being conducted to determine if the mean ratings are the same for the three blends.

    Five automobiles have been tested using each of the three gasoline blends and the miles per gallon ratings are shown on the next slide.

  • 12/29/02 ANOVA_EXAMPLE 44

    Example: Eastern Oil Co.

    Automobile    Type of Gasoline (Treatment)     Block
    (Block)       Blend X    Blend Y    Blend Z    Means
    1             31         30         30         30.333
    2             30         29         29         29.333
    3             29         29         28         28.667
    4             33         31         29         31.000
    5             26         25         26         25.667

    Treatment
    Means         29.8       28.8       28.4

  • 12/29/02 ANOVA_EXAMPLE 45

    Example: Eastern Oil Co.

    Randomized Block Design

    Mean Square Due to Treatments

    The overall sample mean is 29. Thus,

    $SSTR = 5[(29.8 - 29)^2 + (28.8 - 29)^2 + (28.4 - 29)^2] = 5.2$
    $MSTR = 5.2/(3 - 1) = 2.6$

    Mean Square Due to Blocks

    $SSBL = 3[(30.333 - 29)^2 + \dots + (25.667 - 29)^2] = 51.33$
    $MSBL = 51.33/(5 - 1) = 12.8$

    Mean Square Due to Error

    $SSE = SST - SSTR - SSBL = 62 - 5.2 - 51.33 = 5.47$
    $MSE = 5.47/[(3 - 1)(5 - 1)] = .68$
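
The partition SST = SSTR + SSBL + SSE for this example can be reproduced from the raw miles-per-gallon table. The sketch below is one possible layout (rows are automobiles, i.e. blocks; columns are blends, i.e. treatments); the function name `randomized_block_anova` is hypothetical.

```python
def randomized_block_anova(table):
    """table[i][j] is the response for block i under treatment j."""
    b, k = len(table), len(table[0])
    grand_mean = sum(map(sum, table)) / (b * k)
    treatment_means = [sum(row[j] for row in table) / b for j in range(k)]
    block_means = [sum(row) / k for row in table]

    sst = sum((x - grand_mean) ** 2 for row in table for x in row)
    sstr = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssbl = k * sum((m - grand_mean) ** 2 for m in block_means)
    sse = sst - sstr - ssbl  # error is what remains after treatments and blocks

    mstr = sstr / (k - 1)
    mse = sse / ((k - 1) * (b - 1))
    return {"SST": sst, "SSTR": sstr, "SSBL": ssbl, "SSE": sse,
            "MSTR": mstr, "MSBL": ssbl / (b - 1), "MSE": mse, "F": mstr / mse}


# Eastern Oil data: one row per automobile, columns are Blend X, Y, Z.
eastern_oil = [[31, 30, 30], [30, 29, 29], [29, 29, 28], [33, 31, 29], [26, 25, 26]]
for name, value in randomized_block_anova(eastern_oil).items():
    print(f"{name}: {value:.3f}")
```

Without intermediate rounding the ratio MSTR/MSE comes out near 3.80; the value 3.82 arises when MSE is first rounded to .68. Either way the statistic stays below the critical value 4.46 for 2 and 8 degrees of freedom.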

  • 12/29/02 ANOVA_EXAMPLE 46

    Example: Eastern Oil Co.

    Randomized Block Design

    Rejection Rule

    Assuming $\alpha = .05$, $F_{.05} = 4.46$ (2 d.f. numerator and 8 d.f. denominator). Reject H0 if F > 4.46.

    Test Statistic

    F = MSTR/MSE = 2.6/.68 = 3.82

    Conclusion

    Since 3.82 < 4.46, we cannot reject H0. There is not sufficient evidence to conclude that the miles per gallon ratings differ for the three gasoline blends.

  • 12/29/02 ANOVA_EXAMPLE 47

    Factorial Experiments

    So far we have considered only one factor. If we want to draw conclusions about two or more factors, we use factorial experiments.
    The term factorial is used because the experimental conditions include all possible combinations of the factors.
    If we have two factors A and B with 3 and 2 levels respectively, the design is called a 3 x 2 factorial design.

  • 12/29/02 ANOVA_EXAMPLE 48

    The ANOVA Procedure: Two Factors, a x b Factorial Design

    The ANOVA procedure for the factorial design requires us to partition the sum of squares total (SST) into four groups: sum of squares due to Factor A (a levels), sum of squares due to Factor B (b levels), sum of squares due to the interaction of Factors A and B, and sum of squares due to error. The formula for this partitioning is

    SST = SSA + SSB + SSAB + SSE

    Here SSB denotes the sum of squares for Factor B, not the between-samples sum of squares used earlier. The total degrees of freedom, $n_T - 1$, are partitioned such that a - 1 degrees of freedom go to Factor A, b - 1 go to Factor B, (a - 1)(b - 1) go to the interaction of A and B, and ab(r - 1) go to the error, where r is the number of replications of each treatment combination.
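
The slides give no numerical example for the two-factor case, so the sketch below uses a small made-up 2 x 2 data set with r = 2 replications purely to illustrate the partition. The function name `two_factor_anova`, the data and the dictionary keys are all hypothetical; the sums of squares follow the standard balanced-design formulas, which is an assumption about what the slide intends.

```python
from statistics import mean


def two_factor_anova(cells):
    """cells[i][j] holds the r replicate observations for level i of A and level j of B."""
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean(x for row in cells for cell in row for x in cell)
    a_means = [mean(x for cell in row for x in cell) for row in cells]
    b_means = [mean(x for row in cells for x in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ssa = b * r * sum((m - grand) ** 2 for m in a_means)
    ssb = a * r * sum((m - grand) ** 2 for m in b_means)
    ssab = r * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                   for i in range(a) for j in range(b))
    sse = sum((x - cell_means[i][j]) ** 2
              for i in range(a) for j in range(b) for x in cells[i][j])
    sst = sum((x - grand) ** 2 for row in cells for cell in row for x in cell)

    mse = sse / (a * b * (r - 1))
    return {"SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse, "SST": sst,
            "F_A": ssa / (a - 1) / mse,
            "F_B": ssb / (b - 1) / mse,
            "F_AB": ssab / ((a - 1) * (b - 1)) / mse}


# Made-up observations: 2 levels of A, 2 levels of B, 2 replications per cell.
data = [[[10, 12], [14, 16]],
        [[20, 22], [18, 20]]]
table = two_factor_anova(data)
print(table)
```

For this data the four components add back up to SST, which is a quick check that the partition has been coded consistently.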

  • 12/29/02 ANOVA_EXAMPLE 49

    ANOVA Table for the Two-Factor Factorial Experiment

    Source of      Sum of     Degrees of        Mean
    Variation      Squares    Freedom           Square                           F
    Factor A       SSA        a - 1             MSA = SSA/(a - 1)                MSA/MSE
    Factor B       SSB        b - 1             MSB = SSB/(b - 1)                MSB/MSE
    Interaction    SSAB       (a - 1)(b - 1)    MSAB = SSAB/[(a - 1)(b - 1)]     MSAB/MSE
    Error          SSE        ab(r - 1)         MSE = SSE/[ab(r - 1)]
    Total          SST        $n_T - 1$

  • 12/29/02 ANOVA_EXAMPLE 50

    Interaction between the factors

    Two factors A and B are said to interact if the difference in mean responses for two levels of one factor is not constant across the levels of the second factor.
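
For a 2 x 2 layout this definition reduces to a "difference of differences". The sketch below computes it for two sets of made-up cell means; the function name `interaction_contrast` and the numbers are hypothetical and only meant to show what a constant versus non-constant difference looks like.

```python
def interaction_contrast(cell_means):
    """For a 2 x 2 table of cell means (rows: levels of A, columns: levels of B),
    return (A2 - A1 at B1) minus (A2 - A1 at B2)."""
    diff_at_b1 = cell_means[1][0] - cell_means[0][0]
    diff_at_b2 = cell_means[1][1] - cell_means[0][1]
    return diff_at_b1 - diff_at_b2


# Made-up cell means.
parallel = [[10, 14], [20, 24]]      # A raises the mean by 10 at both levels of B
non_parallel = [[11, 15], [21, 19]]  # A raises the mean by 10 at B1 but only 4 at B2

print(interaction_contrast(parallel))      # 0
print(interaction_contrast(non_parallel))  # 6
```

With sample data the contrast is rarely exactly zero; whether it is large enough to indicate interaction is what the MSAB/MSE ratio in the ANOVA table is used to judge.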
