Analysis of Variance - Τομέας Μαθηματικών / …fouskakis/3-4.ANOVA.pdfOne-way fixed...

Analysis of Variance
Interdepartmental Postgraduate Programme: Occupational and Environmental Health - Management and Economic Assessment
Dimitris Fouskakis


Introduction

So far we have analyzed two sets of observations, either paired data or data from two independent samples. These ideas extend to situations where we have three or more sets of observations.

With several groups of observations it is possible to compare each pair of groups using t tests, but this is not a good approach because it increases the probability of a type I error (finding a significant difference just by chance). Suppose there are 10 groups, so that there are 45 possible pairs of groups to compare. At the 5% level of significance we would expect about 0.05 × 45 ≈ 2 comparisons to be significant by chance alone. A different methodology is therefore needed.
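The arithmetic behind the 10-group example can be reproduced with a short Python sketch. The function name `pairwise_error_summary` is made up for illustration, and the family-wise error rate it returns assumes the tests are independent, which pairwise t tests on shared groups are not, so that figure is only a rough guide.

```python
from math import comb


def pairwise_error_summary(k, alpha=0.05):
    """Return (number of pairs, expected false positives, family-wise error rate).

    The family-wise rate 1 - (1 - alpha)^m ASSUMES the m tests are independent;
    this is an illustrative simplification, not a claim made in the slides.
    """
    m = comb(k, 2)                      # number of distinct pairs of groups
    expected_false_positives = alpha * m
    fwer_if_independent = 1 - (1 - alpha) ** m
    return m, expected_false_positives, fwer_if_independent


pairs, expected, fwer = pairwise_error_summary(10)
print(pairs, expected, round(fwer, 2))
```

Under these assumptions, the chance of at least one spurious "significant" pair among 45 tests is far higher than the nominal 5%, which is the motivation for a single overall test.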

Introduction

Suppose that there are k independent groups (one classifying categorical variable, or factor, with k possible outcomes), with $n_i$ observations in the ith group, and let $y_{ij}$ denote the observed value of a continuous variable Y for the jth individual in the ith group. We want to test the hypothesis that the mean of the random variable Y is the same in all groups. The method we will use is called one-way analysis of variance (one-way ANOVA).

If the groups being compared have been fixed by the design of the study, we perform the one-way fixed effects ANOVA. If instead the groups being compared are a sample from a large number of possible groups, and we are not interested in comparing the specific groups but want to generalize the results to all other possible groups, we perform the one-way random effects ANOVA.

When several measurements are taken on each individual in a single sample, so that the groups are dependent, we perform the so-called repeated measurements ANOVA.

These ideas can be extended to the case of more than one classifying variable, say m. Then we perform the m-way ANOVA.

Analyses using both fixed and random effects are called "mixed models".

One-way fixed effects ANOVA

Let's assume the following model:

$$y_{ij} = \mu + a_i + e_{ij}$$

where $\mu$ is a constant common to every group, $a_i$ is a constant specific to the ith group and $e_{ij}$ is an error term which is normally distributed with mean 0 and variance $\sigma^2$. Thus, a typical observation from the ith group is normally distributed with mean $\mu + a_i$ and variance $\sigma^2$.

Testing the hypothesis of a common mean among all groups is therefore equivalent to testing $H_0$: all $a_i = 0$ vs $H_1$: at least one $a_i \neq 0$.

With the above model we can compare the means of an arbitrary number of groups, each of which follows a normal distribution with the same variance. We can determine whether the variability in the data comes mostly from variability within groups or can truly be attributed to variability between groups.

• Thus the assumptions of the method are that (a) in each group we have a normal distribution with the same variance, (b) the groups are independent and (c) the groups are fixed by the design of the study.

One-way fixed effects ANOVA

Let $\bar{y}_i$ denote the mean of the ith group and $\bar{y}$ the mean of Y over all groups. The deviation of an individual observation from the overall mean can be represented as

$$y_{ij} - \bar{y} = (y_{ij} - \bar{y}_i) + (\bar{y}_i - \bar{y}) \qquad (1)$$

where the first term on the right-hand side is the within group variability and the second term is the between group variability.

If the between group variability is large and the within group variability is small then $H_0$ should be rejected. Conversely, if the between group variability is small and the within group variability is large then $H_0$ should not be rejected.

One-way fixed effects ANOVA

Of course we have to consider these deviations for all observations. If both sides of (1) are squared and then summed over all observations, we end up with the following relationship (the cross-product term can be shown to be zero):

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2 = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 + \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2$$

Total Sum of Squares (TSS) = Within Sum of Squares (WSS) + Between Sum of Squares (BSS)
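The decomposition TSS = WSS + BSS can be checked numerically. The following Python sketch computes the three sums of squares directly from their definitions; the function name `sums_of_squares` and the toy data are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
def sums_of_squares(groups):
    """Return (TSS, WSS, BSS) for a list of groups, each a list of observations."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    tss = sum((y - grand_mean) ** 2 for y in all_values)
    wss = 0.0
    bss = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        wss += sum((y - group_mean) ** 2 for y in g)      # within-group part
        bss += len(g) * (group_mean - grand_mean) ** 2    # between-group part
    return tss, wss, bss


# Hypothetical toy data: three groups of unequal size.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0], [10.0, 12.0]]
tss, wss, bss = sums_of_squares(toy_groups)
print(tss, wss, bss)
```

The identity holds for any grouping of the data, because the cross-product term vanishes within each group.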

One-way fixed effects ANOVA

Short computational form for the BSS and WSS:

$$BSS = \sum_{i=1}^{k} n_i \bar{y}_i^2 - \frac{\left(\sum_{i=1}^{k} n_i \bar{y}_i\right)^2}{n} = \sum_{i=1}^{k} n_i \bar{y}_i^2 - \frac{y_{..}^2}{n}$$

$$WSS = \sum_{i=1}^{k} (n_i - 1) s_i^2$$

where $y_{..}$ is the sum of the observations across all groups, n is the total number of observations and $s_i$ is the standard deviation in the ith group.
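As a rough illustration, the short computational form needs only the group sizes, means and standard deviations. The Python sketch below applies it to the summary statistics of the red cell folate example that appears later in these slides (n = 8, 9, 5; means 316.6, 256.4, 278.0; SDs 58.7, 37.1, 33.8). The function name `bss_wss_from_summary` is an assumption made for illustration.

```python
def bss_wss_from_summary(ns, means, sds):
    """BSS and WSS from group sizes, group means and group standard deviations."""
    n = sum(ns)
    y_dotdot = sum(n_i * m for n_i, m in zip(ns, means))   # sum of all observations
    bss = sum(n_i * m ** 2 for n_i, m in zip(ns, means)) - y_dotdot ** 2 / n
    wss = sum((n_i - 1) * s ** 2 for n_i, s in zip(ns, sds))
    return bss, wss


# Summary statistics taken from the folate example table in these slides.
bss_summary, wss_summary = bss_wss_from_summary(
    [8, 9, 5], [316.6, 256.4, 278.0], [58.7, 37.1, 33.8]
)
print(round(bss_summary, 1), round(wss_summary, 1))
```

Because the means and SDs in the table are rounded to one decimal place, the results differ slightly from the sums of squares 15515.88 and 39716.09 reported in the ANOVA table, which come from the raw data.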

One-way fixed effects ANOVA

Between Mean Square (BMS) = BSS/(k-1)
Within Mean Square (WMS) = WSS/(n-k)

The significance test is based on the ratio F = BMS/WMS. Under $H_0$ this ratio follows the F distribution with k-1 and n-k df. Therefore, if $\alpha$ is the significance level of the test, we reject $H_0$ if $F > F_{k-1,n-k,\alpha}$. The p-value of the test is given by the area to the right of F under the $F_{k-1,n-k}$ distribution, i.e. it is $P(F_{k-1,n-k} > F)$.

One-way fixed effects ANOVA

Example: Twenty-two patients undergoing cardiac bypass surgery were randomised to one of three ventilation groups:

Group I: Patients received a 50% nitrous oxide and 50% oxygen mixture continuously for 24 hours;
Group II: Patients received a 50% nitrous oxide and 50% oxygen mixture only during the operation;
Group III: Patients received no nitrous oxide but received 35-50% oxygen for 24 hours.

Do the three groups have the same red cell folate levels (μg/l)?

One-way fixed effects ANOVA

Red cell folate levels (μg/l) by ventilation group:

Subject   Group I (n=8)   Group II (n=9)   Group III (n=5)
1         243             206              241
2         251             210              258
3         275             226              270
4         291             249              293
5         347             255              328
6         354             273
7         380             285
8         392             295
9                         309
Mean      316.6           256.4            278.0
SD        58.7            37.1             33.8

Source of variation   Degrees of freedom   Sums of squares   Mean squares   F      P
Between groups        2                    15515.88          7757.9         3.71   0.04
Within groups         19                   39716.09          2090.3
Total                 21                   55231.97

Reject H0 (P = 0.04).

The data in each group look like plausible samples from a Normal distribution. The standard deviation in group I is rather higher than those in the other groups, but moderate variability is not a problem, especially when the samples are small. Bartlett's test is an extension of the F test for assessing the null hypothesis that more than two samples come from populations with the same variance. Some statistical packages incorporate this test.
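The ANOVA table for the folate data can be reproduced from the raw observations. The sketch below is plain Python with hypothetical function names. For the p-value it uses the closed form $P(F_{2,m} > f) = (1 + 2f/m)^{-m/2}$, which is valid only when the numerator has exactly 2 df (as here, with k = 3 groups); for other degrees of freedom a statistical package would be needed.

```python
def one_way_anova(groups):
    """One-way fixed effects ANOVA from raw data; returns a dict of table entries."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    bss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    wss = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    bms = bss / (k - 1)
    wms = wss / (n - k)
    return {"BSS": bss, "WSS": wss, "BMS": bms, "WMS": wms,
            "F": bms / wms, "df": (k - 1, n - k)}


def f_tail_two_numerator_df(f, m):
    """P(F_{2,m} > f). Closed form that holds ONLY for 2 numerator df."""
    return (1 + 2 * f / m) ** (-m / 2)


# Red cell folate levels (μg/l) from the table in the slides.
folate = [
    [243, 251, 275, 291, 347, 354, 380, 392],        # Group I
    [206, 210, 226, 249, 255, 273, 285, 295, 309],   # Group II
    [241, 258, 270, 293, 328],                       # Group III
]
table = one_way_anova(folate)
p_value = f_tail_two_numerator_df(table["F"], table["df"][1])
print(round(table["F"], 2), round(p_value, 2))
```

The sums of squares agree with the table up to small rounding differences in the last digits.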

Confidence Intervals

A 95% C.I. can be constructed for the mean of any group. E.g. for the first group:

$$\bar{y}_1 \pm t_{n-k,0.025} \cdot s_{res}/\sqrt{n_1} = 316.6 \pm 2.093 \cdot \sqrt{2090.3/8} = (282.8, 350.4)$$

where $s_{res}$ is the square root of the WMS.

A 95% C.I. can be constructed for the difference between any two means. E.g. for the first two groups:

$$\bar{y}_1 - \bar{y}_2 \pm t_{n-k,0.025} \cdot s_{res}\sqrt{1/n_1 + 1/n_2} = (316.6 - 256.4) \pm 2.093 \cdot 22.22 = (13.7, 106.7)$$
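Both intervals can be recomputed from the quantities quoted in the slides. In the Python sketch below the critical value 2.093 (the upper 2.5% point of the t distribution with 19 df) and the within mean square 2090.3 are taken from the slides rather than computed; the function names are hypothetical.

```python
from math import sqrt

T_CRIT = 2.093   # t_{19, 0.025}, value quoted in the slides (not computed here)
WMS = 2090.3     # within mean square from the ANOVA table


def ci_mean(mean, n_i, wms=WMS, t_crit=T_CRIT):
    """95% C.I. for one group mean, using the pooled residual SD sqrt(WMS)."""
    half_width = t_crit * sqrt(wms / n_i)
    return mean - half_width, mean + half_width


def ci_difference(mean_1, n_1, mean_2, n_2, wms=WMS, t_crit=T_CRIT):
    """95% C.I. for the difference between two group means."""
    half_width = t_crit * sqrt(wms) * sqrt(1 / n_1 + 1 / n_2)
    diff = mean_1 - mean_2
    return diff - half_width, diff + half_width


ci_group_1 = ci_mean(316.6, 8)
ci_diff_1_2 = ci_difference(316.6, 8, 256.4, 9)
print(ci_group_1, ci_diff_1_2)
```

Using the pooled residual SD rather than each group's own SD is what distinguishes these intervals from the usual one-sample and two-sample intervals.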

Multiple Comparisons

With 2 groups the interpretation of a significant difference is straightforward, but how do we interpret significant variation among the means of 3 or more groups? Further analysis is required to find out how the means differ, for example whether one group differs from all the others (especially when the groups are not ordered, as we will describe later). One possibility is to compare each pair of means in turn with a t test, but as we described at the beginning of the section this will increase the probability of a type I error. Several methods have been proposed to deal with this problem, and here we will mention the Bonferroni method. The disadvantage of the method is that it is too conservative. It might happen that although the ANOVA produced a statistically significant result, after the Bonferroni adjustment no pair of means comes out significantly different.

Bonferroni Method

Decide in advance which groups you are particularly interested in comparing (the fewer the better).

Perform a modified t test to compare the pairs of groups of interest, using the Bonferroni method to adjust the P values:

$$t = \frac{\bar{y}_i - \bar{y}_j}{s_{res}\sqrt{1/n_i + 1/n_j}}$$

If we perform b pairwise comparisons, multiply each P-value by b, with the restriction that the result cannot exceed 1.

Bonferroni Method

For the last example let's use the Bonferroni method to compare the means of the first two groups. The modified t test is:

$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_{res}\sqrt{1/n_1 + 1/n_2}} = \frac{316.6 - 256.4}{45.72\sqrt{1/8 + 1/9}} = 2.71$$

P-value = 0.014 (t distribution with n-k = 19 df).

If we compare each pair of groups we make b = 3 comparisons, and thus the above P-value becomes 0.014 × 3 = 0.042, so the two groups are significantly different.
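The calculation above can be sketched in plain Python. Since the standard library has no t distribution, the two-sided P value is obtained here by numerically integrating the t density with Simpson's rule; that numerical approach and all function names are choices made for this illustration, not something taken from the slides.

```python
from math import gamma, pi, sqrt


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_density(0.0, df) + t_density(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3              # P(0 < T < |t|)
    return max(0.0, 1 - 2 * area)


def bonferroni(p, b):
    """Bonferroni-adjusted P value for b comparisons, capped at 1."""
    return min(1.0, p * b)


s_res = sqrt(2090.3)                                     # sqrt of WMS
t_stat = (316.6 - 256.4) / (s_res * sqrt(1 / 8 + 1 / 9))
p_raw = t_two_sided_p(t_stat, 19)
p_adj = bonferroni(p_raw, 3)
print(round(t_stat, 2), round(p_raw, 3), round(p_adj, 3))
```

The cap at 1 matters only for large unadjusted P values, for example `bonferroni(0.5, 3)` returns 1.0.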

Ordered Groups

When the groups are ordered it is not reasonable to compare each pair of groups; rather, we should study whether there is a linear trend across the groups.

The idea is to partition the BSS into a component due to a linear trend and the remaining (non-linear) component. We give scores $l_i$ to the groups, where the values of the $l_i$ are equally spaced and sum to 0. Then we calculate the sum of squares due to L:

$$SS(L) = \frac{L^2}{\sum_i l_i^2/n_i}, \quad \text{where } L = \sum_i l_i \bar{y}_i$$

Ordered Groups

We give the groups scores $l_i$ which are equally spaced and add to 0, such as -5, -3, -1, 1, 3, 5. Then L = 652 and SS(L) = 652²/7.70 = 55147, with one (1) df, and thus MS(L) = 55147/1.

Then F = 55147/3318 = 16.6, which is highly significant when we compare it with $F_{1,265}$ (265 = n-k). In general we reject the null hypothesis (no linear trend) if $F > F_{1,n-k,\alpha}$. The P-value is the area under $F_{1,n-k}$ to the right of F.

Therefore the mean serum trypsin level rises with age. However, the non-linear variation between the age groups is also highly significant, indicating that the linear trend explains only some of the age effect. Fitting a linear trend in one-way ANOVA is equivalent to linear regression, which will be described later.
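The linear trend calculation, SS(L) = L² / Σ(l_i²/n_i) with L = Σ l_i ȳ_i, can be sketched as below. The serum trypsin data themselves are not reproduced in this transcript, so the group means, sizes and within mean square in the example are hypothetical; only the final ratio 55147/3318 is taken from the slides. Function names are assumptions.

```python
def equally_spaced_scores(k):
    """Equally spaced scores that sum to 0, e.g. -5, -3, -1, 1, 3, 5 for k = 6."""
    return [2 * i - (k - 1) for i in range(k)]


def linear_trend(means, ns, scores, wms):
    """Return (L, SS(L), F) for a linear trend across ordered groups.

    SS(L) has 1 df, so MS(L) = SS(L) and F = SS(L) / WMS.
    """
    L = sum(l * m for l, m in zip(scores, means))
    denom = sum(l * l / n for l, n in zip(scores, ns))
    ss_l = L * L / denom
    return L, ss_l, ss_l / wms


# Hypothetical data: three ordered groups whose means lie exactly on a line.
hyp_means, hyp_ns, hyp_wms = [10.0, 12.0, 14.0], [5, 5, 5], 8.0
L, ss_l, f_trend = linear_trend(hyp_means, hyp_ns, equally_spaced_scores(3), hyp_wms)
print(L, ss_l, f_trend)

# F ratio from the figures quoted in the slides.
f_slides = 55147 / 3318
print(round(f_slides, 1))
```

In the hypothetical data the means are perfectly linear, so SS(L) equals the whole BSS (5 × (4 + 0 + 4) = 40) and nothing is left for the non-linear component.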

Non-Parametric ANOVA: the Kruskal-Wallis test

Rank the complete set of n observations regardless of which group they are in, and for each group calculate the sum of the ranks ($R_i$). We calculate the statistic

$$H = \frac{12}{n(n+1)} \sum_i \frac{R_i^2}{n_i} - 3(n+1)$$

and compare it with the $\chi^2$ distribution with k-1 df. If $H > \chi^2_{k-1,\alpha}$ we reject $H_0$. The p-value is the area under $\chi^2_{k-1}$ to the right of H.

Non-Parametric ANOVA: the Kruskal-Wallis test

Fentress et al (1986) reported the results of a randomized comparison of 3 groups of 6 children suffering from frequent and severe migraine. The active treatments given were relaxation response, either with or without biofeedback, and a third group was not treated. The frequency and duration of headaches were recorded before and after the study period, and the difference between these measurements was used as a measure of weekly headache activity. The next table shows the reduction in headache activity for each child, expressed as a percentage. Negative values indicate an increase in headache activity. Three children had a complete absence of headaches at the end of the study period and thus a reduction of 100%. These observations are clearly unsuited for ANOVA, but we can apply the Kruskal-Wallis test.

Non-Parametric ANOVA: the Kruskal-Wallis test

$$H = \frac{12}{n(n+1)} \sum_i \frac{R_i^2}{n_i} - 3(n+1) = 5.69$$

$\chi^2$ distribution with 2 df. P-value = 0.058. Do not reject $H_0$.
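A minimal Python sketch of the H statistic follows. The migraine data table is not reproduced in this transcript, so the example data are hypothetical; the code applies the formula exactly as given in the slides, with average ranks for ties and no tie correction. For k = 3 groups the reference distribution has 2 df, for which the tail area has the closed form $P(\chi^2_2 > h) = e^{-h/2}$; that shortcut does not apply to other degrees of freedom. Function names are assumptions.

```python
from math import exp


def average_ranks(values):
    """Rank values 1..n, giving tied values the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1          # average of ranks pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = avg
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (n(n+1)) * sum(R_i^2 / n_i) - 3(n+1), without tie correction."""
    pooled = [y for g in groups for y in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        r_i = sum(ranks[start:start + len(g)])   # rank sum of this group
        total += r_i ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


def chi2_tail_2df(h):
    """P(chi-square with 2 df > h). Closed form valid ONLY for 2 df."""
    return exp(-h / 2)


# Hypothetical data: three completely separated groups.
h_toy = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# P value for the H = 5.69 reported in the slides.
p_slides = chi2_tail_2df(5.69)
print(round(h_toy, 2), round(p_slides, 3))
```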

Repeated Measures ANOVA

Repeated measures ANOVA is used when all members of a random sample are measured under a number of different conditions. As the sample is exposed to each condition in turn, the measurement of the dependent variable is repeated. Using a standard ANOVA in this case is not appropriate because it fails to model the correlation between the repeated measures: the data violate the ANOVA assumption of independence. We can simply say that repeated measures ANOVA is an extension of the paired t test.

The next table shows the heart rate of nine patients with congestive heart failure before and shortly after administration of enalaprilat, an angiotensin-converting enzyme inhibitor. Measurements were taken before and at 30, 60 and 120 minutes after drug administration. We can divide the total variability into components due to variation between times and between subjects, and there is some remaining variation which we refer to as residual variation.

Repeated Measures ANOVA

Source       Partial SS    df    MS            F        Prob > F
-----------------------------------------------------------------
Model        9117.52778    11    828.866162    67.03    0.0000
subject      8966.55556     8    1120.81944    90.64    0.0000
times        150.972222     3    50.3240741     4.07    0.0180

Residual     296.777778    24    12.3657407
-----------------------------------------------------------------
Total        9414.30556    35    268.980159

Reject H0 that there is no change in heart rate over the 2 hours.

Repeated Measures ANOVA

The means (SD) for each time point indicate that heart rate fell by an average of 4 beats per minute (bpm) after 30 minutes and remained fairly stable over the next 90 minutes. The average pattern is not obvious from examination of the raw data.

Repeated Measures ANOVA

We could compare each pair of times with a paired t test and adjust the p values with the Bonferroni correction, or we could look for a linear trend as we did earlier in this chapter. For these tests it is essential to use the correct variance, which is the residual mean square.

By fitting the model we have assumed that the true response pattern of heart rate over time is the same for each subject, and that the differences between subjects are the same at each time. Based on these assumptions we can predict the value in each cell by adding the relevant row and column means and subtracting the overall mean, which is 93.14 bpm.

Repeated Measures ANOVA

Then we can calculate the residuals, which are the differences between the observed and the predicted data.
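The prediction rule (row mean + column mean - overall mean) and the resulting residuals can be sketched as follows. The heart rate table is not reproduced in this transcript, so the small subject-by-time table below is hypothetical, as are the function names.

```python
def additive_fit(table):
    """table[i][j]: measurement for subject i at time j.

    Returns (predicted, residuals), where
    predicted = row mean + column mean - overall mean.
    """
    n_rows, n_cols = len(table), len(table[0])
    overall = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    predicted = [[row_means[i] + col_means[j] - overall for j in range(n_cols)]
                 for i in range(n_rows)]
    residuals = [[table[i][j] - predicted[i][j] for j in range(n_cols)]
                 for i in range(n_rows)]
    return predicted, residuals


def residual_mean_square(residuals):
    """Residual SS divided by (rows - 1)(columns - 1) degrees of freedom."""
    n_rows, n_cols = len(residuals), len(residuals[0])
    ss = sum(e * e for row in residuals for e in row)
    return ss / ((n_rows - 1) * (n_cols - 1))


# Hypothetical heart rates (bpm): 3 subjects (rows) at 3 time points (columns).
hyp_table = [[90.0, 86.0, 85.0],
             [100.0, 97.0, 94.0],
             [80.0, 78.0, 79.0]]
predicted, residuals = additive_fit(hyp_table)
rms = residual_mean_square(residuals)
print(residuals, rms)
```

By construction the residuals sum to zero across every row and every column, which is why the residual df is (rows - 1)(columns - 1); with 9 subjects and 4 times this gives the 24 df shown in the ANOVA table of the slides.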

Repeated Measures ANOVA

There is no requirement for the data to be Normally distributed. The residuals are expected to have a Normal distribution, and it is important to check this assumption after fitting the model.

In this example we handled the problem of repeated observations by performing a two-way ANOVA (subject was categorical variable 1 and time was categorical variable 2). Therefore everything we said (and will say) in this subsection applies in the general case of more than one categorical variable. The only difference is that when we run a repeated measures ANOVA we still want, as in the one-way ANOVA case, to have equal variances within each group. To test this assumption we perform Mauchly's Test of Sphericity (which assesses whether the variances of the differences between all pairs of conditions are equal), and if this comes out statistically significant we correct the p-value of the time factor using, for example, the lower-bound correction.

Some statistical packages can perform repeated measures ANOVA as a one-way technique and the results are exactly the same.

Two Way ANOVA

Allows the researcher to test the effects of two independent categorical variables; hence, this procedure is called Two-Way ANOVA. Suppose we want to look at students' GPAs (Y variable) based on the type of their major (science, humanities, other) and their class status (freshman, sophomore, junior, senior).

Categorical Variable 1: Major
Categorical Variable 2: Class Status

             Freshman   Sophomore   Junior   Senior
Science      2.8        3.1         3.2      2.7
Humanities   3.3        3.5         3.6      3.1
Other        3.0        3.2         2.9      3.0


Two Way ANOVA

The example given showed only one value per cell. As we know, it is better to use a larger sample, since the power of the test will be better. We can do this, but we must make sure that each cell is represented an equal number of times. For example, a researcher is interested in whether either people's dietary group or their gender is a factor in hypertension (measured by blood pressure). Three people with each possible combination were tested.

Factor 1: Gender (rows)
Factor 2: Dietary Group (columns)

          Pure Vegetarian   Vegetarian + Fish   Non Vegetarian
Male      100, 112, 102     112, 121, 118       122, 123, 131
Female    98, 101, 102      103, 109, 111       124, 101, 117


Two Way ANOVA

In the last example, where we have more than one observation per cell, we could examine the possible existence of a significant interaction between the two factors. An interaction means that the effect of one factor is not the same at all levels of the other, e.g. the effect of dietary group on hypertension is different between males and females.
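To make the idea of an interaction term concrete, the following sketch computes the sums of squares and F statistics of a balanced two-way ANOVA for the blood pressure data in the gender by dietary group table. It is only an illustration in plain Python: the variable names are ours, and the p-values (which need the F distribution) are deliberately left out.

```python
# Blood pressure readings from the gender x dietary group table:
# data[gender][diet] holds the 3 people tested in that cell.
data = {
    "Male": {
        "Pure Vegetarian": [100, 112, 102],
        "Vegetarian + Fish": [112, 121, 118],
        "Non Vegetarian": [122, 123, 131],
    },
    "Female": {
        "Pure Vegetarian": [98, 101, 102],
        "Vegetarian + Fish": [103, 109, 111],
        "Non Vegetarian": [124, 101, 117],
    },
}


def mean(values):
    return sum(values) / len(values)


genders = list(data)
diets = list(data["Male"])
r = 3  # observations per cell (balanced design)

grand = mean([y for g in genders for d in diets for y in data[g][d]])
gender_means = {g: mean([y for d in diets for y in data[g][d]]) for g in genders}
diet_means = {d: mean([y for g in genders for y in data[g][d]]) for d in diets}
cell_means = {(g, d): mean(data[g][d]) for g in genders for d in diets}

# Main-effect, interaction and error sums of squares.
ss_gender = r * len(diets) * sum((gender_means[g] - grand) ** 2 for g in genders)
ss_diet = r * len(genders) * sum((diet_means[d] - grand) ** 2 for d in diets)
ss_inter = r * sum(
    (cell_means[g, d] - gender_means[g] - diet_means[d] + grand) ** 2
    for g in genders
    for d in diets
)
ss_error = sum(
    (y - cell_means[g, d]) ** 2 for g in genders for d in diets for y in data[g][d]
)

df_gender = len(genders) - 1
df_diet = len(diets) - 1
df_inter = df_gender * df_diet
df_error = len(genders) * len(diets) * (r - 1)

ms_error = ss_error / df_error
f_gender = (ss_gender / df_gender) / ms_error
f_diet = (ss_diet / df_diet) / ms_error
f_inter = (ss_inter / df_inter) / ms_error

print(f"gender:      SS={ss_gender:.2f} df={df_gender} F={f_gender:.2f}")
print(f"diet:        SS={ss_diet:.2f} df={df_diet} F={f_diet:.2f}")
print(f"interaction: SS={ss_inter:.2f} df={df_inter} F={f_inter:.2f}")
print(f"error:       SS={ss_error:.2f} df={df_error}")
```

Each F statistic would then be compared with the F distribution on its own degrees of freedom and the 12 error degrees of freedom. A small interaction F suggests that the effect of dietary group is similar for males and females.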


Non Parametric two way ANOVA - Friedman's test

The assumption that the residuals have a Normal distribution cannot be assessed before fitting the model. Sometimes, however, it can be seen from the raw data that the model will not fit well, especially when the standard deviation varies widely across columns or rows. In that case the non-parametric method we can use is Friedman's test.

The next table shows some data from an experiment to compare the leakage from 4 different types of immersion suit during simulated underwater helicopter escapes. The wide variability of the SDs suggests that a non-parametric two-way ANOVA should be used.


Non Parametric two way ANOVA - Friedman's test


Non Parametric two way ANOVA - Friedman's test

The values are ranked for each subject.


Non Parametric two way ANOVA - Friedman's test

If $R_i$ is the sum of the ranks in the $i$th group and we have $k$ groups and $n$ subjects, then we calculate

$$H = \frac{12}{nk(k+1)} \sum_{i=1}^{k} R_i^2 - 3n(k+1) = 12.45$$

Under $H_0$, $H$ follows approximately a $\chi^2$ distribution with $k-1 = 3$ df. If $H > \chi^2_{k-1,\alpha}$ we reject $H_0$. The p-value is the area of the $\chi^2_{k-1}$ distribution to the right of $H$.

P value < 0.006, so we reject $H_0$.

Perform Wilcoxon matched pair tests with Bonferroni correction to indicate where the differences lie.
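The ranking step and the statistic H can be sketched in a few lines of plain Python. The immersion suit table is not reproduced in this transcript, so the data below are invented purely for illustration, and the function names `rank_row` and `friedman_h` are ours. Ties receive average ranks, but no tie correction is applied to H.

```python
def rank_row(values):
    """Rank the values of one subject (1 = smallest), averaging ranks for ties."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for t in range(pos, end + 1):
            ranks[order[t]] = average_rank
        pos = end + 1
    return ranks


def friedman_h(table):
    """Friedman statistic for a table with one row per subject, one column per group."""
    n = len(table)       # subjects
    k = len(table[0])    # groups
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(rank_row(row)):
            rank_sums[j] += rank
    h = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return h, rank_sums


# Hypothetical measurements: 4 subjects (rows), 3 groups (columns).
example = [
    [1.0, 2.0, 3.0],
    [1.5, 2.5, 3.5],
    [2.0, 1.0, 3.0],
    [1.0, 3.0, 2.0],
]
h, rank_sums = friedman_h(example)
print("rank sums:", rank_sums)
print("H =", h, "on", len(example[0]) - 1, "df")
```

The resulting H would be compared with the chi-square distribution on k-1 degrees of freedom, as described above.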


One-way random effects ANOVA

Blood samples were obtained from 5 postmenopausal women. Each blood sample was split into two equal aliquots which were sent in a blinded fashion to one laboratory for analysis. The goal of the study was to assess how much variation in the analysis was attributable to between person versus within person variation. The next table shows the reproducibility data for plasma estradiol (pg/mL) from one laboratory. Can we estimate the degree of between person and within person variation from the data?


One-way random effects ANOVA

Plasma estradiol (pg/mL)

Subject   Replicate 1   Replicate 2   Absolute value of difference between replicates   Mean value
1         25.5          30.4          4.9                                               27.95
2         11.1          15.0          3.9                                               13.05
3         8.0           8.1           0.1                                               8.05
4         20.7          16.9          3.8                                               18.80
5         5.8           8.4           2.6                                               7.10


One-way random effects ANOVA

From the previous table it appears that the variation between replicates depends to some extent on the mean level. Subject 1, who has the largest absolute difference between replicates, also has the highest mean value. This is quite common with laboratory measures. For this reason we will analyse the data on the ln (natural logarithmic) scale.


One-way random effects ANOVA

ln(plasma estradiol)

Subject   Replicate 1   Replicate 2   Absolute value of difference between replicates   Mean value
1         3.2           3.4           0.2                                               3.3
2         2.4           2.7           0.3                                               2.6
3         2.0           2.1           0.1                                               2.1
4         3.0           2.8           0.2                                               2.9
5         1.7           2.1           0.4                                               1.9


One-way random effects ANOVA

To assess between and within person variability, consider the following model

$$y_{ij} = \mu + a_i + e_{ij}, \quad i = 1,\dots,5, \; j = 1, 2$$

• $y_{ij}$ is the $j$th replicate for ln(plasma estradiol) for the $i$th subject
• $a_i$ is a random variable representing between subject variability, and it is assumed to follow $N(0, \sigma_A^2)$
• $e_{ij}$ is a random variable representing within subject variability; it follows $N(0, \sigma^2)$ and is independent of $a_i$ and of any other $e_{ij}$

The above model is referred to as a random effects one-way ANOVA with a balanced design (the number of replicates per subject is the same).


One-way random effects ANOVA

The mean of the $i$th subject is $\mu + a_i$, where $a_i$ is normally distributed with mean 0 and sd $\sigma_A$. Thus the extent of between subject variation is determined by $\sigma_A^2$. As this quantity increases, the between subject variation increases as well.

The within subject variation is determined by $\sigma^2$. Thus if we have two replicates from the same individual $i$, they will both be normally distributed with mean $\mu + a_i$ and variance $\sigma^2$.


One-way random effects ANOVA

$H_0$: $\sigma_A^2 = 0$ vs $H_1$: $\sigma_A^2 > 0$.

Under $H_0$ there is no between subject variation; all variation seen is attributable to within person variation (or noise).

Under $H_1$ there is a true underlying difference among means for individual subjects.
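As background that the slides do not spell out, these hypotheses are usually tested by comparing the between subject mean square (BMS) with the within subject mean square (WMS). For a balanced design with $k$ subjects and $n$ replicates per subject, the standard results are:

```latex
\begin{aligned}
E(\mathrm{WMS}) &= \sigma^2, \\
E(\mathrm{BMS}) &= \sigma^2 + n\,\sigma_A^2, \\
F = \frac{\mathrm{BMS}}{\mathrm{WMS}} &\sim F_{k-1,\;k(n-1)} \quad \text{under } H_0 .
\end{aligned}
```

In this example $k = 5$ and $n = 2$, so the reference distribution has 4 and 5 degrees of freedom. A large F indicates that BMS exceeds what within subject noise alone would produce, which is evidence for $\sigma_A^2 > 0$.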


One-way random effects ANOVA

Number of obs = 10        R-squared = 0.9466

Source             SS           df    MS           F        Prob > F
Between subject    2.6577464    4     .66443661    22.15    0.0022
Within subject     .15001217    5     .03000243
Total              2.8077586    9     .31197318

Intraclass correlation    Asy. S.E.    [95% Conf. Interval]
0.91359                   0.07843      0.49970    0.99040

Estimated SD of subject effect        .5632203
Estimated SD within subject           .1732121
Est. reliability of a subject mean    0.95485 (evaluated at n=2.00)

There are significant differences among the underlying mean ln(plasma estradiol) values for different subjects. The within subject MS estimates $\sigma^2$.


One-way random effects ANOVA

$$\hat{\sigma}_A^2 = \max\left(\frac{BMS - WMS}{2},\, 0\right) = \frac{0.6644 - 0.030}{2} = 0.317$$

Thus the between person variance is about 10 times as large as the within person variance, which indicates good reproducibility.

Another parameter that is often of interest in reproducibility studies is the coefficient of variation (CV). Values < 20% are desirable.

1) Apply the ln transformation
2) Estimate WMS (the within subject mean square) from the ANOVA table
3) $CV = 100\% \times \sqrt{WMS} = 100\% \times \sqrt{0.0300} = 17.3\%$
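The calculations above can be retraced from the raw replicate data. The sketch below, in plain Python with variable names of our own choosing, applies the ln transformation to the plasma estradiol table, forms the between and within subject mean squares, and derives the variance component and the CV. It assumes the balanced design described earlier (2 replicates for each of 5 subjects).

```python
import math

# Plasma estradiol replicates (pg/mL) for the 5 subjects, taken from the data table.
replicates = [(25.5, 30.4), (11.1, 15.0), (8.0, 8.1), (20.7, 16.9), (5.8, 8.4)]

# Step 1: work on the natural log scale.
log_data = [[math.log(x) for x in pair] for pair in replicates]
k = len(log_data)       # number of subjects
n = len(log_data[0])    # replicates per subject

subject_means = [sum(row) / n for row in log_data]
grand_mean = sum(subject_means) / k  # valid because the design is balanced

# Between and within subject sums of squares and mean squares.
bss = n * sum((m - grand_mean) ** 2 for m in subject_means)
wss = sum((y - m) ** 2 for row, m in zip(log_data, subject_means) for y in row)
bms = bss / (k - 1)
wms = wss / (k * (n - 1))

# Variance component for subjects (truncated at 0), F statistic and CV.
sigma_a2 = max((bms - wms) / n, 0.0)
f_stat = bms / wms
cv_percent = 100.0 * math.sqrt(wms)

print(f"BMS = {bms:.4f}, WMS = {wms:.4f}, F = {f_stat:.2f}")
print(f"between subject variance = {sigma_a2:.3f}")
print(f"CV = {cv_percent:.1f}%")
```

The square roots of `sigma_a2` and `wms` correspond to the "Estimated SD of subject effect" and "Estimated SD within subject" lines of the ANOVA output.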


One-way random effects ANOVA

Finally, we could measure a special type of correlation between the replicated values of the same subject. This is called the intraclass correlation coefficient and is denoted by $\rho_1$. Values < 0.4 indicate poor reproducibility, values between 0.4 and 0.75 indicate fair to good reproducibility, and values above 0.75 indicate excellent reproducibility. Another interpretation of $\rho_1$ is based on reliability rather than reproducibility. From the ANOVA output we can see that $\rho_1$ is very large, and thus there is excellent reproducibility in our data.
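As an illustration, the intraclass correlation can be recovered from the two mean squares in the ANOVA output. The sketch assumes the usual definition for the one-way random effects model, namely the between subject variance divided by the total (between plus within) variance; the function names are hypothetical, and the cut-offs are the ones quoted above.

```python
def intraclass_correlation(bms, wms, n):
    """Between subject variance as a share of total variance (n replicates per subject)."""
    sigma_a2 = max((bms - wms) / n, 0.0)  # estimated between subject variance
    return sigma_a2 / (sigma_a2 + wms)    # wms estimates the within subject variance


def reproducibility_label(icc):
    """Classify an intraclass correlation using the thresholds given in the text."""
    if icc < 0.4:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"


# Mean squares from the ANOVA output; 2 replicates per subject.
icc = intraclass_correlation(bms=0.66443661, wms=0.03000243, n=2)
print(f"intraclass correlation = {icc:.5f} ({reproducibility_label(icc)})")
```

The value obtained this way agrees with the intraclass correlation reported in the ANOVA output (0.91359).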


One-way random effects ANOVA

In this subsection we have examined the one-way random effects ANOVA. In this model we are not interested in comparing mean levels of our outcome variable (estradiol) among specific levels of a categorical variable, as we would be in a fixed effects ANOVA. Thus in our example we are not interested in comparing mean estradiol levels between different women. It is a foregone conclusion that different women have different estradiol levels. Instead, the women in the study are considered a random sample of all the women who could have participated in the study, and what is of interest is estimating what proportion of the total variability of estradiol is attributable to between person versus within person variation.