Post on 24-Feb-2016
CHAPTER 6 Statistical Inference & Hypothesis Testing
• 6.1 - One Sample: Mean μ, Variance σ², Proportion π
• 6.2 - Two Samples: Means, Variances, Proportions (μ1 vs. μ2, σ1² vs. σ2², π1 vs. π2)
• 6.3 - Multiple Samples: Means, Variances, Proportions (μ1, …, μk; σ1², …, σk²; π1, …, πk)
Example: Y = “$ Cost of a certain medical service”
Assume Y is known to be normally distributed at each of k = 2 health care facilities (“groups”).
Hospital: Y1 ~ N(μ1, σ1)   Clinic: Y2 ~ N(μ2, σ2)
• Null Hypothesis H0: μ1 = μ2, i.e., μ1 – μ2 = 0 (“No difference exists.”) 2-sided test at significance level α = .05
• Data: Sample 1 = {667, 653, 614, 612, 604}; n1 = 5   Sample 2 = {593, 525, 520}; n2 = 3
• Analysis via T-test (if equivariance holds): Point estimates \( \bar{y} = \sum y_i / n \)

“Group Means”
\[ \bar{y}_1 = \frac{667 + 653 + 614 + 612 + 604}{5} = 630, \qquad \bar{y}_2 = \frac{593 + 525 + 520}{3} = 546 \]
\[ \bar{y}_1 - \bar{y}_2 = 84 \]
NOTE: 84 > 0

“Group Variances” (s² = SS/df)
\[ s_1^2 = \frac{SS_1}{df_1} = \frac{(667 - 630)^2 + \cdots + (604 - 630)^2}{5 - 1} = 788.5 \]
\[ s_2^2 = \frac{SS_2}{df_2} = \frac{(593 - 546)^2 + \cdots + (520 - 546)^2}{3 - 1} = 1663 \]
\[ F = \frac{s_2^2}{s_1^2} = \frac{1663}{788.5} = 2.11 < 4 \]

Pooled Variance
The pooled variance is a weighted average of the group variances, using the degrees of freedom as the weights.
\[ s_{\text{pooled}}^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2} = \frac{(5 - 1)(788.5) + (3 - 1)(1663)}{5 + 3 - 2} = \frac{6480}{6} = 1080 \]
Here the numerator is SSErr = SS1 + SS2 = 6480 and the denominator is dfErr = 6.
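Standard Error
\[ \text{s.e.} = \sqrt{s_{\text{pooled}}^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{1080 \left( \frac{1}{5} + \frac{1}{3} \right)} = 24 \]
p-value
\[ p = 2\, P(\bar{Y}_1 - \bar{Y}_2 \ge 84) = 2\, P\!\left( T_6 \ge \frac{84 - 0}{24} \right) = 2\, P(T_6 \ge 3.5) \]
> 2 * (1 - pt(3.5, 6))
[1] 0.01282634
Reject H0 at α = .05: statistically significant, Hospital > Clinic.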
R code:
> y1 = c(667, 653, 614, 612, 604)
> y2 = c(593, 525, 520)
>
> t.test(y1, y2, var.equal = TRUE)

        Two Sample t-test

data:  y1 and y2
t = 3.5, df = 6, p-value = 0.01283
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  25.27412 142.72588
sample estimates:
mean of x mean of y
      630       546
Formal Conclusion
p-value < α = .05. Reject H0 at this level.
Interpretation
The samples provide evidence that the difference between mean costs is (moderately) statistically significant at the 5% level, with the hospital being higher than the clinic (by an average of $84).
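The hand calculation above (pooled variance, standard error, T-score, two-sided p-value) can be retraced step by step in R. This is only a sketch for checking the arithmetic; the variable names (`s2_pooled`, `se`, `t_stat`, `p_value`) are illustrative and do not come from the slides.

```r
# Pooled two-sample t-test computed by hand, then compared with t.test().
y1 <- c(667, 653, 614, 612, 604)   # hospital costs
y2 <- c(593, 525, 520)             # clinic costs
n1 <- length(y1)
n2 <- length(y2)

# Pooled variance: average of the group variances, weighted by their df
s2_pooled <- ((n1 - 1) * var(y1) + (n2 - 1) * var(y2)) / (n1 + n2 - 2)

# Standard error of the difference in means, and the T-score under H0
se     <- sqrt(s2_pooled * (1 / n1 + 1 / n2))
t_stat <- (mean(y1) - mean(y2)) / se

# Two-sided p-value on n1 + n2 - 2 degrees of freedom
p_value <- 2 * pt(abs(t_stat), df = n1 + n2 - 2, lower.tail = FALSE)

# Built-in equal-variance t-test for comparison
builtin <- t.test(y1, y2, var.equal = TRUE)

print(c(s2_pooled = s2_pooled, se = se, t = t_stat, p = p_value))
```

Using `lower.tail = FALSE` rather than `1 - pt(...)` is a small numerical-stability choice; for this example both give the same p-value.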
Analysis of Variance (ANOVA)
Alternate method ~
Main Idea: Among several (k ≥ 2) independent, equivariant, normally-distributed “treatment groups”…
[Diagram: k normal populations with means μ1, μ2, …, μk and corresponding sample means Ȳ1, Ȳ2, …, Ȳk]
Null Hypothesis? H0: μ1 = μ2 = … = μk
HA: “At least one ‘treatment mean’ μi is significantly different from the others.”
“Total Variability” = “Variability between groups” + “Variability within groups”
Example (continued): Analysis via ANOVA F-test (if equivariance holds)
“Group Means”
\[ \bar{y}_1 = \frac{667 + 653 + 614 + 612 + 604}{5} = 630, \qquad \bar{y}_2 = \frac{593 + 525 + 520}{3} = 546, \qquad \bar{y}_1 - \bar{y}_2 = 84 \]
NOTE: 84 > 0
“Grand Mean”
\[ \bar{y} = \frac{667 + 653 + 614 + 612 + 604 + 593 + 525 + 520}{5 + 3} = \frac{5\,(630) + 3\,(546)}{5 + 3} = 598.50 \]
The grand mean is a weighted average of the group means, using the sample sizes as the weights.
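A quick way to confirm the weighted-average view of the grand mean is to compute it both ways in R. This is a minimal sketch; the names `grand_direct` and `grand_weighted` are made up for illustration.

```r
# Grand mean two ways: pooling all observations vs. weighting the group means.
y1 <- c(667, 653, 614, 612, 604)
y2 <- c(593, 525, 520)

grand_direct   <- mean(c(y1, y2))
grand_weighted <- weighted.mean(c(mean(y1), mean(y2)),
                                w = c(length(y1), length(y2)))

print(c(direct = grand_direct, weighted = grand_weighted))
```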
How far is the “total” sample from the grand mean?
\[ SS_{\text{Tot}} = (667 - 598.5)^2 + (653 - 598.5)^2 + (614 - 598.5)^2 + (612 - 598.5)^2 + (604 - 598.5)^2 + (593 - 598.5)^2 + (525 - 598.5)^2 + (520 - 598.5)^2 = 19710 \]
dfTot = (5 + 3) – 1 = 7
How can we measure the “variability between groups”? Imagine zero variability within groups, i.e., replace every observation by its own group mean (“The Clonemaster”):
Sample 1 → {630, 630, 630, 630, 630}   Sample 2 → {546, 546, 546}
\[ SS_{\text{Trt}} = 5\,(630 - 598.5)^2 + 3\,(546 - 598.5)^2 = 13230 \]
dfTrt = (2) – 1 = 1
How far is each sample from its own group mean?
\[ SS_{\text{Err}} = (667 - 630)^2 + (653 - 630)^2 + (614 - 630)^2 + (612 - 630)^2 + (604 - 630)^2 + (593 - 546)^2 + (525 - 546)^2 + (520 - 546)^2 \]
BUT…
RECALL… from the T-test analysis, s² = SS/df, so each group's sum of squares is SS = (df)(s²):
\[ s_1^2 = \frac{SS_1}{5 - 1} = 788.5, \qquad s_2^2 = \frac{SS_2}{3 - 1} = 1663 \]
\[ SS_{\text{Err}} = SS_1 + SS_2 = 4\,(788.5) + 2\,(1663) = 6480 \]
dfErr = (5 + 3) – 2 = 6
SSTot = SSTrt + SSErr (19710 = 13230 + 6480)   dfTot = dfTrt + dfErr (7 = 1 + 6)
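The three sums of squares and the identity SSTot = SSTrt + SSErr can be checked numerically. The following R sketch recomputes each piece directly from its definition; the names `ss_tot`, `ss_trt` and `ss_err` are illustrative only.

```r
# Sum-of-squares decomposition for the two-group cost example.
y1 <- c(667, 653, 614, 612, 604)
y2 <- c(593, 525, 520)
y  <- c(y1, y2)
grand <- mean(y)

# Total: every observation vs. the grand mean
ss_tot <- sum((y - grand)^2)

# Treatment (between groups): each group mean vs. the grand mean,
# weighted by the group's sample size
ss_trt <- length(y1) * (mean(y1) - grand)^2 +
          length(y2) * (mean(y2) - grand)^2

# Error (within groups): every observation vs. its own group mean
ss_err <- sum((y1 - mean(y1))^2) + sum((y2 - mean(y2))^2)

print(c(SSTot = ss_tot, SSTrt = ss_trt, SSErr = ss_err))
```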
ANOVA Table

Source       df    SS       MS = SS/df    F-ratio    p-value
Treatment     1    13230    13230         12.25      ????
Error         6     6480     1080
Total         7    19710        –

MSTrt = s²between   MSErr = s²within   (Note: This is also s²pooled.)
F = MSTrt / MSErr = 13230 / 1080 = 12.25
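Test Statistic
\[ F = \frac{s_1^2}{s_2^2}, \qquad \text{where } s_1^2 = \frac{SS_1}{df_1} \text{ and } s_2^2 = \frac{SS_2}{df_2} \]
\[ H_0: \sigma_1^2 = \sigma_2^2 \qquad H_A: \sigma_1^2 \ne \sigma_2^2 \]
Sampling Distribution = ?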
Sampling Distribution: under H0, F = MSTrt / MSErr follows the F distribution on 1 and 6 degrees of freedom (F1, 6).
[Figure: F1, 6 density curve, with the α = .05 critical value 5.99 and the observed F-ratio 12.25 marked; the p-value is the right-tail area beyond 12.25.]
Since 12.25 > 5.99, p < .05 (on F1, 6).
p-value = 1 – pf(12.25, 1, 6) = .01282634
ANOVA Table

Source       df    SS       MS = SS/df    F-ratio    p-value
Treatment     1    13230    13230         12.25      .01282634
Error         6     6480     1080
Total         7    19710        –

MSTrt = s²between   MSErr = s²within   F = MSTrt / MSErr (on F1, 6)   p-value = 1 – pf(12.25, 1, 6)
SSTot = SSTrt + SSErr   dfTot = dfTrt + dfErr
Thus, the treatment accounts for SSTrt / SSTot = 13230 / 19710 = 67.1% of the total variability in the response Y.
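The remaining entries of the ANOVA table follow mechanically from the sums of squares and degrees of freedom. The R sketch below fills them in from the values on the slides; the variable names are illustrative, and `qf()` is used only to reproduce the 5.99 critical value shown on the F1, 6 curve.

```r
# Fill in the ANOVA table from SS and df (values taken from the slides).
ss_trt <- 13230; df_trt <- 1
ss_err <- 6480;  df_err <- 6

ms_trt  <- ss_trt / df_trt          # s^2 between
ms_err  <- ss_err / df_err          # s^2 within (= pooled variance)
f_ratio <- ms_trt / ms_err

# Right-tail area beyond the observed F-ratio on F(1, 6)
p_value <- pf(f_ratio, df_trt, df_err, lower.tail = FALSE)

# Critical value for a level-.05 test on F(1, 6)
f_crit <- qf(0.95, df_trt, df_err)

# Share of total variability accounted for by the treatment
prop_explained <- ss_trt / (ss_trt + ss_err)

print(c(F = f_ratio, p = p_value, crit = f_crit, prop = prop_explained))
```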
R code:
# ANOVA FOR UNBALANCED DESIGN
> y1 = c(667, 653, 614, 612, 604)
> y2 = c(593, 525, 520)
>
> Data = data.frame(
+   Y = c(y1, y2),
+   X = factor(rep(c("y1", "y2"), times = c(length(y1), length(y2))))
+ )
>
> var.test(Y ~ X, data = Data)   # EQUIVARIANCE?

        F test to compare two variances

data:  Y by X
F = 0.4741, num df = 4, denom df = 2, p-value = 0.4738
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.01208057 5.04920249
sample estimates:
ratio of variances
         0.4741431

> out = aov(Y ~ X, data = Data)
> anova(out)
Analysis of Variance Table

Response: Y
          Df Sum Sq Mean Sq F value  Pr(>F)
X          1  13230   13230   12.25 0.01283 *
Residuals  6   6480    1080
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Note: Vis-à-vis T-test vs. F-test,
• The p-value is the same using either method (.01283), since the sample is unchanged!
• The square of the T-score on df degrees of freedom (3.5) is equal to the F-score on 1 and df degrees of freedom (12.25).
(Recall that the square of the Z-score is equal to the \( \chi^2_1 \)-score.)
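The agreement between the squared T-score and the F-score is not a coincidence of this data set. A short derivation for k = 2 groups, written in the notation of the example (with \( N = n_1 + n_2 \) introduced here as shorthand), shows why:

```latex
% Deviations of the group means from the grand mean
% \bar{y} = (n_1 \bar{y}_1 + n_2 \bar{y}_2) / N, with N = n_1 + n_2:
\[
\bar{y}_1 - \bar{y} = \frac{n_2}{N}\,(\bar{y}_1 - \bar{y}_2), \qquad
\bar{y}_2 - \bar{y} = -\frac{n_1}{N}\,(\bar{y}_1 - \bar{y}_2)
\]
% Treatment sum of squares:
\[
SS_{\text{Trt}} = n_1(\bar{y}_1 - \bar{y})^2 + n_2(\bar{y}_2 - \bar{y})^2
= \frac{n_1 n_2}{n_1 + n_2}\,(\bar{y}_1 - \bar{y}_2)^2
= \frac{(\bar{y}_1 - \bar{y}_2)^2}{\frac{1}{n_1} + \frac{1}{n_2}}
\]
% With df_Trt = 1 and MS_Err = s^2_pooled:
\[
F = \frac{SS_{\text{Trt}} / 1}{MS_{\text{Err}}}
= \frac{(\bar{y}_1 - \bar{y}_2)^2}{s_{\text{pooled}}^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
= T^2
\]
% Numerical check with the example:
\[
SS_{\text{Trt}} = \frac{5 \cdot 3}{8}\,(84)^2 = 13230, \qquad
F = \frac{13230}{1080} = 12.25 = (3.5)^2
\]
```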
Suppose this ANOVA “overall F-test” indicates that a significant difference exists among the treatment means (i.e., at least one differs from the others), at α = .05.
How can we find out which one(s)?
Idea: Test all possible pairwise comparisons, each via a two-sample t-test.
Example: Suppose there are k = 5 treatment groups.
(1, 2) (1, 3) (1, 4) (1, 5) (2, 3) (2, 4) (2, 5) (3, 4) (3, 5) (4, 5), each comparison with its own p-value.
There are \( \binom{5}{2} = 10 \) such comparisons.
PROBLEM???
With each of the 10 comparisons tested at α = .05, the chance of at least one false rejection somewhere among them is much larger than .05: SPURIOUS SIGNIFICANCE!!!
BONFERRONI CORRECTION: Make each comparison at level α* = α / 10 = .05 / 10 = .005.
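To make the size of the problem concrete, the R sketch below counts the comparisons, computes the Bonferroni level, and estimates the familywise error rate with and without the correction. Two assumptions are not from the slides: the error-rate formula treats the 10 tests as independent (pairwise tests sharing groups are not, so the figure is only indicative), and the vector `p_raw` is a set of hypothetical p-values invented for illustration.

```r
# Multiple comparisons among k = 5 groups and the Bonferroni correction.
k     <- 5
m     <- choose(k, 2)      # number of pairwise comparisons
alpha <- 0.05
alpha_star <- alpha / m    # Bonferroni per-comparison level

# Chance of at least one false rejection if all H0 are true,
# ASSUMING independent tests (an idealization)
fwer_unadjusted <- 1 - (1 - alpha)^m
fwer_bonferroni <- 1 - (1 - alpha_star)^m

# Equivalent view: inflate the p-values instead of shrinking alpha.
# HYPOTHETICAL p-values, for illustration only.
p_raw <- c(0.001, 0.004, 0.02, 0.03, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9)
p_adj <- p.adjust(p_raw, method = "bonferroni")

print(c(m = m, alpha_star = alpha_star,
        fwer_unadjusted = fwer_unadjusted,
        fwer_bonferroni = fwer_bonferroni))
print(p_adj)
```

Comparing `p_raw` against α* or comparing `p_adj` against α flags the same comparisons; `p.adjust()` simply multiplies each p-value by the number of tests and caps the result at 1.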
MODEL ASSUMPTIONS?
• Equivariance can be tested via the very similar “two variances” F-test in 6.2.2 (but this is very sensitive to the normality assumption), or via other tests. If it is violated, the Welch Test for two means can be extended to k groups.
• Normality can be tested via the usual methods. If it is violated, use the nonparametric Kruskal-Wallis Test.
• Extensions of ANOVA for data in matched “blocks” designs, repeated measures, multiple factor levels within groups, etc.
• How to identify significant group(s)? Pairwise testing, with correction (e.g., Bonferroni) for spurious significance.
• Example: k = 5 groups result in 10 such tests, so let each α* = α / 10.
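The checks and fallbacks listed above all have counterparts in base R's `stats` package. The sketch below applies them to the two-group cost data purely as an illustration; the slides do not prescribe these particular functions, and with only 8 observations none of the checks has much power.

```r
# Assumption checks and alternatives for one-way ANOVA (base R, stats package).
y1 <- c(667, 653, 614, 612, 604)
y2 <- c(593, 525, 520)
Data <- data.frame(
  Y = c(y1, y2),
  X = factor(rep(c("y1", "y2"), times = c(length(y1), length(y2))))
)

# Normality: Shapiro-Wilk test on the residuals of the one-way model
fit <- aov(Y ~ X, data = Data)
sw  <- shapiro.test(residuals(fit))

# Equivariance in doubt: Welch-type one-way test (no equal-variance assumption)
welch <- oneway.test(Y ~ X, data = Data, var.equal = FALSE)

# Normality in doubt: nonparametric Kruskal-Wallis rank test
kw <- kruskal.test(Y ~ X, data = Data)

# Pairwise comparisons with Bonferroni-adjusted p-values
# (with k = 2 groups there is a single comparison, so nothing is inflated)
pw <- pairwise.t.test(Data$Y, Data$X, p.adjust.method = "bonferroni")

print(sw$p.value)
print(welch$p.value)
print(kw$p.value)
print(pw$p.value)
```

With more than two groups, `pairwise.t.test()` returns a lower-triangular matrix holding one adjusted p-value per pair of groups.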