
Math 141, Lecture 22: Anova, Ancova, and Dummy Variables

Albyn Jones
Library 304, jones@reed.edu
www.people.reed.edu/~jones/courses/141


Reminder: The Linear Model

Y_i = β_0 + β_u U_i + β_w W_i + β_x X_i + ε_i

where the β's are unknown constants, and the U's, W's, and X's are known constants. The linear model includes several special cases better known by other names:

Multiple Regression: all explanatory variables are numeric.
Analysis of Variance: all explanatory variables are categorical.
Analysis of Covariance: some explanatory variables are numeric, and some are categorical.
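As a rough R sketch of the three special cases (the data frame dat and the variables y, x1, x2, and g are hypothetical names; x1 and x2 are numeric and g is a factor):

fit.reg    <- lm(y ~ x1 + x2, data = dat)   # multiple regression: numeric predictors only
fit.anova  <- lm(y ~ g,       data = dat)   # analysis of variance: one categorical predictor
fit.ancova <- lm(y ~ x1 + g,  data = dat)   # analysis of covariance: numeric and categorical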


Coding categorical variables

One special type of explanatory variable is the so-called dummy variable or indicator variable, used to mark membership in a category. Closely related are contrasts. The terminology you will see in R help files is treatment coding for a dummy variable:

Category (A vs B)   Logical variable   Treatment coding   Contrast coding
A                   FALSE              0                  -1
B                   TRUE               1                   1


Coding categorical variables: Example

The Berkeley Longitudinal Study dataset includes a variable called sex, with values Male and Female.

In this case, R will automatically code a dummy variable with values 1 for Male and 0 for Female (based on alphabetical order).
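A minimal sketch of that default coding (the data values here are made up; only the coding rule matters):

> sex <- factor(c("Female", "Male", "Male", "Female"))
> contrasts(sex)    # levels are sorted alphabetically, so Female is the baseline
       Male
Female    0
Male      1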


Example: categorical explanatory variable

> ht2.lm <- lm(ht2 ~ sex, data=Berkeley)
> summary(ht2.lm)

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  87.4656     0.5842 149.718   <2e-16
sexMale       0.9344     0.8725   1.071    0.289

Residual standard error: 3.305 on 56 df

> with(Berkeley, tapply(ht2, sex, mean))
  Female     Male
87.46563 88.40000

The summary() function reports the treatment coding by combining the name of the variable (sex) with the category coded as 1: sexMale. The mean height for females is the intercept; for males it is the sum 87.4656 + 0.9344 = 88.40.
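A small sketch, using the ht2.lm fit above, recovering the two group means directly from the coefficients:

> b <- coef(ht2.lm)
> b["(Intercept)"]                  # mean for Female: 87.4656
> b["(Intercept)"] + b["sexMale"]   # mean for Male:   88.4000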


Compare: two sample t-test

> with(Berkeley, t.test(ht2 ~ sex, var.equal=TRUE))

        Two Sample t-test

data:  ht2 by sex
t = -1.0709, df = 56, p-value = 0.2888
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.6822985  0.8135485
sample estimates:
mean in group Female   mean in group Male
            87.46563             88.40000

Note that the t statistic (up to sign), the degrees of freedom, and the p-value match the sexMale row of the regression summary: with a single two-level factor, the linear model and the pooled-variance t-test are the same analysis.


Anova

Anova is just another linear model! (Assuming it is a fixed effects model. For random effects or a mixed model, don't use least squares!)


Anova Example: Michelson’s 1879 data

[Figure: Michelson's Data, showing Speed by Run (1 to 5); speeds range from roughly 299700 to 300000.]


Example: Anova

Call: lm(Speed ~ Run, data = Michelson)

Coefficients:
             Estimate Std. Error  t value Pr(>|t|)
(Intercept) 299909.0       16.60 18067.73  < 2e-16
Run2           -53.0       23.47    -2.25 0.026251
Run3           -64.0       23.47    -2.72 0.007627
Run4           -88.5       23.47    -3.77 0.000283
Run5           -77.5       23.47    -3.30 0.001356

# Coincidence??
> sqrt(2)*16.6
[1] 23.47595
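It is not: the intercept's standard error is the standard error of a single group mean, σ̂/√n, while each Run coefficient estimates a difference of two group means, whose standard error is σ̂·√(1/n + 1/n) = √2·σ̂/√n. A short sketch, assuming 20 observations per run (which matches the 95 residual degrees of freedom in the aov() table later on):

> sigma.hat <- 16.60 * sqrt(20)        # back out the residual SD from the intercept's SE
> sigma.hat / sqrt(20)                 # 16.60, the SE of one group mean
> sigma.hat * sqrt(1/20 + 1/20)        # 23.48, the SE of a difference of two group means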


Anova fits group means

# compute means by group of test runs
> with(Michelson, tapply(Speed, Run, mean))
       1        2        3        4        5
299909.0 299856.0 299845.0 299820.5 299831.5

# compare to lm output
> coef(Michelson.lm)
(Intercept)        Run2        Run3        Run4        Run5
   299909.0       -53.0       -64.0       -88.5       -77.5

> 299909.0 + c(-53.0, -64.0, -88.5, -77.5)
[1] 299856.0 299845.0 299820.5 299831.5


View the coding

> contrasts(Michelson$Run)
  2 3 4 5
1 0 0 0 0
2 1 0 0 0
3 0 1 0 0
4 0 0 1 0
5 0 0 0 1


Change the Baseline Group

> contrasts(Michelson$Run) <- contr.treatment(5, base=5)
> contrasts(Michelson$Run)
  1 2 3 4
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
5 0 0 0 0


Changing the baseline group

# contrasts(Michelson$Run) <- contr.treatment(5, base=5)
# makes group 5 the baseline

> coef(lm(Speed ~ Run, data=Michelson))
(Intercept)        Run1        Run2        Run3        Run4
   299831.5        77.5        24.5        13.5       -11.0

> with(Michelson, tapply(Speed, Run, mean))
       1        2        3        4        5
299909.0 299856.0 299845.0 299820.5 299831.5
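An alternative sketch: relevel() reorders the factor's levels so that a chosen level becomes the baseline, without touching the contrast matrix directly (Run5base is a hypothetical name):

> Run5base <- relevel(Michelson$Run, ref = "5")   # new factor with level 5 listed first
> coef(lm(Speed ~ Run5base, data = Michelson))    # same fitted means, group 5 as baseline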


The data again...

[Figure: Michelson's Data, Speed by Run (1 to 5), repeated for reference.]


Michelson Data: Code Your Own Dummy Var.

> Run1 <- Michelson$Run == 1      # create a dummy variable for the first run

> coef(lm(Speed ~ Run1, data=Michelson))
(Intercept)    Run1TRUE
  299838.25       70.75

> with(Michelson, tapply(Speed, Run1, mean))
   FALSE     TRUE
299838.2 299909.0

> 299838.25 + 70.75
[1] 299909


Traditional Analysis of Variance presentation

> summary(aov(Speed ~ Run, Michelson))
            Df Sum Sq Mean Sq F value  Pr(>F)
Run          4  94514   23628   4.288 0.00311
Residuals   95 523510    5511
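A quick sketch checking the table above: the F statistic is the ratio of the two mean squares, and the p-value is its upper tail probability under an F distribution with 4 and 95 degrees of freedom.

> Fstat <- 23628 / 5511                              # MS(Run) / MS(Residuals) = 4.288
> pf(Fstat, df1 = 4, df2 = 95, lower.tail = FALSE)   # about 0.0031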

More on this next week...


Analysis of Variance is a Linear Model!

# using lm()
> coef(lm(Speed ~ Run, data=Michelson))
(Intercept)        Run2        Run3        Run4        Run5
   299909.0       -53.0       -64.0       -88.5       -77.5

# using aov(), the Anova version
> coef(aov(Speed ~ Run, data=Michelson))
(Intercept)        Run2        Run3        Run4        Run5
   299909.0       -53.0       -64.0       -88.5       -77.5


Choose other coding

Options: contr.helmert(), contr.sum(), contr.poly(), contr.treatment() (see ?contr.sum). Or create your own contrast matrix:

> contrasts(Michelson$Run) <- matrix(scan(), ncol=4)
1: -4 1 1 1 1
6: 0 -1 1 0 0
11: 0 0 0 -1 1
16: 0 1 1 -1 -1
21:
Read 20 items
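Since scan() reads its numbers interactively, here is a non-interactive sketch that builds the same 5 by 4 matrix (both scan() and matrix() fill column by column):

> contrasts(Michelson$Run) <- matrix(c(-4,  1,  1,  1,  1,
+                                       0, -1,  1,  0,  0,
+                                       0,  0,  0, -1,  1,
+                                       0,  1,  1, -1, -1), ncol = 4)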


The Contrast matrix

> contrasts(Michelson$Run)
  [,1] [,2] [,3] [,4]
1   -4    0    0    0
2    1   -1    0    1
3    1    1    0    1
4    1    0   -1   -1
5    1    0    1   -1

What hypotheses will our t-tests test?


lm() with user specified contrasts

> summary(lm(Speed ~ Run, data=Michelson))

Coefficients:
             Estimate Std. Error  t value Pr(>|t|)
(Intercept) 299852.40      7.423 40393.06  < 2e-16
Run1           -14.15      3.712    -3.81  0.00024
Run2            -5.50     11.737    -0.46  0.64043
Run3             5.50     11.737     0.46  0.64043
Run4            12.25      8.300     1.47  0.14325

Warning! Run1 now means the first contrast, not the first run! The intercept is the overall mean.


Name the Contrasts!

> colnames(contrasts(Michelson$Run)) <- c("2345-1", "2-3", "4-5", "23-45")

> summary(lm(Speed ~ Run, data=Michelson))

Coefficients:
             Estimate Std. Error   t value Pr(>|t|)
(Intercept)  299852.4      7.423 40393.068  < 2e-16
Run2345-1       -14.1      3.712    -3.812 0.000245
Run2-3           -5.5     11.737    -0.469 0.640437
Run4-5            5.5     11.737     0.469 0.640437
Run23-45         12.2      8.300     1.476 0.143256


Ancova

Analysis of Covariance is just another linear model! (Note: assuming it is a fixed effects model. For random effects or a mixed model, don't use least squares!)


Berkeley Longitudinal Study, Again

ht18.lm0 <- lm(ht18 ~ ht2, data=Berkeley)

[Figure: Berkeley Residuals: residuals(ht18.lm0) plotted against fitted(ht18.lm0).]


Berkeley Normal Quantile Plot

[Figure: Normal Q-Q Plot of the residuals: Sample Quantiles against Theoretical Quantiles.]


Berkeley Longitudinal Study, Yet Again

[Figure: residuals(ht18.lm0) plotted against fitted(ht18.lm0), shown again.]


Berkeley Analysis of Covariance

> ht18.lm1 <- lm(ht18 ~ ht2 + sex, data=Berkeley)
> summary(ht18.lm1)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  49.4174    16.4052   3.012  0.00391
ht2           1.3416     0.1873   7.162 2.05e-09
sexMale      12.0192     1.2356   9.727 1.49e-13

Residual standard error: 4.633 on 55 degrees of freedom


Berkeley Longitudinal Study, ANCOVA

[Figure: Berkeley Analysis of Covariance: ht18 plotted against ht2.]


R code for previous plot

> coef(ht18.lm1)
(Intercept)         ht2     sexMale
  49.417437    1.341649   12.019233

> with(Berkeley, plot(ht2, ht18, pch=19,
+      col = ifelse(sex == 'Male', 'blue', 'red')))
> abline(49.417, 1.342, lwd=2, col='red')
> abline(49.417 + 12.019, 1.342, lwd=2, col='blue')
> title('Berkeley Analysis of Covariance')


Berkeley Longitudinal Study, Yet Again

[Figure: Berkeley Residuals: residuals(ht18.lm1) plotted against fitted(ht18.lm1).]


A Dummy Variable Test for Outliers!

> PalmBeach <- rep(0, 67)
> PalmBeach[50] <- 1
> summary(lm(log(Buchanan) ~ log(Bush) + PalmBeach, data=FL))

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -2.3179     0.3548  -6.533 1.22e-08
log(Bush)     0.7297     0.0360  20.270  < 2e-16
PalmBeach     1.7411     0.4308   4.041 0.000145

Residual standard error: 0.4204 on 64 degrees of freedom

Compare the coefficient for PalmBeach to the residual standard error!
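A sketch of what that comparison says, using the numbers from the summary above: the Palm Beach dummy is about four residual standard errors, and because the response is log(Buchanan), exponentiating the coefficient gives the multiplicative size of the anomaly.

> 1.7411 / 0.4204   # about 4.1 residual standard errors
> exp(1.7411)       # Palm Beach's Buchanan count is roughly 5.7 times the model's prediction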


Summary

Linear Models and Dummy Variables

Dummy variables are used to build models with categorical explanatory variables.
Anova (Analysis of Variance) is a linear model with only categorical explanatory variables.
Ancova (Analysis of Covariance) is a linear model with both numerical and categorical explanatory variables.
Dummy variables and contrast variables can be constructed to test special hypotheses.
