Mixed Effects Models
Rebecca Atkins and Rachel Smith
March 30, 2015
Up to now…
• General Linear Model (lm)
– $y_i = \alpha + \beta_1 x_{1i} + \dots + \beta_p x_{pi} + \varepsilon_i$ with $\varepsilon_i \sim N(0, \sigma^2)$
• Generalized Linear Model (glm)
– Non-normal error distributions for the response variable; link function
• Generalized Additive Model (gam)
– Identifies smoothed lines of best fit for non-linear relationships
• Generalized Least Squares (gls)
– Altered variance structures of the Normal distribution
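As a rough sketch of how these four model types are called in R, the block below fits each one to `mtcars`, a data set bundled with R. The data set, the chosen variables, and the `varIdent` variance structure are stand-ins picked for illustration only.

```r
library(nlme)  # provides gls() and the variance-structure functions
library(mgcv)  # provides gam()

# Stand-in data: mtcars ships with R; cyl is turned into a factor for grouping
cars <- transform(mtcars, cyl = factor(cyl))

# General linear model: Normal errors with constant variance
fit_lm <- lm(mpg ~ wt, data = cars)

# Generalized linear model: binary response with a logit link
fit_glm <- glm(am ~ wt, family = binomial(link = "logit"), data = cars)

# Generalized additive model: a smooth function of wt instead of a straight line
fit_gam <- gam(mpg ~ s(wt), data = cars)

# Generalized least squares: a separate residual variance for each cyl group
fit_gls <- gls(mpg ~ wt, weights = varIdent(form = ~ 1 | cyl), data = cars)
```

The fixed part of the formula is written the same way in all four calls; what changes is the error distribution (`family`), the shape of the relationship (`s()`), or the variance structure (`weights`).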
Assumptions of these models
• Residuals are normally distributed
– Histogram or Q-Q plot
• Residuals are homogeneous, or "homoscedastic" (constant variance)
– Plot residuals
• No collinearity between independent variables
– Pairs plot in R
• The model is not biased by unduly influential observations
– Cook's distance and leverage
• Independent observations
– No autocorrelation between observations
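The checks listed above can be sketched in base R as follows. The model and the `mtcars` data are placeholders chosen for illustration; the same calls apply to any `lm` fit.

```r
# Placeholder model on a data set bundled with R
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- resid(fit)

# Normality of residuals: histogram and Q-Q plot
hist(res, main = "Residuals")
qqnorm(res)
qqline(res)

# Constant variance: residuals against fitted values should show no pattern
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Collinearity between predictors: pairs plot
pairs(mtcars[, c("wt", "hp")])

# Influential observations: Cook's distance and leverage (hat values)
cook <- cooks.distance(fit)
lev <- hatvalues(fit)
plot(cook, type = "h", ylab = "Cook's distance")

# Independence: autocorrelation of residuals
# (only meaningful when the rows are ordered in time or space)
acf(res)
```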
But what about…
• Nested data?
• Blocking?
• Repeated measures?
• Split-plot designs?
• Spatial or temporal autocorrelation?
Use of mixed models
• “Mixed effects models or multilevel models are used when the data have a hierarchical form…which can have both fixed and random coefficients together with multiple error terms.”¹
¹ Zuur et al. 2007. Analysing Ecological Data, p. 127
Model Structure
[Diagram from Rob Thomas. Labels: GLS; LM, GAM, GLM; LMM, GLMM, GAMM]
Parameter estimation
• ML = Maximum Likelihood
– Common with GLM
• REML = Restricted Maximum Likelihood
– Corrects ML estimation for the number of fixed covariates
– Less influenced by outliers than ML estimates
– Common with LMM
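For the simplest case, an ordinary linear regression with $n$ observations and $p$ fixed-effect coefficients (intercept included), the correction can be written out explicitly. The symbols $n$, $p$ and $\hat{y}_i$ are notation introduced here for illustration.

```latex
\hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2,
\qquad
\hat{\sigma}^2_{\mathrm{REML}} = \frac{1}{n-p}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2
```

ML divides the residual sum of squares by $n$ and so tends to underestimate the variance; REML divides by $n-p$, accounting for the degrees of freedom used up by the fixed covariates.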
R packages
• library(nlme) = Linear and Nonlinear Mixed Effects
– lme = Linear Mixed Effects
– gls = Generalised Least Squares
– model <- lme(y ~ fixed, random = ~ 1 | random, data = data)
• library(lme4) = Linear Mixed Effects v.4
– lmer = Linear Mixed Effects (fitted by REML by default)
– model <- lmer(y ~ fixed + (1 | random), data = data)
– lmer() fits Gaussian models only; passing family to lmer() is deprecated, so use glmer() for other error distributions
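To make the difference in syntax concrete, here is a small sketch that fits the same random-intercept model with both packages. It uses `Orthodont`, a data set bundled with nlme, as a stand-in; it assumes lme4 is installed.

```r
library(nlme)

# Orthodont (bundled with nlme): distance measured at several ages per Subject
# nlme syntax: the random part goes in a separate 'random' argument
m_lme <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)

# lme4 syntax: the random part is written inside the model formula
m_lmer <- lme4::lmer(distance ~ age + (1 | Subject), data = Orthodont)

# Both functions default to REML, so the fixed-effect estimates should agree closely
fixef(m_lme)
lme4::fixef(m_lmer)
```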
Nested Design Example
• Does the NAP-richness relationship differ among these 9 beaches?
– NAP = tidal height, predictor variable
– Species richness = response variable
From Zuur 2009
Mixed Effects Model Structure
• Response: richness values for beach i, i = 1,…,9
• Fixed term: richness-NAP fixed effect across all beaches
• Random term: richness-NAP random effect for each beach
From Zuur 2009
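These three pieces are commonly written in matrix form as below. The notation is an assumption introduced here (the usual linear mixed model layout), not a transcription of the slide: $\mathbf{R}_i$ holds the richness values for beach $i$, $\mathbf{X}_i\boldsymbol{\beta}$ is the fixed term shared by all beaches, and $\mathbf{Z}_i\mathbf{b}_i$ is the random term specific to beach $i$.

```latex
\mathbf{R}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{Z}_i\mathbf{b}_i + \boldsymbol{\varepsilon}_i,
\qquad i = 1,\dots,9
```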
Model 1: Constant slope/intercept
• $y_i = \alpha + \beta x_i + \varepsilon_i$ with $\varepsilon_i \sim N(0, \sigma^2)$
• Assumes that the richness-NAP relationship is the same at all beaches
• model1 <- gls(richness ~ 1 + NAP, method = "REML", data = data)
• A model fitted by REML but containing no random effects at all: basically a REML-fitted linear regression
From Zuur 2007 and Rob Thomas
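A quick way to see that a `gls` fit with no variance structure is just a linear regression is to compare it with `lm`. The sketch below uses the `Orthodont` data bundled with nlme as a stand-in for the beach data, which is not distributed with R.

```r
library(nlme)

# REML-fitted gls with no random effects and no variance structure
m_gls <- gls(distance ~ 1 + age, method = "REML", data = Orthodont)

# Ordinary least-squares regression with the same formula
m_lm <- lm(distance ~ 1 + age, data = Orthodont)

# Intercept and slope from the two fits
coef(m_gls)
coef(m_lm)

# Residual standard deviation from the two fits
m_gls$sigma
summary(m_lm)$sigma
```

The point of fitting it with `gls` rather than `lm` is that the result can then be compared directly with `lme` models fitted by the same method.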
Model 2: varying intercept, same slope
• $y_{ij} = \alpha + \beta x_{ij} + a_j + \varepsilon_{ij}$ where $a_j \sim N(0, \sigma_a^2)$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$
• model2 <- lme(richness ~ NAP, random = ~ 1 | beach, method = "REML", data = data)
• i.e. a model that fits the same slope, but its own intercept, for each level of the random factor (fitted by REML by default)
From Zuur 2007 and Rob Thomas
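The terms in the random-intercept equation map onto the fitted object roughly as shown below. This is a sketch on the `Orthodont` data bundled with nlme (standing in for the beach data), with `Subject` playing the role of `beach` and `age` the role of `NAP`.

```r
library(nlme)

m2 <- lme(distance ~ age, random = ~ 1 | Subject, method = "REML", data = Orthodont)

# alpha and beta: the fixed intercept and slope shared by all groups
fixef(m2)

# a_j: the estimated intercept offset for each group
head(ranef(m2))

# sigma_a^2 and sigma^2: between-group and residual variance
VarCorr(m2)

# alpha + a_j and beta per group: intercepts differ, the slope column is constant
head(coef(m2))
```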
Model 3: varying slope, varying intercept
• $y_{ij} = \alpha + \beta x_{ij} + a_j + b_j x_{ij} + \varepsilon_{ij}$ where $a_j \sim N(0, \sigma_a^2)$, $b_j \sim N(0, \sigma_b^2)$, and $\varepsilon_{ij} \sim N(0, \sigma^2)$
• model3 <- lme(richness ~ NAP, random = ~ NAP | beach, method = "REML", data = data)
• i.e. a model that fits a different slope for each level of the random factor (fitted by REML by default)
From Zuur 2007 and Rob Thomas
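A sketch of the random-slope model on the `Orthodont` data bundled with nlme (a stand-in for the beach data) shows what changes relative to the random-intercept fit: each group now gets its own slope as well as its own intercept.

```r
library(nlme)

m3 <- lme(distance ~ age, random = ~ age | Subject, method = "REML", data = Orthodont)

# alpha + a_j and beta + b_j per group: both columns now vary between groups
head(coef(m3))

# sigma_a^2, sigma_b^2 and sigma^2, plus the estimated correlation between a_j and b_j
VarCorr(m3)
```

Note that with `random = ~ age | Subject`, `lme` also estimates a correlation between the random intercepts and slopes rather than treating them as independent.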
Additional complexity
• Generalized mixed models: glmer() in lme4 and gamm() in mgcv
• GLMM and GAMM; different underlying error distributions
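As a minimal sketch of what these two calls look like, the block below fits a GLMM and a GAMM to data sets bundled with lme4 (`cbpp` and `sleepstudy`). The data sets, formulas, and the basis size `k = 5` are illustrative choices, and the block assumes lme4 is installed.

```r
library(lme4)
library(mgcv)

# GLMM: binomial response with a random intercept per herd
fit_glmm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                  family = binomial, data = cbpp)

# GAMM: smooth effect of Days plus a random intercept per Subject
# (gamm() takes the random part as a named list of one-sided formulas)
fit_gamm <- gamm(Reaction ~ s(Days, k = 5), random = list(Subject = ~ 1),
                 data = sleepstudy)

summary(fit_glmm)
summary(fit_gamm$gam)  # smooth terms; the mixed-model part is in fit_gamm$lme
```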
Mixed Effects Resources
• Mixed Effects Models and Extensions in Ecology with R (2009). Zuur, Ieno, Walker, Saveliev and Smith. Springer
Model selection?
• Check assumptions of the model (e.g., residuals and collinearity)
• Compare competing models (R. Thomas recommends comparing a gls containing no random effects to a linear mixed effects model to assess the importance of the random effect)
• Compare nested models using AIC
• OR: stepwise model refinement
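The gls-versus-lme comparison can be sketched as follows, again using the `Orthodont` data bundled with nlme as a stand-in. One detail worth stating: REML fits are only comparable when the models share the same fixed effects, so the second half of the sketch refits with ML before comparing fixed-effect structures.

```r
library(nlme)

# Importance of the random effect: same fixed part, both fitted by REML
m_gls <- gls(distance ~ age, method = "REML", data = Orthodont)
m_lme <- lme(distance ~ age, random = ~ 1 | Subject, method = "REML", data = Orthodont)
anova(m_gls, m_lme)  # AIC, BIC and likelihood ratio test

# Importance of a fixed effect: same random part, both fitted by ML
m_full <- lme(distance ~ age, random = ~ 1 | Subject, method = "ML", data = Orthodont)
m_null <- lme(distance ~ 1, random = ~ 1 | Subject, method = "ML", data = Orthodont)
anova(m_null, m_full)
```

A lower AIC for the mixed model in the first comparison suggests the random effect is worth keeping.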
The End!
• Presenting examples (how many?)
– How many types of models (nested, repeated measures, split-plot designs, nonlinear, linear)
– Page 102 (online) and 71 (book) of Zuur
– BDRipley 271 (pdf)
– Random slope vs intercept models
• Relate to R code / packages?
– lme4 and nlme
• General/Generalised linear model
• General/Generalised additive model (GAM): identifies smoothed lines of best fit through a dataset. A non-parametric smoothed relationship is chosen to fit a curve.
– Non-Gaussian error distributions can also be chosen, as with GLM
• Generalised least squares (GLS): models the error term to take into account heteroskedasticity (non-homogeneous variance) and/or autocorrelation structures
– Can use the gls function in the nlme package
– Multiple variance structures to pick from
• General/Generalised linear mixed modeling
– nlme; lme4; asreml
– Model fitting
  • "ML" (Maximum likelihood): common with GLM
  • "REML" (Restricted Maximum likelihood): corrects ML estimation for the number of fixed covariates. Less influenced by outliers than ML estimates
– Structure
  • Random intercept (same autocorrelation function for all levels of fixed factors)
  • Random intercept + slope (autocorrelation function varies across different levels of the fixed factor)
  • Random slope and > 1 random effect
  • Random effect only aside from the intercept (useful as a null model to evaluate the importance of fixed effects in a GLMM)
  • Fixed factors only (not really a mixed model, but useful as a null model to evaluate the importance of random effects in a GLMM)
• Generalised additive mixed modeling
Model 5: Random effect, no fixed effect
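• $y_{ij} = \alpha + a_j + \varepsilon_{ij}$ where $a_j \sim N(0, \sigma_a^2)$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$
• model5 <- lme(richness ~ 1, random = ~ 1 | beach, method = "REML", data = data)
• i.e. a model that fits the mean value for each level of the random factor (fitted by REML by default)
[Figure from Rob Thomas: y plotted against x]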
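One common use of an intercept-only mixed model is to see how the total variance splits between groups and residual. The sketch below does this on the `Orthodont` data bundled with nlme (a stand-in for the beach data); the intraclass correlation computed at the end is an addition made here for illustration.

```r
library(nlme)

# Intercept-only model with a random intercept per group
m5 <- lme(distance ~ 1, random = ~ 1 | Subject, method = "REML", data = Orthodont)

# VarCorr() returns a character matrix; the "Variance" column holds
# the between-group variance first and the residual variance second
v <- as.numeric(VarCorr(m5)[, "Variance"])

# Share of the total variance that lies between groups (intraclass correlation)
icc <- v[1] / sum(v)
icc
```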