Bayesian Random Effect Models
Duke University
November 2, 2008
Bayesian Random Effect Models – p.1/19
One-Way AOV Model
In the classical one-way analysis of variance model:
$$Y_{ij} = \mu_j + \epsilon_{ij}, \qquad \epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$$
interest is in the individual means or differences in means for the J particular groups that have been selected.
The $\mu_j$ are referred to as fixed effects. An alternative model is the random effects model that we illustrate next.
Example: Math Achievement
Representative sample of U.S. public schools (160 schools)
Within each school, a random sample of students is selected
$Y_{ij}$ is a standardized measure of math achievement for student i in school j
Other variables are also measured (more later)
Hierarchical Random Effect Model
The student level model:
$$Y_{ij} = \mu_j + \epsilon_{ij}, \qquad \epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$$
where $\sigma^2$ measures how much individual students vary about their school mean $\mu_j$.
Question: How much do US high schools vary in their mean mathematics achievement?
Inference about the broader population of school means:
$$\mu_j \overset{iid}{\sim} N(\mu, \sigma^2_\mu)$$
In this 2nd-level model, the school level means vary about an overall mean $\mu$ with variance $\sigma^2_\mu$.
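To make the two-level model concrete, here is a minimal simulation sketch in Python. It draws school means from the 2nd-level model and then student scores around each school mean. The function name, the equal school sizes, and the parameter values are all hypothetical choices for illustration, not taken from the math achievement data.

```python
import random
import statistics


def simulate_two_level(J, n, mu, sigma2_mu, sigma2, seed=0):
    """Draw mu_j ~ N(mu, sigma2_mu), then Y_ij ~ N(mu_j, sigma2) for n students per school."""
    rng = random.Random(seed)
    sd_mu = sigma2_mu ** 0.5   # between-school standard deviation
    sd = sigma2 ** 0.5         # within-school standard deviation
    school_means = [rng.gauss(mu, sd_mu) for _ in range(J)]
    scores = [[rng.gauss(m, sd) for _ in range(n)] for m in school_means]
    return school_means, scores


# Hypothetical parameter values, chosen only to resemble the scale of the example.
school_means, scores = simulate_two_level(J=160, n=45, mu=12.6, sigma2_mu=8.6, sigma2=39.1)
grand_mean = statistics.fmean(y for school in scores for y in school)
print(len(scores), len(scores[0]), round(grand_mean, 2))
```

The grand mean of the simulated scores should land near the chosen $\mu$, while individual school averages scatter around it with variance close to $\sigma^2_\mu + \sigma^2/n$.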
Random Effects
In the 2-level model, the school-level means are viewed as random effects arising from a normal population.
$$\mu_j \overset{iid}{\sim} N(\mu, \sigma^2_\mu)$$
$\mu$ is the overall population mean, a fixed effect
$\sigma^2$ is the within-group variance (a variance component)
$\sigma^2_\mu$ is the between-group variance
The 2nd level adds 2 parameters ($\mu$ and $\sigma^2_\mu$), giving 3 in total, versus the $J + 1$ ($\mu_1, \ldots, \mu_J$ and $\sigma^2$) in the fixed effects model.
How much US high schools vary in their mean mathematics achievement is captured by the variance component $\sigma^2_\mu$.
Mixed Effects Model
We can write $\mu_j = \mu + s_j$, where each school mean is the overall mean $\mu$ plus a normal random effect $s_j \overset{iid}{\sim} N(0, \sigma^2_\mu)$. Substituting this into the distribution for $Y_{ij}$, we arrive at the combined model:
$$Y_{ij} = \mu + s_j + \epsilon_{ij}$$
with fixed effect $\mu$, school level random effects $s_j$, and individual random effects $\epsilon_{ij}$, leading to what is known as a mixed effects model.
Marginal Model
Because linear combinations of normals are normally distributed, we have the equivalent model:
$$Y_{ij} \sim N(\mu, \sigma^2_\mu + \sigma^2)$$
where
$$\mathrm{Cov}(Y_{ij}, Y_{i'j}) = \sigma^2_\mu \quad \text{for } i \neq i'$$
$$\mathrm{Cov}(Y_{ij}, Y_{i'j'}) = 0 \quad \text{for } j \neq j'$$
This model implies that students within a school are exchangeable and that, given the school effects, student achievements across different schools are independent. (Is this a reasonable assumption?)
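The marginal variance and covariances follow directly from the mixed effects form $Y_{ij} = \mu + s_j + \epsilon_{ij}$. A short derivation, assuming (as the model states) that the $s_j$ and $\epsilon_{ij}$ are mutually independent with variances $\sigma^2_\mu$ and $\sigma^2$:

```latex
\begin{align*}
\mathrm{Var}(Y_{ij}) &= \mathrm{Var}(s_j) + \mathrm{Var}(\epsilon_{ij})
                      = \sigma^2_\mu + \sigma^2 \\
\mathrm{Cov}(Y_{ij}, Y_{i'j}) &= \mathrm{Cov}(s_j + \epsilon_{ij},\; s_j + \epsilon_{i'j})
                      = \mathrm{Var}(s_j) = \sigma^2_\mu \qquad (i \neq i') \\
\mathrm{Cov}(Y_{ij}, Y_{i'j'}) &= \mathrm{Cov}(s_j + \epsilon_{ij},\; s_{j'} + \epsilon_{i'j'})
                      = 0 \qquad (j \neq j')
\end{align*}
```

Students in the same school share the single draw $s_j$, which is the only source of correlation between their scores.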
Intraclass Correlation
The intraclass correlation:
$$\mathrm{Corr}(Y_{ij}, Y_{i'j}) = \frac{\sigma^2_\mu}{\sigma^2_\mu + \sigma^2}$$
provides a measure of the proportion of total variation that is explained by between-group variability. It is
0 when there is no between-group variability, $\sigma^2_\mu = 0$
1 when there is no within-group variability, $\sigma^2 = 0$
Classical Estimation
Method of Moments Estimates: Equate the expected mean squares, E(MS), to the observed mean squares. Two equations with two unknowns; solve. The MOM estimate of $\mu$ is the sample mean.
Find the MLEs of $\mu$, $\sigma^2$, $\sigma^2_\mu$ from the marginal model for $Y_{ij}$ (make sure the estimates are in the parameter space, i.e. estimates of variance components are not negative).
Use REsidual Maximum Likelihood (REML): fit the fixed effects by least squares, then estimate the variance components by ML using the residuals.
Use library nlme in R and function lme to fit linear mixed effect models by Restricted Maximum Likelihood (method = "REML", the default) or ML (method = "ML"):
lme(fixed = formula, data, random = formula, method)
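The method of moments step can be sketched in a few lines of Python for the balanced case (every group has the same size n), where $E(MS_{within}) = \sigma^2$ and $E(MS_{between}) = \sigma^2 + n\sigma^2_\mu$. The function name and the toy data are hypothetical, and truncating a negative between-group estimate at zero is one common convention, assumed here rather than taken from the slides.

```python
import statistics


def mom_variance_components(groups):
    """Method-of-moments estimates (mu, sigma2, sigma2_mu) for balanced groups."""
    J = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    group_means = [statistics.fmean(g) for g in groups]
    grand_mean = statistics.fmean(group_means)
    # Observed mean squares from the one-way AOV table.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (J - 1)
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    ) / (J * (n - 1))
    # Solve E(MS_within) = sigma2 and E(MS_between) = sigma2 + n * sigma2_mu.
    sigma2_hat = ms_within
    sigma2_mu_hat = max((ms_between - ms_within) / n, 0.0)  # keep estimate in parameter space
    return grand_mean, sigma2_hat, sigma2_mu_hat


# Toy data: three groups of three observations.
mu_hat, sigma2_hat, sigma2_mu_hat = mom_variance_components(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
)
print(mu_hat, sigma2_hat, sigma2_mu_hat)
```

For the toy data the mean squares are 27 (between) and 1 (within), so the between-group estimate is (27 - 1)/3.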
Edited Output
> summary(lme(fixed = y ~ 1, random = ~ 1 | school))
Linear mixed-effects model fit by REML

Random effects:
 Formula: ~1 | school
        (Intercept) Residual
StdDev:    2.934966 6.256862

Fixed effects: y ~ 1
             Value Std.Error   DF t-val p-value
(Intercept) 12.637 0.2443936 7025 51.71       0
REML Estimates
$\hat{\mu} = 12.64$
$\hat{\sigma}_\mu = 2.93$ or $\hat{\sigma}^2_\mu = 8.61$
$\hat{\sigma} = 6.26$ or $\hat{\sigma}^2 = 39.15$
$\hat{\rho} = 8.61/(8.61 + 39.15) = 0.18$
About 18% of the variation in math achievement scores can be attributed to differences among schools. The remaining variation is due to variation among students within schools.
No estimates of individual school means, however.
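As a quick check of the arithmetic, the intraclass correlation can be recomputed from the two standard deviations reported in the lme output. This is a small sketch; the function name is hypothetical.

```python
def intraclass_correlation(sigma2_mu, sigma2):
    """Share of total variance that is due to between-group variability."""
    return sigma2_mu / (sigma2_mu + sigma2)


# Standard deviations copied from the "Random effects" block of the lme output.
sd_school, sd_residual = 2.934966, 6.256862
rho_hat = intraclass_correlation(sd_school ** 2, sd_residual ** 2)
print(round(rho_hat, 2))  # 0.18
```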
Bayesian Model
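Unknown parameters of interest: $\mu_j$, $\mu$, $\sigma^2$, $\sigma^2_\mu$
Distribution for $\mu_j$ is given by the 2nd level model specification
Specify joint prior distribution for remaining unknowns $\mu$, $\sigma^2$, $\sigma^2_\mu$
A default prior $p(\mu, \sigma^2, \sigma_\mu) \propto 1/\sigma^2$
Obtain a joint posterior distribution over all unknowns
$$p(\mu_1, \ldots, \mu_J, \mu, \sigma^2, \sigma_\mu \mid Y) \propto p(Y \mid \mu_1, \ldots, \mu_J, \sigma^2) \prod_{j=1}^{J} p(\mu_j \mid \mu, \sigma^2_\mu)\, p(\mu, \sigma^2, \sigma_\mu)$$
(replace variance components with precisions $\phi = 1/\sigma^2$ and $\phi_\mu = 1/\sigma^2_\mu$)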
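Hierarchical Model
$$p(Y \mid \mu_1, \ldots, \mu_J, \phi) \propto \prod_{j} \prod_{i} \phi^{1/2} \exp\left\{ -\frac{1}{2} \phi\, (Y_{ij} - \mu_j)^2 \right\}$$
$$p(\mu_j \mid \mu, \phi_\mu) \propto \phi_\mu^{1/2} \exp\left\{ -\frac{1}{2} \phi_\mu\, (\mu_j - \mu)^2 \right\}$$
$$p(\mu) \propto 1$$
$$p(\phi) \propto 1/\phi$$
$p(\phi_\mu) \propto$ ? (see HW 6)
$$p(\mu_1, \ldots, \mu_J, \mu, \sigma^2, \sigma_\mu \mid Y) \propto p(Y \mid \mu_1, \ldots, \mu_J, \sigma^2) \prod_{j=1}^{J} p(\mu_j \mid \mu, \sigma^2_\mu)\, p(\mu, \sigma^2, \sigma_\mu)$$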
Markov Chain Monte Carlo Sampling
Cannot obtain the posterior distributions in closed form; instead create a Markov chain that generates values from the following full conditional distributions
$\mu_j \mid \mu, \phi, \phi_\mu, Y$ for $j = 1, \ldots, J$
$\mu \mid \mu_j, \phi, \phi_\mu, Y$
$\phi \mid \mu, \mu_j, \phi_\mu, Y$
$\phi_\mu \mid \mu, \mu_j, \phi, Y$
Gibbs Sampling gives a dependent sequence of draws from the joint posterior distribution. Given the sample output, summarize it to estimate the posterior densities, quantiles, etc.
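The cycle through the four full conditionals can be sketched as follows. This is a minimal illustration in pure Python, not the WinBUGS model used for the results. It uses the flat prior on $\mu$ and $p(\phi) \propto 1/\phi$ from the slides, under which the conditionals for $\mu_j$ and $\mu$ are normal and the conditional for $\phi$ is gamma. The prior on $\phi_\mu$ is left to HW 6 in the slides, so the Gamma(a, b) prior with small a and b used below is purely an assumption made so the sketch runs. The function name, the simulated data, and all tuning values are hypothetical.

```python
import random
import statistics


def gibbs_random_effects(groups, n_iter=2000, burn_in=500, a=0.001, b=0.001, seed=1):
    """Gibbs sampler sketch; returns post-burn-in draws of (mu, sigma2, sigma2_mu)."""
    rng = random.Random(seed)
    J = len(groups)
    sizes = [len(g) for g in groups]
    ybar = [statistics.fmean(g) for g in groups]
    N = sum(sizes)
    mu_j = ybar[:]                      # start school means at the observed means
    mu = statistics.fmean(ybar)
    phi, phi_mu = 1.0, 1.0              # arbitrary starting precisions
    draws = []
    for it in range(n_iter):
        # mu_j | rest: normal, precision-weighted average of ybar_j and mu.
        for j in range(J):
            prec = sizes[j] * phi + phi_mu
            mean = (sizes[j] * phi * ybar[j] + phi_mu * mu) / prec
            mu_j[j] = rng.gauss(mean, prec ** -0.5)
        # mu | rest: N(mean of mu_j, 1 / (J * phi_mu)) under the flat prior.
        mu = rng.gauss(statistics.fmean(mu_j), (J * phi_mu) ** -0.5)
        # phi | rest: Gamma(shape N/2, rate SSE/2) under p(phi) proportional to 1/phi.
        sse = sum((y - m) ** 2 for g, m in zip(groups, mu_j) for y in g)
        phi = rng.gammavariate(N / 2, 2 / sse)      # second argument is the scale
        # phi_mu | rest: Gamma(a + J/2, rate b + SSB/2) under the ASSUMED Gamma(a, b) prior.
        ssb = sum((m - mu) ** 2 for m in mu_j)
        phi_mu = rng.gammavariate(a + J / 2, 1 / (b + ssb / 2))
        if it >= burn_in:
            draws.append((mu, 1 / phi, 1 / phi_mu))
    return draws


# Simulated data with hypothetical true values mu = 12.6, sigma_mu = 3, sigma = 6.25.
data_rng = random.Random(0)
groups = []
for _ in range(40):
    school_mean = data_rng.gauss(12.6, 3.0)
    groups.append([data_rng.gauss(school_mean, 6.25) for _ in range(25)])

draws = gibbs_random_effects(groups)
post_mu = statistics.fmean(d[0] for d in draws)
post_sigma2 = statistics.fmean(d[1] for d in draws)
print(round(post_mu, 2), round(post_sigma2, 2))
```

Each stored tuple is one dependent draw from the joint posterior; posterior means, quantiles, or density estimates are then computed from the list of draws, for example the intraclass correlation as `s2m / (s2m + s2)` per draw.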
Full Conditional Distributions
Let $\theta = (\theta_1, \ldots, \theta_p)$
The full conditional distribution for any component $\theta_j$ given the rest, $\theta_{-j}$:
$$p(\theta_j \mid \theta_{-j}, Y) \propto L(\theta)\, p(\theta)$$
Only terms that involve $\theta_j$ are needed in writing the full conditional
May "block" together terms to use "vectorized" code
Full conditional for µ
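$$\Rightarrow\; p(\mu \mid \mu_1, \ldots, \mu_J, \phi, \phi_\mu, Y) \propto \prod_{j=1}^{J} p(\mu_j \mid \mu, \phi_\mu)\, p(\mu) = N\left( \frac{\sum_j \mu_j}{J},\; \frac{1}{J \phi_\mu} \right)$$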
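The normal form comes from completing the square in $\mu$. A short derivation using the flat prior $p(\mu) \propto 1$ and the 2nd-level density $p(\mu_j \mid \mu, \phi_\mu)$, writing $\bar{\mu} = \sum_j \mu_j / J$ (a symbol introduced here for convenience):

```latex
\begin{align*}
\prod_{j=1}^{J} p(\mu_j \mid \mu, \phi_\mu)\, p(\mu)
  &\propto \exp\left\{ -\frac{1}{2} \phi_\mu \sum_{j=1}^{J} (\mu_j - \mu)^2 \right\} \\
  &\propto \exp\left\{ -\frac{1}{2} \phi_\mu \left( J \mu^2 - 2 \mu \sum_{j=1}^{J} \mu_j \right) \right\} \\
  &\propto \exp\left\{ -\frac{1}{2} J \phi_\mu \left( \mu - \bar{\mu} \right)^2 \right\}
\end{align*}
```

This is the kernel of a normal density with mean $\bar{\mu}$ and precision $J\phi_\mu$, that is, variance $1/(J\phi_\mu)$. The data $Y$ and the precision $\phi$ drop out because, given the school means $\mu_j$, they carry no further information about $\mu$.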
Summary of Output from WinBUGS
            mean    sd   2.5%    25%    50%    75%  97.5%
rho          0.2   0.0    0.1    0.2    0.2    0.2    0.2
sigma2.mu    8.8   1.1    6.9    8.0    8.7    9.5   11.1
sigma2      39.2   0.6   37.9   38.7   39.1   39.6   40.4
mu          12.6   0.3   12.1   12.5   12.6   12.8   13.1
Posterior Densities
[Figure: estimated posterior densities in four panels, one each for $\mu$ (plotted over roughly 11.5 to 13.5), $\rho$ (0.15 to 0.25), $\sigma^2$ (37 to 41), and $\sigma^2_\mu$ (6 to 14). In every panel the horizontal axis is the parameter value and the vertical axis is Density.]
Summary
Can obtain a posterior distribution for each school mean. The posterior mean will be a convex combination of the MLE (observed school mean) and the overall mean
School level means are "shrunk" toward the overall mean; the degree of shrinkage depends on the variance components
Compromise between two fixed effects models: each school has its own mean, or all schools share a common mean ($\mu_1 = \ldots = \mu_J = \mu$)
Avoids multiple testing
Bayesian results are useful for ranking schools ("report cards")
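The convex combination can be made explicit. Conditional on $\mu$, $\sigma^2$, and $\sigma^2_\mu$, the posterior mean of $\mu_j$ is $w_j \bar{y}_j + (1 - w_j)\mu$ with weight $w_j = n_j\phi / (n_j\phi + \phi_\mu)$, where $n_j$ is the number of students sampled in school j. The sketch below plugs in the REML values from the lme output; the function names, the school sizes, and the observed school mean of 18.0 are hypothetical, and a full Bayesian answer would average this quantity over the posterior draws rather than plug in point estimates.

```python
def shrinkage_weight(n_j, sigma2, sigma2_mu):
    """Weight on the observed school mean in the conditional posterior mean of mu_j."""
    phi, phi_mu = 1.0 / sigma2, 1.0 / sigma2_mu   # precisions
    return n_j * phi / (n_j * phi + phi_mu)


def shrunken_mean(ybar_j, n_j, mu, sigma2, sigma2_mu):
    """Convex combination of the observed school mean and the overall mean."""
    w = shrinkage_weight(n_j, sigma2, sigma2_mu)
    return w * ybar_j + (1.0 - w) * mu


# Plug-in values from the lme output (intercept and squared standard deviations).
mu_hat, sigma2_hat, sigma2_mu_hat = 12.637, 6.256862 ** 2, 2.934966 ** 2

# Hypothetical schools: same observed mean, different sample sizes.
for n_j in (5, 45):
    w = shrinkage_weight(n_j, sigma2_hat, sigma2_mu_hat)
    est = shrunken_mean(18.0, n_j, mu_hat, sigma2_hat, sigma2_mu_hat)
    print(n_j, round(w, 3), round(est, 2))
```

A school with few sampled students gets a smaller weight on its own mean and is pulled further toward the overall mean; a school with many students is shrunk only slightly.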