Generalized Linear Models

Objectives:

• Systematic + Random.

• Exponential family.

• Maximum likelihood estimation & inference.


Generalized Linear Models

• Models for independent observations Yi, i = 1, 2, . . . , n.

• Components of a GLM:

. Random component

$$Y_i \sim f(y_i; \theta_i, \phi), \qquad f \in \text{exponential family}$$


. Systematic component

$$\eta_i = X_i \beta$$

ηi : linear predictor
Xi : (1 × p) covariate vector
β : (p × 1) vector of regression coefficients

. Link function

$$E(Y_i \mid X_i) = \mu_i, \qquad g(\mu_i) = X_i \beta$$

g(·) : link function


Generalized Linear Models

• GLMs generalize the standard linear model:

$$Y_i = X_i \beta + \varepsilon_i$$

. Random: normal distribution, $\varepsilon_i \sim N(0, \sigma^2)$

. Systematic: linear combination of covariates, $\eta_i = X_i \beta$

. Link: identity function, $\eta_i = \mu_i$
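This correspondence is visible directly in R: glm() with family = gaussian (identity link) reproduces lm(). A minimal sketch with simulated data (the data frame and variable names here are illustrative only):

    set.seed(571)
    dat <- data.frame(x = rnorm(50))
    dat$y <- 1 + 2 * dat$x + rnorm(50)

    ## identity-link gaussian GLM reproduces ordinary least squares
    coef(lm(y ~ x, data = dat))
    coef(glm(y ~ x, family = gaussian, data = dat))   # same estimates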


Generalized Linear Models

• GLMs extend usefully to overdispersed and correlated data:

. GEE: marginal models / semi-parametric estimation & inference

. GLMM: conditional models / likelihood estimation & inference


Exponential Family

$$(\star)\qquad f(y; \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\}$$

θ = canonical parameter

φ = fixed (known) scale parameter

Properties: If $Y \sim f(y; \theta, \phi)$ in $(\star)$, then

$$E(Y) = \mu = b'(\theta), \qquad \operatorname{var}(Y) = b''(\theta)\, a(\phi)$$


Canonical link function: A function g(·) such that:

$$\eta = g(\mu) = \theta \quad (\text{canonical parameter})$$

Variance function: A function V(·) such that:

$$\operatorname{var}(Y) = V(\mu)\, a(\phi)$$

Usually: $a(\phi) = \phi / w$, with φ a "scale" parameter and w a known weight.
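In R, each family object bundles the canonical link g(·) and the variance function V(·); a quick sketch using base R only:

    fam <- binomial()        # canonical link: logit
    fam$linkfun(0.3)         # g(mu) = log(0.3/0.7)
    fam$variance(0.3)        # V(mu) = 0.3 * (1 - 0.3)

    poisson()$linkfun(2)     # log(2): canonical link is log
    poisson()$variance(2)    # V(mu) = mu = 2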


Examples of GLMs: logistic regression

y = s/m, where s = number of successes in m trials.

$$f(y; \theta, \phi) = \binom{m}{s} \pi^{s} (1-\pi)^{m-s}
= \exp\left[ \frac{y \log\left(\frac{\pi}{1-\pi}\right) + \log(1-\pi)}{1/m} + \log\binom{m}{s} \right]$$

$$\Longrightarrow \quad \theta = \log(\pi/(1-\pi)), \qquad b(\theta) = -\log(1-\pi) = \log[1+\exp(\theta)]$$

$$\mu = b'(\theta) = \frac{\partial}{\partial\theta} \log[1+\exp(\theta)] = \exp(\theta)/[1+\exp(\theta)] = \pi$$


$$g(\mu) = \log[\pi/(1-\pi)] = \theta$$

g : logit, log-odds function

$$\operatorname{var}(y) = \pi(1-\pi) \cdot \frac{1}{m}$$

so that $V(\mu) = \mu(1-\mu)$ and $a(\phi) = 1/m$ (φ = 1, weight w = m).


• Poisson regression

y = number of events (count)

$$f(y; \theta, \phi) = \lambda^{y} \exp(-\lambda)/y! = \exp\left[ y \log(\lambda) - \lambda - \log(y!) \right]$$

$$\Longrightarrow \quad \theta = \log(\lambda), \qquad b(\theta) = \lambda = \exp(\theta)$$

$$\mu = b'(\theta) = \exp(\theta) = \lambda, \qquad g(\mu) = \theta = \log(\mu)$$

g : canonical link is log


• Poisson regression (continued)

$$\operatorname{var}(y) = \lambda$$

so that $V(\mu) = \mu$ and $a(\phi) = 1$.

◦ Other examples:

. gamma, inverse Gaussian (MN, Table 2.1)

. some survival models (MN, Chpt. 13)


Example: Seizure data (DHLZ ex. 1.6)

• Clinical trial of progabide, evaluating impact on epileptic seizures.

• Data:

. age = patient age in years

. base = 8-week baseline seizure count (pre-tx)

. tx = 0 if assigned placebo; 1 if assigned progabide

. Y1, Y2, Y3, Y4 = seizure counts in 4 two-week periods following treatment administration

• Models:

. linear model: Y4 = age + base + tx + ε

. Poisson GLM: log(µ4) = age + base + tx
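A minimal R sketch of both fits, assuming the data sit in a data frame seiz with columns y4, age, base, and trt (the names here are hypothetical; the slides use tx and trt interchangeably):

    lmfit  <- lm(y4 ~ age + base + trt, data = seiz)                     # linear model
    glmfit <- glm(y4 ~ age + base + trt, family = poisson, data = seiz)  # Poisson GLM, log link

    summary(lmfit)$coefficients
    summary(glmfit)$coefficients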


Example: Seizure data (DHLZ ex. 1.6)

              linear regression          Poisson regression
           est.    s.e.      Z         est.     s.e.      Z
(Int)     -4.97    3.62   -1.37       0.778    0.285    2.73
age        0.12    0.11    1.07       0.014    0.009    1.64
base       0.31    0.03   11.79       0.022    0.001   20.27
tx        -1.36    1.37   -0.99      -0.270    0.102   -2.66

• Q: should we use log(base) for Poisson regression?

• Q: why does inference regarding the significance of tx differ?


[Figure: seizure data. Scatterplots of y4 vs. base and log(y4 + 0.5) vs. base; histograms of y4 for trt == 0 and trt == 1.]

Seizure Residuals vs. Fitted

[Figure: residuals vs. fitted values, using lmfit$resid vs. lmfit$fitted (linear model) and glmfit$resid vs. glmfit$fitted (Poisson GLM).]

Seizure Residuals vs. Fitted, using predict()

[Figure: resid(lmfit) vs. predict(lmfit) and resid(glmfit) vs. predict(glmfit). For the GLM, predict() returns the linear predictor (log scale) and resid() the deviance residuals, so these plots differ from the previous slide.]


Residual Diagnostics

◦ Used to assess model fit, much as for linear models.

• Q-Q plots of residuals (may be hard to interpret for discrete data)

• residual plots:
  . vs. fitted values
  . vs. omitted covariates

• assessment of systematic departures

• assessment of the variance function


Residual Diagnostics

◦ Types of residuals for GLMs:

1. Pearson residual

$$r_i^P = \frac{y_i - \mu_i}{\sqrt{V(\mu_i)}}, \qquad \sum_i (r_i^P)^2 = X^2$$

2. Deviance residual (see resid(fit))

$$r_i^D = \operatorname{sign}(y_i - \mu_i)\,\sqrt{d_i}, \qquad \sum_i (r_i^D)^2 = D(y, \mu)$$

3. Working residual (see fit$resid)

$$r_i^W = (y_i - \mu_i)\,\frac{\partial \eta_i}{\partial \mu_i} = Z_i - \eta_i$$
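In R, each residual type can be extracted from a fitted glm object; a sketch reusing glmfit from the seizure example:

    rP <- residuals(glmfit, type = "pearson")    # (y - mu) / sqrt(V(mu))
    rD <- residuals(glmfit, type = "deviance")   # the default type for resid()
    rW <- glmfit$residuals                       # working residuals

    sum(rP^2)   # Pearson X^2
    sum(rD^2)   # deviance D(y, mu); equals deviance(glmfit)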


Fitting GLMs by Maximum Likelihood

Solve the score equations:

$$U_j(\beta) = \frac{\partial}{\partial \beta_j} \log L = 0, \qquad j = 1, 2, \ldots, p$$

log-likelihood:

$$\log L = \sum_{i=1}^{n} \left[ \frac{y_i \theta_i - b(\theta_i)}{a_i(\phi)} + c(y_i, \phi) \right] = \sum_i \log L_i$$

$$\Longrightarrow \quad U_j(\beta) = \frac{\partial \log L}{\partial \beta_j} = \sum_i \frac{\partial \log L_i}{\partial \theta_i} \cdot \frac{\partial \theta_i}{\partial \mu_i} \cdot \frac{\partial \mu_i}{\partial \eta_i} \cdot \frac{\partial \eta_i}{\partial \beta_j}$$


$$\frac{\partial \log L_i}{\partial \theta_i} = \frac{1}{a_i(\phi)} \left( y_i - b'(\theta_i) \right) = \frac{1}{a_i(\phi)} (y_i - \mu_i)$$

$$\frac{\partial \theta_i}{\partial \mu_i} = \left( \frac{\partial \mu_i}{\partial \theta_i} \right)^{-1} = 1/V(\mu_i)$$

$$\frac{\partial \eta_i}{\partial \beta_j} = X_{ij}$$

Therefore,

$$U_j(\beta) = \sum_{i=1}^{n} \left( X_{ij} \frac{\partial \mu_i}{\partial \eta_i} \right) \cdot \left[ a_i(\phi)\, V(\mu_i) \right]^{-1} (Y_i - \mu_i)$$
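For a canonical link, ∂µi/∂ηi = V(µi), so the variance terms cancel and the score reduces to $U_j(\beta) = \sum_i X_{ij}(y_i - \mu_i)/a_i(\phi)$. A quick numerical check that the score vanishes at the MLE (a sketch reusing the Poisson seizure fit, where $a_i(\phi) = 1$):

    X  <- model.matrix(glmfit)
    mu <- fitted(glmfit)

    drop(t(X) %*% (glmfit$y - mu))   # score at the MLE: zero up to convergence tolerance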


GLM Information Matrix

• Either form:

$$[\mathcal{I}_n](j,k) = \operatorname{cov}[U_j(\beta), U_k(\beta)] = -E\left( \frac{\partial^2 \log L}{\partial \beta_j \partial \beta_k} \right)$$

• Let’s consider the second form...


GLM Information Matrix

$$[\mathcal{I}_n](j,k) = -E\left[ \frac{\partial}{\partial \beta_k} U_j(\beta) \right]
= -E\left[ \sum_{i=1}^{n} \frac{\partial}{\partial \beta_k} \left\{ \left( \frac{\partial \mu_i}{\partial \beta_j} \right) \cdot \left[ a_i(\phi)\, V(\mu_i) \right]^{-1} (Y_i - \mu_i) \right\} \right]
= \sum_{i=1}^{n} \left( \frac{\partial \mu_i}{\partial \beta_j} \right) \cdot \left[ a_i(\phi)\, V(\mu_i) \right]^{-1} \left( \frac{\partial \mu_i}{\partial \beta_k} \right)$$

(Exercise: justify the last step; the product-rule terms that retain a factor of $(Y_i - \mu_i)$ vanish in expectation.)
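For the Poisson fit with canonical link, ∂µi/∂β = µi Xi, so the information reduces to $X^T W X$ with $W = \operatorname{diag}(\mu_i)$, and its inverse is the asymptotic covariance of $\hat\beta$. A sketch checking this against R's vcov() (tolerance loosened to allow for glm's convergence criterion):

    X  <- model.matrix(glmfit)
    mu <- fitted(glmfit)
    In <- t(X) %*% (mu * X)         # sum_i mu_i x_i x_i^T

    all.equal(solve(In), vcov(glmfit),
              tolerance = 1e-6, check.attributes = FALSE)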


Score and Information

• In vector/matrix form we have:

$$U(\beta) = \begin{pmatrix} U_1(\beta) \\ U_2(\beta) \\ \vdots \\ U_p(\beta) \end{pmatrix}, \qquad
\frac{\partial \mu_i}{\partial \beta} = \left( \frac{\partial \mu_i}{\partial \beta_1} \;\; \frac{\partial \mu_i}{\partial \beta_2} \;\; \cdots \;\; \frac{\partial \mu_i}{\partial \beta_p} \right) = X_i\, \frac{\partial \mu_i}{\partial \eta_i}$$


Score and Information

$$U(\beta) = \sum_{i=1}^{n} \left( \frac{\partial \mu_i}{\partial \beta} \right)^{T} \left[ a_i(\phi)\, V(\mu_i) \right]^{-1} (Y_i - \mu_i)$$

and

$$\mathcal{I}_n = \sum_{i=1}^{n} \left( \frac{\partial \mu_i}{\partial \beta} \right)^{T} \left[ a_i(\phi)\, V(\mu_i) \right]^{-1} \left( \frac{\partial \mu_i}{\partial \beta} \right)$$


Fisher Scoring

Goal: Solve the score equations

$$U(\beta) = 0$$

Iterative estimation is required for most GLMs. The score equations can be solved using Newton-Raphson (which uses the observed derivative of the score) or Fisher scoring, which uses the expected derivative of the score (i.e. $-\mathcal{I}_n$).


Fisher Scoring

Algorithm:

• Pick an initial value $\beta^{(0)}$.

• For $j \to (j+1)$, update $\beta^{(j)}$ via

$$\beta^{(j+1)} = \beta^{(j)} + \left( \mathcal{I}_n^{(j)} \right)^{-1} U(\beta^{(j)})$$

• Evaluate convergence using changes in $\log L$ or $\|\beta^{(j+1)} - \beta^{(j)}\|$.

• Iterate until the convergence criterion is satisfied.
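A bare-bones Fisher scoring loop for the Poisson/log-link case (where expected and observed information coincide; see comment 2 below). This is a sketch, not glm()'s actual IWLS implementation, though the two are equivalent here:

    fisher_scoring <- function(X, y, tol = 1e-8, maxit = 25) {
      beta <- rep(0, ncol(X))                # beta^(0)
      for (j in seq_len(maxit)) {
        mu   <- exp(drop(X %*% beta))        # inverse log link
        U    <- t(X) %*% (y - mu)            # score (canonical link)
        In   <- t(X) %*% (mu * X)            # expected information
        step <- drop(solve(In, U))
        beta <- beta + step                  # beta^(j+1) = beta^(j) + In^{-1} U
        if (sqrt(sum(step^2)) < tol) break   # ||beta^(j+1) - beta^(j)||
      }
      beta
    }

    ## agrees with coef(glmfit) on the seizure fit:
    ## fisher_scoring(model.matrix(glmfit), glmfit$y)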


Comments on Fisher Scoring:

1. IWLS is equivalent to Fisher Scoring (Biostat 570).

2. Observed and expected information are equivalent for canonical links.

3. Score equations are an example of an estimating function (more on that to come!)

4. Q: What assumptions make E[U(β)] = 0?

5. Q: What is the relationship between $\mathcal{I}_n$ and $\sum_i U_i U_i^T$?

6. Q: What is a 1-step approximation to ∆β(−i)?


Inference for GLMs

Review of asymptotic likelihood theory:

$$\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} \;=\; \begin{pmatrix} (q \times 1) \\ ((p-q) \times 1) \end{pmatrix}$$

Goal: Test $H_0 : \beta_2 = \beta_2^0$

(1) Likelihood Ratio Test:

$$2\left[ \log L(\hat\beta_1, \hat\beta_2) - \log L(\hat\beta_1^0, \beta_2^0) \right] \sim \chi^2(df = p - q)$$


Inference for GLMs

(2) Score Test:

$$U(\beta) = \begin{pmatrix} U_1(\beta_1) \\ U_2(\beta_2) \end{pmatrix} \;=\; \begin{pmatrix} (q \times 1) \\ ((p-q) \times 1) \end{pmatrix}$$

$$U_2(\hat\beta^0)^T \left\{ \operatorname{cov}[U_2(\hat\beta^0)] \right\}^{-1} U_2(\hat\beta^0) \sim \chi^2(df = p - q)$$

(3) Wald Test:

$$(\hat\beta_2 - \beta_2^0)^T \left\{ \operatorname{cov}(\hat\beta_2) \right\}^{-1} (\hat\beta_2 - \beta_2^0) \sim \chi^2(df = p - q)$$
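In R, the LRT and Wald test for the treatment effect in the seizure example might look as follows (a sketch, assuming the covariate is named trt as earlier; base R does not expose the score test as directly):

    glmfit0 <- update(glmfit, . ~ . - trt)     # fit under H0: beta_trt = 0
    anova(glmfit0, glmfit, test = "Chisq")     # likelihood ratio test, df = 1

    summary(glmfit)$coefficients["trt", ]      # Wald: Z = est. / s.e.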


Measures of Discrepancy

There are 2 primary measures:

• deviance

• Pearson’s X2

Deviance: Assume $a_i(\phi) = \phi/m_i$

(e.g. normal: $\phi$; binomial: $1/m_i$; Poisson: $1$)

$$\log L(\beta) = \sum_{i=1}^{n} \log f_i(y_i; \theta_i, \phi) = \sum_i \left\{ \frac{m_i}{\phi} \left[ y_i \theta_i - b(\theta_i) \right] + c_i(y_i, \phi) \right\}$$


Now consider $\log L$ as a function of $\mu$, using the relationship $b'(\theta) = \mu$:

$$l(\mu, \phi; y) = \sum_i \left\{ \frac{m_i}{\phi} \left[ y_i\, \theta(\mu_i) - b[\theta(\mu_i)] \right] + c_i(y_i, \phi) \right\}$$

The deviance is:

$$D(y, \mu) = 2\,\phi\,\left[ l(y, \phi; y) - l(\mu, \phi; y) \right]
= 2 \sum_i m_i \left\{ y_i \left[ \theta(y_i) - \theta(\mu_i) \right] - \left( b[\theta(y_i)] - b[\theta(\mu_i)] \right) \right\}$$
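As a concrete check: the Poisson case has θ(µ) = log µ and b(θ) = µ, so the general formula reduces to the Poisson deviance listed under Examples below. Computing it by hand (a sketch reusing glmfit, with the convention 0 · log 0 = 0):

    y  <- glmfit$y
    mu <- fitted(glmfit)

    dev <- 2 * sum(ifelse(y > 0, y * log(y / mu), 0) - (y - mu))
    all.equal(dev, deviance(glmfit))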


Deviance

Deviance generalizes the residual sum of squares for linear models:

              Model 1                 Model 2
              β1   (q × 1)            β1   (q × 1)
              β2   ((p−q) × 1)        β02  ((p−q) × 1, fixed)

              fitted mean: µ1         fitted mean: µ2


Deviance

Linear Model:

$$\frac{SSE(\text{Model 2}) - SSE(\text{Model 1})}{\sigma^2} \sim \chi^2(df = p - q)$$

GLM:

$$\frac{D(y, \hat\mu_2) - D(y, \hat\mu_1)}{\phi} \sim \chi^2(df = p - q)$$


Examples:

1. Normal: $\log f(y_i; \theta_i, \phi) = -\frac{(y_i - \mu_i)^2}{2\phi} + C$

$$D(y, \mu) = \sum_i (y_i - \mu_i)^2 = SSE$$

2. Poisson: $\log f(y_i; \theta_i, \phi) = y_i \log(\mu_i) - \mu_i + C$

$$D(y, \mu) = 2 \sum_i \left[ y_i \log\left( \frac{y_i}{\mu_i} \right) - (y_i - \mu_i) \right]$$

3. Binomial: $\log f(y_i; \theta_i, \phi) = m_i \left[ y_i \log\left( \frac{\mu_i}{1 - \mu_i} \right) + \log(1 - \mu_i) \right]$

$$D(y, \mu) = 2 \sum_i m_i \left[ y_i \log\left( \frac{y_i}{\mu_i} \right) + (1 - y_i) \log\left( \frac{1 - y_i}{1 - \mu_i} \right) \right]$$


Pearson’s X2

Assume: $\operatorname{var}(Y_i) = \frac{\phi}{m_i} V(\mu_i)$

Define: $X^2 = \sum_i (y_i - \mu_i)^2 / [V(\mu_i)/m_i]$

Examples:

1. Normal: $X^2 = SSE$

2. Poisson: $X^2 = \sum_i (y_i - \mu_i)^2/\mu_i$ (look familiar?)

3. Binomial: $X^2 = \sum_i m_i (y_i - \mu_i)^2 / [\mu_i(1 - \mu_i)]$

If the model is correct (mean and variance), then

$$\frac{X^2}{n - p} \approx \phi$$


e.g.

◦ Normal: SSE/(n− p) ≈ σ2 = φ

◦ Poisson: X2/(n− p) ≈ 1 = φ

◦ Binomial: X2/(n− p) ≈ 1 = φ


Example: Seizure data (DHLZ ex. 1.6)

$$\frac{X^2}{n - p} = \frac{136.64}{59 - 4} = 2.48, \qquad \frac{D(y, \hat\mu)}{n - p} = \frac{147.02}{59 - 4} = 2.67$$

Q: Poisson???
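The moment estimate of φ from Pearson's X² is easy to compute in R (a sketch reusing glmfit):

    X2 <- sum(residuals(glmfit, type = "pearson")^2)
    X2 / df.residual(glmfit)    # approx 2.5 here: evidence of overdispersion relative to Poisson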


Summary:

• GLMs are applicable to a range of univariate outcomes.

• Systematic variation (regression)
  Random variation (variance function, likelihood)

• Score equations of simple form.

• Inference using:
  likelihood ratios (deviance)
  score statistics
  Wald statistics

• Model checking:
  regression structure / variance form V(µ)


References:

McCullagh P. and Nelder J.A. Generalized Linear Models, Second Edition, Chapman and Hall, 1989.

Fahrmeir L. and Tutz G. Multivariate Statistical Modelling Based on Generalized Linear Models, Springer-Verlag, 1991. (see Chapter 2)

McCulloch C.E. and Searle S.R. Generalized, Linear, and Mixed Models, Wiley, 2001. (see Chapter 5)

Diggle P., Heagerty P.J., Liang K.-Y. and Zeger S.L. Longitudinal Data Analysis, Second Edition, Oxford, 2002. (see Appendix A)
