
Generalised Additive (Mixed) Models
(I refuse to use a soft G)

Daniel Simpson

Department of Mathematical Sciences
University of Bath

Outline

Background

Latent Gaussian model

Example: Continuous vs Discrete

Priors on functions

Two main paradigms for statistical analysis

- Let y denote a set of observations, distributed according to a probability model π(y; θ).

- Based on the observations, we want to estimate θ.

The classical approach:

θ denotes parameters (unknown fixed numbers), estimated for example by maximum likelihood.

The Bayesian approach:

θ denotes random variables, assigned a prior π(θ). Estimation is based on the posterior:

π(θ | y) = π(y | θ)π(θ) / π(y) ∝ π(y | θ)π(θ).
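The posterior identity above can be made concrete with a small numerical sketch: a Bernoulli likelihood with a uniform prior over a grid of θ values, normalised so the posterior sums to one. The model and numbers are illustrative, not from the slides.

```python
# pi(theta | y) ∝ pi(y | theta) * pi(theta), normalised over a discrete grid.
# Illustrative sketch only; the Bernoulli model is not from the slides.

def grid_posterior(y, thetas):
    """Return the normalised posterior over `thetas` for 0/1 data y."""
    def likelihood(theta):
        out = 1.0
        for yi in y:
            out *= theta if yi == 1 else (1.0 - theta)
        return out

    prior = 1.0 / len(thetas)                    # uniform prior pi(theta)
    unnorm = [likelihood(t) * prior for t in thetas]
    z = sum(unnorm)                              # the evidence pi(y)
    return [u / z for u in unnorm]

thetas = [0.1 * k for k in range(1, 10)]         # grid 0.1, ..., 0.9
post = grid_posterior([1, 1, 0, 1], thetas)      # three successes, one failure
```

Dividing by the evidence z is exactly the π(y) term in the identity; dropping it gives the unnormalised "∝" form.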


Example (Ski flying records)

Assume a simple linear regression model with Gaussian observations y = (y_1, . . . , y_n), where

E(y_i) = α + β x_i,  Var(y_i) = 1/τ,  i = 1, . . . , n

[Figure: World records in ski jumping, 1961–2011; length in metres (140–240) against year (1960–2010).]
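Before going Bayesian, the classical fit for this model is ordinary least squares. A minimal closed-form sketch in plain Python, using synthetic numbers (the record dataset itself is not reproduced here):

```python
# Ordinary least-squares estimates for E(y_i) = alpha + beta * x_i:
# beta = cov(x, y) / var(x), alpha = ybar - beta * xbar.

def ols(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    return alpha, beta

# Synthetic example: records growing roughly 2 metres per year
# (illustrative numbers, not the real record data).
years = [0, 10, 20, 30, 40, 50]           # years since 1961
lengths = [141, 160, 182, 199, 221, 240]  # metres
alpha, beta = ols(years, lengths)
```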

The Bayesian approach

Assign priors to the parameters α, β and τ and calculate posteriors:

[Figure: posterior densities. (Intercept): mean = 137.354, SD = 1.508. x: mean = 2.14, SD = 0.054. Precision for the Gaussian observations.]

Real-world datasets are usually much more complicated!

Using a Bayesian framework:

- Build (hierarchical) models to account for potentially complicated dependency structures in the data.

- Attribute uncertainty to model parameters and latent variables using priors.

Two main challenges:

1. Need computationally efficient methods to calculate posteriors.

2. Select priors in a sensible way.


Outline

Background

Latent Gaussian model
  Computational framework and approximations
  Semiparametric regression

Example: Continuous vs Discrete

Priors on functions

What is a latent Gaussian model?

Classical multiple linear regression model

The mean of an n-dimensional observational vector y is given by

μ_i = E(Y_i) = α + Σ_{j=1}^{nβ} β_j z_{ji},  i = 1, . . . , n

where

α : Intercept
β : Linear effects of covariates z

Account for non-Gaussian observations

Generalized linear model (GLM)

The mean μ_i is linked to the linear predictor η_i:

η_i = g(μ_i) = α + Σ_{j=1}^{nβ} β_j z_{ji},  i = 1, . . . , n

where g(·) is a link function and

α : Intercept
β : Linear effects of covariates z

Account for non-linear effects of covariates

Generalized additive model (GAM)

The mean μ_i is linked to the linear predictor η_i:

η_i = g(μ_i) = α + Σ_{k=1}^{nf} f_k(c_{ki}),  i = 1, . . . , n

where g(·) is a link function and

α : Intercept
{f_k(·)} : Non-linear smooth effects of covariates c_k

Structured additive regression models

GLM/GAM/GLMM/GAMM/...
The mean μ_i is linked to the linear predictor η_i:

η_i = g(μ_i) = α + Σ_{j=1}^{nβ} β_j z_{ji} + Σ_{k=1}^{nf} f_k(c_{ki}) + ε_i,  i = 1, . . . , n

where g(·) is a link function and

α : Intercept
β : Linear effects of covariates z
{f_k(·)} : Non-linear smooth effects of covariates c_k
ε : iid random effects
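The slides leave the form of the smooth effects f_k open; one common concrete choice (an assumption here, not stated on this slide) is a first-order random-walk prior on the function evaluated at a grid of covariate values. A minimal sketch of one draw from such a prior:

```python
# One draw from a first-order random-walk (RW1) prior, a common Gaussian
# prior for a smooth effect f_k on a grid: f[t] = f[t-1] + N(0, 1/tau).
# Sketch only; the slides do not fix a particular prior for f_k.
import random

def rw1_draw(m, tau, seed=0):
    """Sample f at m grid points; tau is the innovation precision."""
    rng = random.Random(seed)
    sd = (1.0 / tau) ** 0.5
    f = [0.0]
    for _ in range(m - 1):
        f.append(f[-1] + rng.gauss(0.0, sd))
    return f

f = rw1_draw(m=50, tau=4.0)   # one realisation of the smooth effect
```

Larger tau gives smoother (less variable) realisations, which is how the hyperparameters mentioned below control the strength of dependence.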

Latent Gaussian models

- Collect all parameters (random variables) in the linear predictor in a latent field

  x = {α, β, {f_k(·)}, ε}.

- A latent Gaussian model is obtained by assigning Gaussian priors to all elements of x.

- Very flexible due to the many different forms of the unknown functions {f_k(·)}:

  - Include temporally and/or spatially indexed covariates.

  - Hyperparameters account for variability and length/strength of dependence.


Some examples of latent Gaussian models

- Generalized linear and additive (mixed) models

- Semiparametric regression

- Disease mapping

- Survival analysis

- Log-Gaussian Cox processes

- Geostatistical models

- Spatial and spatio-temporal models

- Stochastic volatility

- Dynamic linear models

- State-space models

- ...

Unified framework: A three-stage hierarchical model

1. Observations: y
   Assumed conditionally independent given x and θ1:

   π(y | x, θ1) = ∏_{i=1}^{n} π(y_i | x_i, θ1).

2. Latent field: x
   Assumed to be a GMRF with a sparse precision matrix Q(θ2):

   x | θ2 ∼ N(μ(θ2), Q⁻¹(θ2)).

3. Hyperparameters: θ = (θ1, θ2)
   Precision parameters of the Gaussian priors:

   π(θ).
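The three-stage hierarchy can be simulated top-down. A minimal sketch using a Gaussian random walk as a simple stand-in for the GMRF x (the particular distributions and values here are illustrative assumptions, not from the slides):

```python
# Simulate the three-stage hierarchy top-down:
# hyperparameters theta -> latent field x | theta -> observations y | x.
# A random walk is used as a simple GMRF; all choices are illustrative.
import random

def simulate(n, seed=1):
    rng = random.Random(seed)
    # Stage 3 (hyperparameters): fixed point values for simplicity.
    tau_obs, tau_latent = 4.0, 1.0
    # Stage 2 (latent field): random walk, whose precision matrix is sparse.
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(x[-1] + rng.gauss(0.0, (1.0 / tau_latent) ** 0.5))
    # Stage 1 (observations): conditionally independent given x.
    y = [rng.gauss(xi, (1.0 / tau_obs) ** 0.5) for xi in x]
    return x, y

x, y = simulate(n=100)
```

Inference runs in the opposite direction: given y, recover the posterior of x and θ.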


Model summary

The joint posterior for the latent field and hyperparameters:

π(x, θ | y) ∝ π(y | x, θ) π(x, θ)

            ∝ ∏_{i=1}^{n} π(y_i | x_i, θ) · π(x | θ) π(θ)

Remarks:

- m = dim(θ) is often quite small, say m ≤ 6.

- n = dim(x) is often large, typically n = 10²–10⁶.

Target densities are given as high-dimensional integrals

We want to estimate:

- The marginals of all components of the latent field:

  π(x_i | y) = ∫∫ π(x, θ | y) dx_{−i} dθ

             = ∫ π(x_i | θ, y) π(θ | y) dθ,  i = 1, . . . , n.

- The marginals of all the hyperparameters:

  π(θ_j | y) = ∫∫ π(x, θ | y) dx dθ_{−j}

             = ∫ π(θ | y) dθ_{−j},  j = 1, . . . , m.
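When θ is low-dimensional, the θ-integral can be replaced by a sum over a grid, with the x-integral done analytically. This is feasible in conjugate toy models; the sketch below assumes x ∼ N(0, 1) and y_i | x, θ ∼ N(x, 1/θ) with a flat prior on a θ grid, which is an invented example rather than the slides' general setting:

```python
# Approximate pi(x | y) as a finite mixture over a grid of theta values:
# pi(x | y) ≈ sum_k pi(x | theta_k, y) * pi(theta_k | y).
# Toy conjugate model: x ~ N(0, 1), y_i | x, theta ~ N(x, 1 / theta).
import math

def normal_pdf(z, mean, prec):
    return math.sqrt(prec / (2 * math.pi)) * math.exp(-0.5 * prec * (z - mean) ** 2)

def grid_marginal(y, thetas):
    n, s = len(y), sum(y)
    weights, conds = [], []
    for theta in thetas:
        p = 1.0 + n * theta              # precision of x | theta, y
        m = theta * s / p                # mean of x | theta, y
        # Evidence via pi(y | theta) = pi(y | x=0, theta) pi(x=0) / pi(x=0 | theta, y):
        log_lik0 = sum(math.log(normal_pdf(yi, 0.0, theta)) for yi in y)
        log_ev = (log_lik0 + math.log(normal_pdf(0.0, 0.0, 1.0))
                  - math.log(normal_pdf(0.0, m, p)))
        weights.append(math.exp(log_ev))  # flat prior over the theta grid
        conds.append((m, p))
    z = sum(weights)
    weights = [w / z for w in weights]
    # Return the mixture density pi(x | y).
    return lambda x: sum(w * normal_pdf(x, m, p)
                         for w, (m, p) in zip(weights, conds))

dens = grid_marginal(y=[0.9, 1.1, 1.0, 0.8], thetas=[0.5, 1.0, 2.0, 4.0])
```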


Example: Logistic regression, 2 × 2 factorial design

Example (Seeds)

Consider the proportion of seeds that germinate on each of 21 plates. We have two seed types (x1) and two root extracts (x2).

> data(Seeds)
> head(Seeds)
   r  n x1 x2 plate
1 10 39  0  0     1
2 23 62  0  0     2
3 23 81  0  0     3
4 26 51  0  0     4
5 17 39  0  0     5
6  5  6  0  1     6

Summary data set

Number of seeds that germinated in each group:

                      Seed type
                  x1 = 0    x1 = 1
Root extract
  x2 = 0          99/272    49/123
  x2 = 1         201/295    75/141

Statistical model

- Assume that the number of seeds that germinate on plate i is binomial:

  r_i ∼ Binomial(n_i, p_i),  i = 1, . . . , 21.

- Logistic regression model:

  logit(p_i) = log( p_i / (1 − p_i) ) = α + β1 x1i + β2 x2i + β3 x1i x2i + ε_i,

  where the ε_i ∼ N(0, σ²_ε) are iid.

Aim:

Estimate the main effects β1 and β2, and a possible interaction effect β3.
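Ignoring the plate-level random effect ε_i, this is a saturated 2 × 2 logistic regression, so the maximum-likelihood estimates are simple contrasts of the empirical cell logits from the summary table. A sketch of that simplified calculation (not the slides' full fit, which keeps the random effect):

```python
# ML estimates for the saturated 2x2 logistic model (random effect dropped):
# each coefficient is a contrast of empirical cell logits.
import math

def logit(p):
    return math.log(p / (1.0 - p))

# germinated / total per (x1, x2) cell, from the summary table
cells = {(0, 0): (99, 272), (1, 0): (49, 123),
         (0, 1): (201, 295), (1, 1): (75, 141)}
l = {k: logit(r / n) for k, (r, n) in cells.items()}

alpha = l[(0, 0)]
beta1 = l[(1, 0)] - alpha                           # seed-type main effect
beta2 = l[(0, 1)] - alpha                           # root-extract main effect
beta3 = l[(1, 1)] - l[(1, 0)] - l[(0, 1)] + alpha   # interaction
```

These crude estimates give a useful sanity check on any hierarchical fit: the root extract raises the germination odds, and the interaction is negative.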
