28-Jun-2020
• Parametric regression models

STK4080 H16

1. Likelihood for censored data

2. Parametric regression models

3. Poisson regression models

4. Accelerated failure time models

5. Martingale formulation of parametric survival models

• Parametric models without covariates

Let $T_i$ have hazard $\alpha(t|\theta)$, cumulative hazard $A(t|\theta)$, density $f(t|\theta)$ and survival function $S(t|\theta)$.

We observe right-censored data $(\tilde T_i, D_i)$, where $\tilde T_i = \min(C_i, T_i)$, $D_i = I(\tilde T_i = T_i)$, and $C_i$ is a censoring time independent of $T_i$.

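Such right-censored data have likelihood

$$L(\theta) \propto \prod_{i=1}^{n} L_i(\theta),$$

where the likelihood contributions equal

$$L_i(\theta) = f(\tilde T_i|\theta)^{D_i}\, S(\tilde T_i|\theta)^{1-D_i} = \alpha(\tilde T_i|\theta)^{D_i} \exp(-A(\tilde T_i|\theta)).$$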

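Under standard assumptions the MLE $\hat\theta$ that maximizes $L(\theta)$ is approximately normally distributed,

$$\hat\theta \sim N(\theta, I(\hat\theta)^{-1}), \quad \text{where } I(\theta) = -\frac{\partial^2 \log L(\theta)}{\partial\theta^2}.$$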

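If the true survival times are $T_i \sim \exp(\nu)$ and we have right-censored data $(\tilde T_i, D_i)$ with $\tau$ as the maximal observation time, we get the likelihood

$$L = \prod_{i=1}^{n} \nu^{D_i} \exp(-\tilde T_i \nu) = \nu^{N_\bullet(\tau)} \exp(-\nu R(\tau)),$$

where $N_\bullet(\tau) = \sum_{i=1}^{n} D_i$ is the number of observed events (occurrence) and $R(\tau) = \sum_{i=1}^{n} \tilde T_i$ is the total exposure time.

It then follows that the MLE of $\nu$ equals

$$\hat\nu = \frac{N_\bullet(\tau)}{R(\tau)} = \frac{\text{Occurrence}}{\text{Exposure}},$$

and the standard error of $\hat\nu$ is $1/\sqrt{I(\hat\nu)} = \hat\nu/\sqrt{N_\bullet(\tau)}$.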
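The occurrence/exposure estimate is simple enough to compute by hand. The following is a minimal sketch; the function name `exponential_mle` and the toy data are hypothetical and not taken from the lecture.

```python
import math

def exponential_mle(times, events):
    """Occurrence/exposure estimate of a constant hazard from right-censored data.

    times  : observed times min(C_i, T_i)
    events : 1 if the event was observed, 0 if the time is censored
    """
    occurrences = sum(events)            # N.(tau), number of observed events
    exposure = sum(times)                # R(tau), total time under observation
    rate = occurrences / exposure        # MLE of nu
    se = rate / math.sqrt(occurrences)   # 1 / sqrt(I(nu-hat))
    return rate, se

# Hypothetical toy data: three events and one censored time
times = [2.0, 3.0, 1.0, 4.0]
events = [1, 0, 1, 1]
rate, se = exponential_mle(times, events)
```

Here the estimate is 3 events over 10 time units of exposure.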

• Piecewise constant hazard $\alpha(t) = \sum_{k=1}^{K} \theta_k I_k(t)$

where $I_k(t) = 1$ on the interval $t_{k-1} < t \le t_k$ and zero otherwise, and $0 = t_0 < t_1 < \dots < t_K = \tau$.

It can then be shown that, with $O_k$ the number of events and $R_k$ the total observation time (exposure) in interval $k$, the MLE for $\theta_k$ is

$$\hat\theta_k = \frac{O_k}{R_k}.$$

Furthermore, the information matrix becomes diagonal, and in large samples the $\hat\theta_k$ are approximately independent and normally distributed with standard errors $\hat\theta_k/\sqrt{O_k}$.

Also, the standard error of $\log(\hat\theta_k)$ becomes $1/\sqrt{O_k}$, and $\hat\theta_k \exp(\pm 1.96/\sqrt{O_k})$ is generally a preferable confidence interval.
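As an illustration, the interval-wise events, exposures and log-scale confidence intervals can be tabulated as below. This is a sketch under the assumption that all observed times lie in $(0, \tau]$; the function name `piecewise_rates` and the toy data are hypothetical.

```python
import math

def piecewise_rates(times, events, cuts):
    """Occurrence/exposure rates for a piecewise constant hazard.

    cuts = [t_0, t_1, ..., t_K] with t_0 = 0; interval k is (t_{k-1}, t_k].
    Returns one dict per interval with O_k, R_k, theta_k and a 95% CI
    computed on the log scale.
    """
    rows = []
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        # Events observed in (lo, hi]
        o_k = sum(d for t, d in zip(times, events) if lo < t <= hi)
        # Time each individual spends under observation in (lo, hi]
        r_k = sum(max(0.0, min(t, hi) - lo) for t in times)
        theta = o_k / r_k
        if o_k > 0:
            factor = math.exp(1.96 / math.sqrt(o_k))
            ci = (theta / factor, theta * factor)
        else:
            ci = None  # the log-scale interval is undefined without events
        rows.append({"O": o_k, "R": r_k, "theta": theta, "ci": ci})
    return rows

# Hypothetical toy data with two intervals, (0, 2] and (2, 4]
times = [2.0, 3.0, 1.0, 4.0]
events = [1, 0, 1, 1]
rows = piecewise_rates(times, events, cuts=[0.0, 2.0, 4.0])
```

The interval exposures add up to the total exposure $R(\tau)$, so with a single interval the sketch reduces to the occurrence/exposure rate of the exponential model.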

• Parametric regression models, in general

We will assume that the distribution of $T_i$ depends on a linear predictor $\beta' x_i$ and, in addition, on a parameter $\theta$.

The distribution may always be specified via the hazard, thus

$$T_i \sim \alpha(t|\beta' x_i, \theta),$$

i.e. the hazard for individual $i$. With cumulative hazard $A(t|\beta' x_i, \theta)$ we may express the likelihood contribution from individual $i$ as

$$L_i(\beta, \theta) \propto \alpha(\tilde T_i|\beta' x_i, \theta)^{D_i} \exp(-A(\tilde T_i|\beta' x_i, \theta)),$$

where we assume that $T_i$ and $C_i$ are independent given $x_i$.

• Classes of regression models

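• Proportional hazards: $\alpha(t|\beta'x, \theta) = \exp(\beta'x)\,\alpha_0(t|\theta)$
• Additive hazards: $\alpha(t|\beta'x, \theta) = \beta'x + \alpha_0(t|\theta)$
• Accelerated failure time: $\alpha(t|\beta'x, \theta) = \exp(\beta'x)\,\alpha_0(\exp(\beta'x)\,t|\theta)$
• Translation models: $\alpha(t|\beta'x, \theta) = \alpha_0(t + \beta'x|\theta)$
• and many more!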

• Proportional hazards model

1. The semi-parametric method (Cox regression) is the most common.
2. Poisson regression is a numerically simpler variation.
3. Via accelerated failure time models, if the baseline is Weibull (or exponential).
4. In general, by likelihood optimization.

We have discussed 1. thoroughly and will consider 2. and 3.
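Why a Weibull baseline makes the accelerated failure time model a proportional hazards model can be seen from a short calculation. The parametrization $\alpha_0(t|\theta) = \lambda\kappa t^{\kappa-1}$ with $\theta = (\lambda, \kappa)$ is an assumption made here for illustration; other parametrizations of the Weibull hazard give the same conclusion.

```latex
\begin{aligned}
\exp(\beta'x)\,\alpha_0(\exp(\beta'x)\,t \mid \theta)
  &= \exp(\beta'x)\,\lambda\kappa\,\bigl(\exp(\beta'x)\,t\bigr)^{\kappa-1} \\
  &= \exp(\kappa\,\beta'x)\,\lambda\kappa\,t^{\kappa-1} \\
  &= \exp(\kappa\,\beta'x)\,\alpha_0(t \mid \theta).
\end{aligned}
```

Under this assumption the accelerated failure time hazard is a proportional hazards model with regression coefficients $\kappa\beta$. For the exponential baseline, $\kappa = 1$ and the two models coincide.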

• Additive hazard models $\alpha(t|\beta'x, \theta) = \beta'x + \alpha_0(t|\theta)$

1. Semi-parametric: the Lin & Ying model.
2. Special case of Aalen's additive model, $\alpha(t|\beta'x, \theta) = \beta(t)'x + \beta_0(t|\theta)$.
3. Parametric models are possible, but not programmed?
4. Poisson regression is possible (Breslow, 1987) with a piecewise constant baseline.

We have discussed 2. and mentioned 1. Here we will briefly consider 4.

• Poisson-regression

Assume $Y_i \sim \mathrm{Po}(n_i \exp(\psi + \beta'x_i))$. This generates the likelihood

$$L = \prod_{i=1}^{n} \left[ \frac{n_i^{Y_i} \exp(Y_i(\psi + \beta'x_i))}{Y_i!} \exp(-n_i \exp(\psi + \beta'x_i)) \right] \propto \prod_{i=1}^{n} \left[ \exp(Y_i(\psi + \beta'x_i)) \exp(-n_i \exp(\psi + \beta'x_i)) \right],$$

which may be maximized with a program for Poisson regression, in R as

• Generalized linear model: glm
• with Poisson family: family=poisson
• Needs an "offset" for $\log(n_i)$ (or, alternatively, weighting)
• May also fit link functions other than the default log link, e.g. the additive model $E[Y_i] = n_i(\psi + \beta'x_i)$
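To make explicit what such a program does, here is a minimal Newton-Raphson sketch for the log-link model with one covariate, maximizing $\sum_i [Y_i(\psi + \beta x_i) - n_i \exp(\psi + \beta x_i)]$. It is not the algorithm of any particular package; the function name `fit_poisson_offset` and the toy data are hypothetical.

```python
import math

def fit_poisson_offset(y, n, x, iterations=25):
    """Newton-Raphson for Y_i ~ Po(n_i * exp(psi + beta * x_i)), one covariate.

    Returns (psi, beta, se_psi, se_beta), with standard errors taken from
    the inverse information matrix at the final estimates.
    """
    def score_and_information(psi, beta):
        m = [ni * math.exp(psi + beta * xi) for ni, xi in zip(n, x)]
        u0 = sum(yi - mi for yi, mi in zip(y, m))
        u1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, m, x))
        i00 = sum(m)
        i01 = sum(mi * xi for mi, xi in zip(m, x))
        i11 = sum(mi * xi * xi for mi, xi in zip(m, x))
        return u0, u1, i00, i01, i11

    psi = math.log(sum(y) / sum(n))  # start from the overall rate
    beta = 0.0
    for _ in range(iterations):
        u0, u1, i00, i01, i11 = score_and_information(psi, beta)
        det = i00 * i11 - i01 * i01
        # Newton step: add (information)^{-1} times the score
        psi += (i11 * u0 - i01 * u1) / det
        beta += (i00 * u1 - i01 * u0) / det

    _, _, i00, i01, i11 = score_and_information(psi, beta)
    det = i00 * i11 - i01 * i01
    return psi, beta, math.sqrt(i11 / det), math.sqrt(i00 / det)

# Hypothetical toy data: counts y, exposures n, binary covariate x
y = [3, 1, 4, 6]
n = [10.0, 8.0, 5.0, 6.0]
x = [0, 0, 1, 1]
psi, beta, se_psi, se_beta = fit_poisson_offset(y, n, x)
```

With a binary covariate the estimates have a closed form (the log rate in the group with $x = 0$ and the log rate ratio between the groups), which gives a simple check of the iteration. In R the same model would typically be fitted with a call of the form `glm(y ~ x + offset(log(n)), family = poisson)`.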

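Let $T_i \sim \alpha(t|x_i) = \exp(\psi + \beta'x_i)$, a hazard that is constant in $t$, with cumulative hazard $A(t|x_i) = t \exp(\psi + \beta'x_i)$.

With observations $\tilde T_i$ (right-censored lifetimes) and $D_i$ (event indicators) this gives the likelihood

$$L = \prod_{i=1}^{n} \alpha(\tilde T_i|x_i)^{D_i} \exp(-A(\tilde T_i|x_i)) = \prod_{i=1}^{n} \exp(D_i(\psi + \beta'x_i)) \exp(-\tilde T_i \exp(\psi + \beta'x_i)),$$

which is proportional to a Poisson likelihood under the assumption $D_i \sim \mathrm{Po}(\tilde T_i \exp(\psi + \beta'x_i))$.

We may thus fit this model by Poisson regression!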
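The proportionality can be checked numerically: the two log-likelihoods should differ by a constant that does not depend on $(\psi, \beta)$, namely $\sum_i D_i \log \tilde T_i$. The sketch below uses one covariate; the function names and the toy data are hypothetical.

```python
import math

def survival_loglik(psi, beta, times, events, x):
    """Log-likelihood of the exponential regression model for censored data."""
    total = 0.0
    for t, d, xi in zip(times, events, x):
        eta = psi + beta * xi
        total += d * eta - t * math.exp(eta)
    return total

def poisson_loglik(psi, beta, times, events, x):
    """Log-likelihood obtained by treating D_i as Po(T_i * exp(psi + beta * x_i))."""
    total = 0.0
    for t, d, xi in zip(times, events, x):
        mean = t * math.exp(psi + beta * xi)
        total += d * math.log(mean) - mean - math.lgamma(d + 1)
    return total

# Hypothetical toy data
times = [2.0, 3.0, 1.0, 4.0]
events = [1, 0, 1, 1]
x = [0, 0, 1, 1]

# Difference between the two log-likelihoods at two parameter values
diff_a = poisson_loglik(-1.0, 0.5, times, events, x) - survival_loglik(-1.0, 0.5, times, events, x)
diff_b = poisson_loglik(0.3, -0.7, times, events, x) - survival_loglik(0.3, -0.7, times, events, x)
constant = sum(d * math.log(t) for t, d in zip(times, events))
```

Since the difference is free of the parameters, both functions are maximized by the same $(\hat\psi, \hat\beta)$.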

• Example: Melanoma data

family=poisson))

Deviance Residuals:
       Min         1Q     Median        3Q     Max
 -1.757198 -0.7833086 -0.4186261 0.5735247 2.35767

Coefficients:
                  Value  Std. Error   z value
(Intercept) -3.27166724 0.793644496 -4.122333
ulcer       -0.96045856 0.324053150 -2.963892
logthick     0.51104055 0.177549752  2.878295
age          0.01301791 0.007918443  1.643999
sex          0.34101496 0.270580076  1.260311

(Dispersion Parameter for Poisson family taken to be 1)

Null Deviance: 232.0768 on 204 degrees of freedom
Residual Deviance: 188.029 on 200 degrees of freedom

Number of Fisher Scoring Iterations: 5

• Alternative to offset: Weighting

Assume $Y_i \sim \mathrm{Po}(n_i\mu_i)$, so that $E[Y_i/n_i] = \mu_i$ and

$$\mathrm{Var}\left[\frac{Y_i}{n_i}\right] = \frac{n_i\mu_i}{n_i^2} = \frac{\mu_i}{n_i}.$$

We may alternatively fit the model with

• Responses $Y_i/n_i$
• Weights $w_i = n_i$ (in R: glm)
• Link function $\log(\mu_i) = \psi + \beta'x_i$

The glm routine in R does not require integer responses!

It will also work for "quasi-survival responses" $D_i/\tilde T_i$ with weights $\tilde T_i$.
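That weighting and the offset lead to the same estimates can be seen by comparing the two objective functions. The notation $r_i = Y_i/n_i$ for the rate response is introduced here for illustration, and the weighted objective is written in the usual GLM form with prior weights $w_i$.

```latex
\begin{aligned}
\text{Offset:}\quad
  & \sum_{i=1}^{n} \bigl\{ Y_i \log(n_i\mu_i) - n_i\mu_i - \log Y_i! \bigr\}
    = \sum_{i=1}^{n} \bigl\{ Y_i \log\mu_i - n_i\mu_i \bigr\} + \text{const}, \\
\text{Weighting:}\quad
  & \sum_{i=1}^{n} w_i \bigl\{ r_i \log\mu_i - \mu_i \bigr\}
    = \sum_{i=1}^{n} \bigl\{ Y_i \log\mu_i - n_i\mu_i \bigr\},
    \qquad r_i = Y_i/n_i,\; w_i = n_i .
\end{aligned}
```

Both objectives depend on $(\psi, \beta)$ only through $\sum_i \{Y_i \log\mu_i - n_i\mu_i\}$, so they have the same maximizer.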

• Example: Melanoma data, weighting (compare pg. 19)

Deviance Residuals:
       Min         1Q     Median        3Q      Max
 -1.757189 -0.7833315 -0.4186496 0.5735243 2.357629

Coefficients:
                  Value  Std. Error   t value
(Intercept) -3.27167476 0.792273252 -4.129478
ulcer       -0.96041516 0.323111908 -2.972392
logthick     0.51101978 0.177128703  2.885020
age          0.01301783 0.007912787  1.645164
sex          0.34101511 0.270233537  1.261927

(Dispersion Parameter for Poisson family taken to be 1)

Null Deviance: 232.0768 on 204 degrees of freedom
Residual Deviance: 188.029 on 200 degrees of freedom

Number of Fisher Scoring Iterations: 5

Let $T_i \sim \alpha(t|x_i) = \psi + \beta'x_i$. With censored data $(\tilde T_i, D_i)$ we get a likelihood corresponding to

• $D_i \sim \mathrm{Po}(\tilde T_i\mu_i)$
• where $\mu_i = \psi + \beta'x_i$
• and such that $E[D_i/\tilde T_i] = \mu_i$ and $\mathrm{Var}[D_i/\tilde T_i] = \tilde T_i\mu_i/\tilde T_i^2 = \mu_i/\tilde T_i$

It is thus possible to fit this survival model by glm with weighting.

However: with the identity link, R needs the responses $D_i/\tilde T_i = 0$ to be modified; we need to make a new response $D'_i = D_i + \epsilon_i$ for a small $\epsilon_i$, e.g. 0.0001.

This may still be unstable; in the example on the next page I needed to use grthick instead of logthick and omit age.
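One plausible source of the instability is that the additive hazard $\psi + \beta'x_i$ must stay positive for every individual, which the identity link does not enforce. The sketch below writes out the log-likelihood $\sum_i [D_i \log\mu_i - \tilde T_i\mu_i]$ for one covariate and returns minus infinity outside the admissible region; the function name `additive_loglik` and the toy data are hypothetical.

```python
import math

def additive_loglik(psi, beta, times, events, x):
    """Log-likelihood of the additive exponential model mu_i = psi + beta * x_i.

    Returns -inf if any hazard is non-positive, since such parameter
    values are not admissible.
    """
    total = 0.0
    for t, d, xi in zip(times, events, x):
        mu = psi + beta * xi
        if mu <= 0:
            return float("-inf")
        total += d * math.log(mu) - t * mu
    return total

# Hypothetical toy data with a binary covariate
times = [2.0, 3.0, 1.0, 4.0]
events = [1, 0, 1, 1]
x = [0, 0, 1, 1]

# With a binary covariate the MLE is given by the two group-wise
# occurrence/exposure rates: 1/5 for x = 0 and 2/5 for x = 1.
psi_hat = 1.0 / 5.0
beta_hat = 2.0 / 5.0 - 1.0 / 5.0
best = additive_loglik(psi_hat, beta_hat, times, events, x)
```

An iterative fit that steps to a parameter value with some $\mu_i \le 0$ leaves the region where the likelihood is defined, which is consistent with the need for simpler covariates noted above.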

• Example: Melanoma data, additive exponential model

Deviance Residuals:
       Min         1Q     Median        3Q      Max
 -1.605446 -0.8411342 -0.3830547 0.6907776 2.426813

Coefficients:
                       Value Std. Error   t value
(Intercept)       0.07537012 0.03424506  2.200905
ulcer            -0.04302607 0.01545833 -2.783358
factor(grthick)2  0.03898369 0.01583326  2.462139
factor(grthick)3  0.03337742 0.02378281  1.403426
sex               0.01950260 0.01178936  1.654254

(Dispersion Parameter for Poisson family taken to be 1)

Null Deviance: 232.0459 on 204 degrees of freedom
Residual Deviance: 192.2642 on 200 degrees of freedom

Number of Fisher Scoring Iterations: 5

• While we are at it: Exponential model, square root link