Parametric regression models - Universitetet i Oslo

Posted: 28-Jun-2020

  • Parametric regression models

    STK4080 H16

    1. Likelihood for censored data

    2. Parametric regression models

    3. Poisson regression models

    4. Accelerated failure time models

    5. Martingale formulation of parametric survival models

    Parametric regression models – p. 1/40

  • Parametric models without covariates

    Let $T_i$ have hazard $\alpha(t|\theta)$, cumulative hazard $A(t|\theta)$, density $f(t|\theta)$ and survival function $S(t|\theta)$.

    We observe right-censored data $(\tilde T_i, D_i)$, where $\tilde T_i = \min(C_i, T_i)$ and $D_i = I(\tilde T_i = T_i)$, and $C_i$ is a censoring time independent of $T_i$.

    Such right-censored data have likelihood $L(\theta) \propto \prod_{i=1}^n L_i(\theta)$, where the likelihood contributions equal

    $$L_i(\theta) = f(\tilde T_i|\theta)^{D_i} S(\tilde T_i|\theta)^{1-D_i} = \alpha(\tilde T_i|\theta)^{D_i} \exp(-A(\tilde T_i|\theta)).$$

    Under standard assumptions the MLE $\hat\theta$ that maximizes $L(\theta)$ is approximately distributed as

    $$\hat\theta \sim N(\theta, I(\hat\theta)^{-1}),$$

    where $I(\theta) = -\partial^2 \log L(\theta) / \partial\theta^2$ is the information matrix.

  • Example: Exponential lifetimes

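    If the true survival times are $T_i \sim \exp(\nu)$ and we have right-censored data $(\tilde T_i, D_i)$ with $\tau$ as the maximal observation time, we get the likelihood

    $$L = \prod_{i=1}^n \nu^{D_i} \exp(-\tilde T_i \nu) = \nu^{N_\bullet(\tau)} \exp(-\nu R(\tau)),$$

    where $N_\bullet(\tau) = \sum_{i=1}^n D_i$ is the number of observed events (occurrence) and $R(\tau) = \sum_{i=1}^n \tilde T_i$ is the total exposure time.

    It then follows that the MLE of $\nu$ equals

    $$\hat\nu = \frac{N_\bullet(\tau)}{R(\tau)} = \frac{\text{Occurrence}}{\text{Exposure}}$$

    and the standard error of $\hat\nu$ is $1/\sqrt{I(\hat\nu)} = \hat\nu/\sqrt{N_\bullet(\tau)}$.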
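
    The occurrence/exposure estimate can be checked numerically. The following is a minimal R sketch on simulated data; the sample size, true rate, censoring distribution and all variable names are assumptions made for illustration and are not taken from the slides.

```r
# Simulated right-censored exponential data (all settings are illustrative).
set.seed(1)
n    <- 500
nu   <- 0.2                                # assumed true hazard rate
tau  <- 10                                 # maximal observation time
Tlat <- rexp(n, rate = nu)                 # latent survival times T_i
Cens <- pmin(rexp(n, rate = 0.1), tau)     # censoring times, independent of T_i
Ttil <- pmin(Tlat, Cens)                   # observed times
D    <- as.numeric(Tlat <= Cens)           # event indicators

occurrence <- sum(D)                       # N.(tau)
exposure   <- sum(Ttil)                    # R(tau)
nu_hat     <- occurrence / exposure        # closed-form MLE
se_nu      <- nu_hat / sqrt(occurrence)    # 1 / sqrt(I(nu_hat))

# Cross-check: maximize the log-likelihood N. * log(nu) - nu * R numerically.
loglik <- function(nu) occurrence * log(nu) - nu * exposure
nu_num <- optimize(loglik, interval = c(1e-6, 5), maximum = TRUE,
                   tol = 1e-8)$maximum
```

    The numerical maximizer `nu_num` should agree with `nu_hat` up to the optimizer tolerance.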


  • Piecewise constant hazard $\alpha(t) = \sum_{k=1}^K \theta_k I_k(t)$

    where $I_k(t) = 1$ on the interval $t_{k-1} < t \le t_k$ and zero otherwise, and $0 = t_0 < t_1 < \dots < t_K = \tau$.

    It can then be shown that, with $O_k$ the number of events and $R_k$ the total observation time (exposure) in interval $k$, the MLE for $\theta_k$ is given by

    $$\hat\theta_k = \frac{O_k}{R_k}.$$

    Furthermore, the information matrix becomes diagonal, and in large samples the $\hat\theta_k$ are independent and normally distributed with standard errors $\hat\theta_k/\sqrt{O_k}$.

    Also, the standard error of $\log(\hat\theta_k)$ becomes $1/\sqrt{O_k}$, and $\hat\theta_k \exp(\pm 1.96/\sqrt{O_k})$ is generally a preferable confidence interval.
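
    As an illustration, the interval-wise occurrence/exposure rates and the log-scale confidence intervals can be computed as follows. This is a sketch on simulated data; the break points, the constant true hazard and the variable names are assumptions chosen only for the example.

```r
# Piecewise constant hazard estimates on simulated data (illustrative settings).
set.seed(2)
n      <- 400
tau    <- 6
breaks <- c(0, 2, 4, tau)                  # t_0 < t_1 < ... < t_K
Tlat   <- rexp(n, rate = 0.3)              # latent lifetimes
Ttil   <- pmin(Tlat, tau)                  # administrative censoring at tau
D      <- as.numeric(Tlat <= tau)

K <- length(breaks) - 1
O <- R <- numeric(K)
for (k in seq_len(K)) {
  lo <- breaks[k]
  hi <- breaks[k + 1]
  R[k] <- sum(pmax(0, pmin(Ttil, hi) - lo))      # exposure in (lo, hi]
  O[k] <- sum(D == 1 & Ttil > lo & Ttil <= hi)   # events in (lo, hi]
}

theta_hat <- O / R                         # MLE per interval
se_theta  <- theta_hat / sqrt(O)           # standard error of theta_hat
ci_log    <- cbind(lower = theta_hat * exp(-1.96 / sqrt(O)),
                   upper = theta_hat * exp( 1.96 / sqrt(O)))
```

    The interval built on the log scale stays positive by construction, which is one reason it is usually preferred to $\hat\theta_k \pm 1.96\,\hat\theta_k/\sqrt{O_k}$.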


  • Parametric regression models, in general

    We will assume that the distribution of $T_i$ depends on a linear predictor $\beta' x_i$ and in addition on a parameter $\theta$.

    The distribution may always be specified via the hazard, thus

    $$T_i \sim \alpha(t|\beta' x_i, \theta),$$

    i.e. the hazard for individual $i$. With cumulative hazard $A(t|\beta' x_i, \theta)$ we may express the likelihood contribution from individual $i$ by

    $$L_i(\beta, \theta) \propto \alpha(\tilde T_i|\beta' x_i, \theta)^{D_i} \exp(-A(\tilde T_i|\beta' x_i, \theta)),$$

    where we assume that $T_i$ and $C_i$ are independent given $x_i$.
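
    To make the general likelihood concrete, the sketch below maximizes it directly with `optim` for one possible choice of hazard, a proportional hazards model with Weibull baseline. The Weibull parametrization, the simulated data, the starting values and all names are assumptions for the illustration; the slides do not prescribe this particular implementation.

```r
# Direct likelihood optimization for alpha(t | x) = exp(b * x) * lambda * shape * t^(shape - 1).
set.seed(3)
n      <- 300
x      <- rbinom(n, 1, 0.5)
beta   <- 0.7; lambda <- 0.1; shape <- 1.5       # assumed true values
# Invert S(t | x) = exp(-lambda * exp(beta * x) * t^shape) to simulate lifetimes.
Tlat   <- (-log(runif(n)) / (lambda * exp(beta * x)))^(1 / shape)
Cens   <- runif(n, 0, 10)
Ttil   <- pmin(Tlat, Cens)
D      <- as.numeric(Tlat <= Cens)

# Negative log-likelihood: -sum(D_i * log(hazard_i) - cumulative hazard_i).
negloglik <- function(par) {
  b <- par[1]; log_lambda <- par[2]; log_shape <- par[3]
  sh      <- exp(log_shape)
  log_haz <- b * x + log_lambda + log_shape + (sh - 1) * log(Ttil)
  cum_haz <- exp(b * x + log_lambda) * Ttil^sh
  -sum(D * log_haz - cum_haz)
}

start <- c(0, log(sum(D) / sum(Ttil)), 0)        # exponential fit as starting point
fit   <- optim(start, negloglik, method = "BFGS", hessian = TRUE)
est   <- fit$par                                 # (b, log lambda, log shape)
se    <- sqrt(diag(solve(fit$hessian)))          # from the observed information
```

    The Hessian of the negative log-likelihood at the optimum plays the role of the information matrix $I(\hat\theta)$, so its inverse gives the approximate covariance of the estimates.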


  • Classes of regression models

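    • Proportional hazards: $\alpha(t|\beta' x, \theta) = \exp(\beta' x)\,\alpha_0(t|\theta)$

    • Additive hazards: $\alpha(t|\beta' x, \theta) = \beta' x + \alpha_0(t|\theta)$

    • Accelerated failure time: $\alpha(t|\beta' x, \theta) = \exp(\beta' x)\,\alpha_0(\exp(\beta' x)\,t|\theta)$

    • Translation models: $\alpha(t|\beta' x, \theta) = \alpha_0(t + \beta' x|\theta)$

    • and many more!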
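
    The four classes can be written as small R functions of time and the linear predictor. The sketch below assumes a Weibull baseline hazard with arbitrary illustrative parameters; the function names are hypothetical.

```r
# Illustrative Weibull baseline hazard alpha_0(t) = scale * shape * t^(shape - 1).
alpha0 <- function(t, shape = 1.5, scale = 0.1) scale * shape * t^(shape - 1)

# lp denotes the linear predictor beta' x.
ph_hazard    <- function(t, lp) exp(lp) * alpha0(t)             # proportional hazards
add_hazard   <- function(t, lp) lp + alpha0(t)                  # additive hazards
aft_hazard   <- function(t, lp) exp(lp) * alpha0(exp(lp) * t)   # accelerated failure time
trans_hazard <- function(t, lp) alpha0(t + lp)                  # translation model

tt <- c(0.5, 1, 2)
# With a Weibull baseline, the AFT hazard with predictor lp equals the
# proportional hazards hazard with predictor shape * lp.
aft_vs_ph <- all.equal(aft_hazard(tt, 0.4), ph_hazard(tt, 1.5 * 0.4))
```

    All four functions reduce to the baseline when the linear predictor is zero, and the last line shows why a Weibull baseline makes the accelerated failure time and proportional hazards formulations interchangeable.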


  • Proportional hazards model

    1. The semi-parametric method is the most common (Cox regression)

    2. Poisson regression is a numerically simpler variation

    3. Via accelerated failure time models if the baseline is Weibull (or exponential)

    4. In general by likelihood optimization.

    We have discussed 1 thoroughly and will now consider 2 and 3.


  • Additive hazard models $\alpha(t|\beta' x, \theta) = \beta' x + \alpha_0(t|\theta)$

    1. Semi-parametric, the Lin & Ying model,

    2. Special case of Aalen's additive model, $\alpha(t|\beta' x, \theta) = \beta(t)' x + \beta_0(t|\theta)$

    3. Parametric models are possible, but not programmed?

    4. Poisson regression is possible (Breslow, 1987) with piecewise constant baseline

    We have discussed 2 and mentioned 1. We will here briefly consider 4.


  • Poisson-regression

    Assume $Y_i \sim \mathrm{Po}(n_i \exp(\psi + \beta' x_i))$. This generates the likelihood

    $$L = \prod_{i=1}^n \left[ \frac{n_i^{Y_i} \exp(Y_i(\psi + \beta' x_i))}{Y_i!} \exp(-n_i \exp(\psi + \beta' x_i)) \right] \propto \prod_{i=1}^n \left[ \exp(Y_i(\psi + \beta' x_i)) \exp(-n_i \exp(\psi + \beta' x_i)) \right],$$

    which may be maximized with a program for Poisson regression, in R as

    • Generalized linear model: glm

    • with Poisson family: family=poisson

    • Need an "offset" for $\log(n_i)$ (or alternatively weighting)

    • May also fit other link functions than the default log link, e.g. the additive model $E[Y_i] = n_i(\psi + \beta' x_i)$
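
    A minimal sketch of such a fit with the default log link and an offset, on simulated counts. The exposures, covariate and true parameter values are assumptions made for illustration only.

```r
# Poisson regression with exposure n_i entering as offset log(n_i).
set.seed(4)
m   <- 200
x   <- rnorm(m)
n_i <- runif(m, 1, 20)                        # exposures (illustrative)
y   <- rpois(m, n_i * exp(-1 + 0.5 * x))      # assumed psi = -1, beta = 0.5

fit_log <- glm(y ~ x + offset(log(n_i)), family = poisson)
coef(fit_log)                                 # estimates of (psi, beta)
```

    Because the offset has a fixed coefficient of one, the fitted mean is $n_i \exp(\hat\psi + \hat\beta x_i)$, so the coefficients are on the rate scale per unit of exposure.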


  • Exponentially distributed lifetimes

    $$T_i \sim \alpha(t|x_i) = \exp(\psi + \beta' x_i)$$

    and cumulative hazard $A(t|x_i) = t \exp(\psi + \beta' x_i)$.

    With observations $\tilde T_i$ = right-censored lifetimes and $D_i$ = indicators for events, this gives the likelihood

    $$L = \prod_{i=1}^n \alpha(\tilde T_i|x_i)^{D_i} \exp(-A(\tilde T_i|x_i)) = \prod_{i=1}^n \exp(D_i(\psi + \beta' x_i)) \exp(-\tilde T_i \exp(\psi + \beta' x_i)),$$

    which is proportional to a Poisson likelihood under the assumption $D_i \sim \mathrm{Po}(\tilde T_i \exp(\psi + \beta' x_i))$.

    We may thus fit this model by Poisson regression!
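
    The equivalence can be verified on simulated data. With a single binary covariate the exponential model is saturated, so the Poisson fit should reproduce the group-wise occurrence/exposure rates. The data-generating values and names in this sketch are assumptions for illustration.

```r
# Exponential regression fitted as Poisson regression with offset log(observed time).
set.seed(5)
n    <- 400
x    <- rbinom(n, 1, 0.5)
Tlat <- rexp(n, rate = exp(-2 + 0.8 * x))     # assumed psi = -2, beta = 0.8
Cens <- runif(n, 0, 15)
Ttil <- pmin(Tlat, Cens)
D    <- as.numeric(Tlat <= Cens)

fit <- glm(D ~ x + offset(log(Ttil)), family = poisson)

# Closed-form occurrence/exposure rates in each group.
rate0 <- sum(D[x == 0]) / sum(Ttil[x == 0])
rate1 <- sum(D[x == 1]) / sum(Ttil[x == 1])
c(psi = log(rate0), beta = log(rate1 / rate0))   # compare with coef(fit)
```

    The event indicators are of course not Poisson distributed; only the likelihood has the same form, which is all that is needed for the estimates and their standard errors.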


  • Example: Melanoma data

    > summary(glm(dead~offset(log(lifetime))+ulcer+logthick+age+sex,
                  family=poisson))

    Deviance Residuals:
          Min         1Q     Median        3Q     Max
    -1.757198 -0.7833086 -0.4186261 0.5735247 2.35767

    Coefficients:
                      Value  Std. Error   z value
    (Intercept) -3.27166724 0.793644496 -4.122333
    ulcer       -0.96045856 0.324053150 -2.963892
    logthick     0.51104055 0.177549752  2.878295
    age          0.01301791 0.007918443  1.643999
    sex          0.34101496 0.270580076  1.260311

    (Dispersion Parameter for Poisson family taken to be 1)

    Null Deviance: 232.0768 on 204 degrees of freedom

    Residual Deviance: 188.029 on 200 degrees of freedom

    Number of Fisher Scoring Iterations: 5


  • Alternative to offset: Weighting

    Assume $Y_i \sim \mathrm{Po}(n_i \mu_i)$, so that $E[Y_i/n_i] = \mu_i$ and

    $$\mathrm{Var}\left[\frac{Y_i}{n_i}\right] = \frac{n_i \mu_i}{n_i^2} = \frac{\mu_i}{n_i}.$$

    We may alternatively fit the model with

    • Responses $Y_i/n_i$

    • Weights $w_i = n_i$ (in R: glm)

    • Link function $\log(\mu_i) = \psi + \beta' x_i$

    The glm routine in R does not require integer responses!

    It will also work for "quasi-survival responses" $D_i/\tilde T_i$ with weights $\tilde T_i$.
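
    The sketch below compares the offset and weighting formulations on simulated survival data. The simulation settings and names are assumptions; `suppressWarnings` is used because current versions of R warn about non-integer Poisson responses while still fitting the model.

```r
# Offset versus weighting for the exponential regression model.
set.seed(6)
n    <- 400
x    <- rnorm(n)
Tlat <- rexp(n, rate = exp(-1.5 + 0.4 * x))   # assumed psi = -1.5, beta = 0.4
Cens <- runif(n, 0, 10)
Ttil <- pmin(Tlat, Cens)
D    <- as.numeric(Tlat <= Cens)

# Formulation 1: response D_i with offset log(T~_i).
fit_offset <- glm(D ~ x + offset(log(Ttil)), family = poisson)

# Formulation 2: response D_i / T~_i with weights T~_i.
fit_weight <- suppressWarnings(
  glm(I(D / Ttil) ~ x, family = poisson, weights = Ttil)
)

max_diff <- max(abs(coef(fit_offset) - coef(fit_weight)))
```

    Both formulations solve the same score equations, so `max_diff` should be zero up to the convergence tolerance of the iterative fitting.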


  • Example: Melanoma data, weighting (compare pg. 19)

    > summary(glm(I(dead/lifetime)~ulcer+logthick+age+sex,
                  family=poisson,weight=lifetime))

    Deviance Residuals:
          Min         1Q     Median        3Q      Max
    -1.757189 -0.7833315 -0.4186496 0.5735243 2.357629

    Coefficients:
                      Value  Std. Error   t value
    (Intercept) -3.27167476 0.792273252 -4.129478
    ulcer       -0.96041516 0.323111908 -2.972392
    logthick     0.51101978 0.177128703  2.885020
    age          0.01301783 0.007912787  1.645164
    sex          0.34101511 0.270233537  1.261927

    (Dispersion Parameter for Poisson family taken to be 1)

    Null Deviance: 232.0768 on 204 degrees of freedom

    Residual Deviance: 188.029 on 200 degrees of freedom

    Number of Fisher Scoring Iterations: 5


  • Additive exponential regression model:

    $$T_i \sim \alpha(t|x_i) = \psi + \beta' x_i$$

    With censored data $(\tilde T_i, D_i)$ we get a likelihood corresponding to

    • $D_i \sim \mathrm{Po}(\tilde T_i \mu_i)$

    • where $\mu_i = \psi + \beta' x_i$

    • and such that $E[D_i/\tilde T_i] = \mu_i$ and $\mathrm{Var}[D_i/\tilde T_i] = \tilde T_i \mu_i / \tilde T_i^2 = \mu_i / \tilde T_i$.

    It is thus possible to fit this survival model by glm, with weighting and identity link.

    However: with identity link, R needs a modification of responses $D_i/\tilde T_i = 0$; we need to make a new response $D_i' = D_i + \epsilon_i$ for small $\epsilon_i$, e.g. 0.0001.

    This may still be unstable; in the example on the next page I needed to use grthick instead of logthick and omit age.
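
    A sketch of the additive fit on simulated data, following the recipe above with a perturbed response. The simulation settings and names are assumptions; whether the perturbation is actually required may depend on the software version, and it is included here only to mirror the description. With one binary covariate the model is saturated, so the coefficients can be checked against group-wise rates.

```r
# Additive exponential model via glm with identity link and weights.
set.seed(7)
n    <- 400
x    <- rbinom(n, 1, 0.5)
Tlat <- rexp(n, rate = 0.1 + 0.15 * x)        # assumed psi = 0.1, beta = 0.15
Cens <- runif(n, 0, 10)
Ttil <- pmin(Tlat, Cens)
D    <- as.numeric(Tlat <= Cens)
Dx   <- D + 1e-4                              # perturbed response D' = D + epsilon

fit_add <- suppressWarnings(
  glm(I(Dx / Ttil) ~ x, family = poisson(link = "identity"), weights = Ttil)
)

# Group-wise rates based on the perturbed responses.
rate0 <- sum(Dx[x == 0]) / sum(Ttil[x == 0])
rate1 <- sum(Dx[x == 1]) / sum(Ttil[x == 1])
c(psi = rate0, beta = rate1 - rate0)          # compare with coef(fit_add)
```

    With continuous covariates or negative effects the fitted hazard $\psi + \beta' x_i$ can become non-positive during the iterations, which is one source of the instability mentioned above.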


  • Example: Melanoma data, additive exponential model

    > deadx <- ...
    > summary(glm(I(deadx/lifetime)~ulcer+factor(grthick)+sex,
                  family=poisson(link=identity),weight=lifetime))

    Deviance Residuals:
          Min         1Q     Median        3Q      Max
    -1.605446 -0.8411342 -0.3830547 0.6907776 2.426813

    Coefficients:
                           Value Std. Error   t value
    (Intercept)       0.07537012 0.03424506  2.200905
    ulcer            -0.04302607 0.01545833 -2.783358
    factor(grthick)2  0.03898369 0.01583326  2.462139
    factor(grthick)3  0.03337742 0.02378281  1.403426
    sex               0.01950260 0.01178936  1.654254

    (Dispersion Parameter for Poisson family taken to be 1)

    Null Deviance: 232.0459 on 204 degrees of freedom

    Residual Deviance: 192.2642 on 200 degrees of freedom

    Number of Fisher Scoring Iterations: 5


  • While we are at it: Exponential model, square root link