Post on 28-Jun-2020
Parametric regression models
STK4080 H16
1. Likelihood for censored data
2. Parametric regression models
3. Poisson regression models
4. Accelerated failure time models
5. Martingale formulation of parametric survival models
Parametric regression models – p. 1/40
Parametric models without covariates
Let $T_i$ have hazard $\alpha(t|\theta)$, cumulative hazard $A(t|\theta)$, density $f(t|\theta)$ and survival function $S(t|\theta)$.
We observe right-censored data $(\tilde T_i, D_i)$, where $\tilde T_i = \min(C_i, T_i)$ and $D_i = I(\tilde T_i = T_i)$, and $C_i$ is a censoring time independent of $T_i$.
Such right-censored data have likelihood $L(\theta) \propto \prod_{i=1}^n L_i(\theta)$, where the likelihood contributions equal
$$L_i(\theta) = f(\tilde T_i|\theta)^{D_i} S(\tilde T_i|\theta)^{1-D_i} = \alpha(\tilde T_i|\theta)^{D_i} \exp(-A(\tilde T_i|\theta)).$$
Under standard assumptions the MLE $\hat\theta$ that maximizes $L(\theta)$ is approximately
$$\hat\theta \sim N(\theta, I(\hat\theta)^{-1}),$$
where $I(\theta) = -\partial^2 \log L(\theta) / \partial\theta^2$ is the information matrix.
Example: Exponential lifetimes
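If the true survival times $T_i \sim \exp(\nu)$ and we have right-censored data $(\tilde T_i, D_i)$ with $\tau$ as the maximal observation time, we get the likelihood
$$L = \prod_{i=1}^n \nu^{D_i} \exp(-\tilde T_i \nu) = \nu^{N_\bullet(\tau)} \exp(-\nu R(\tau)),$$
where $N_\bullet(\tau) = \sum_{i=1}^n D_i$ is the number of observed events (occurrence) and $R(\tau) = \sum_{i=1}^n \tilde T_i$ is the total exposure time.
It then follows that the MLE of $\nu$ equals
$$\hat\nu = \frac{N_\bullet(\tau)}{R(\tau)} = \frac{\text{Occurrence}}{\text{Exposure}}$$
and the standard error of $\hat\nu$ is $1/\sqrt{I(\hat\nu)} = \hat\nu / \sqrt{N_\bullet(\tau)}$.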
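The occurrence/exposure estimate is easy to check numerically. The following is a minimal sketch on simulated data; the sample size, the true hazard and the uniform censoring distribution are assumptions made only for illustration and are not taken from the handout.

```r
# Simulated right-censored exponential lifetimes (all values are illustrative).
set.seed(1)
n        <- 500
nu       <- 0.2                       # assumed true hazard
tau      <- 10                        # maximal observation time
lifetime <- rexp(n, rate = nu)        # true survival times T_i
cens     <- runif(n, 0, tau)          # censoring times C_i, independent of T_i
tobs     <- pmin(lifetime, cens)      # observed times
d        <- as.numeric(lifetime <= cens)  # event indicators D_i

occurrence <- sum(d)                  # number of observed events
exposure   <- sum(tobs)               # total exposure time
nu_hat     <- occurrence / exposure   # MLE: occurrence / exposure
se_nu      <- nu_hat / sqrt(occurrence)

print(c(estimate = nu_hat, std.error = se_nu))
```

With these settings the estimate should land within a few standard errors of the assumed hazard 0.2.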
Piecewise constant hazard
$$\alpha(t) = \sum_{k=1}^K \theta_k I_k(t),$$
where $I_k(t) = 1$ on the interval $t_{k-1} < t \le t_k$ and zero otherwise, and $0 = t_0 < t_1 < \dots < t_K = \tau$.
It can then be shown that, with $O_k$ the number of events and $R_k$ the total observation time (exposure) in interval $k$, the MLE for $\theta_k$ is
$$\hat\theta_k = \frac{O_k}{R_k}.$$
Furthermore, the information matrix becomes diagonal, and in large samples the $\hat\theta_k$ are independent and normally distributed with standard errors $\hat\theta_k / \sqrt{O_k}$.
Also, the standard error of $\log(\hat\theta_k)$ becomes $1/\sqrt{O_k}$, and $\hat\theta_k \exp(\pm 1.96/\sqrt{O_k})$ is generally a preferable confidence interval.
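The interval-wise events and exposures can be tabulated directly. This is a sketch on simulated data; the cut points, the hazard and the censoring scheme are assumptions chosen for illustration.

```r
# Piecewise constant hazard: events O_k and exposure R_k per interval.
set.seed(2)
n        <- 1000
cuts     <- c(0, 2, 5, 10)             # assumed partition t_0 < t_1 < t_2 < t_3
lifetime <- rexp(n, rate = 0.15)       # assumed true lifetimes
cens     <- runif(n, 0, 10)            # censoring before tau = 10
tobs     <- pmin(lifetime, cens)
d        <- as.numeric(lifetime <= cens)

K <- length(cuts) - 1
O <- numeric(K)
R <- numeric(K)
for (k in seq_len(K)) {
  lo <- cuts[k]
  hi <- cuts[k + 1]
  R[k] <- sum(pmax(0, pmin(tobs, hi) - lo))     # time spent in (lo, hi]
  O[k] <- sum(d == 1 & tobs > lo & tobs <= hi)  # events in (lo, hi]
}

theta_hat <- O / R
lower <- theta_hat * exp(-1.96 / sqrt(O))       # interval built on the log scale
upper <- theta_hat * exp( 1.96 / sqrt(O))
print(data.frame(interval = seq_len(K), O, R, theta_hat, lower, upper))
```

Building the interval on the log scale keeps the lower limit positive, which a symmetric interval around $\hat\theta_k$ does not guarantee when $O_k$ is small.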
Parametric regression models, in general
We will assume that the distribution of $T_i$ depends on a linear predictor $\beta' x_i$ and in addition on a parameter $\theta$.
The distribution may always be specified via the hazard, thus
$$T_i \sim \alpha(t|\beta' x_i, \theta),$$
i.e. the hazard for individual $i$. With cumulative hazard $A(t|\beta' x_i, \theta)$ we may express the likelihood contribution from individual $i$ by
$$L_i(\beta, \theta) \propto \alpha(\tilde T_i|\beta' x_i, \theta)^{D_i} \exp(-A(\tilde T_i|\beta' x_i, \theta)),$$
where $\tilde T_i$ is the observed (possibly censored) time and we assume that $T_i$ and $C_i$ are independent given $x_i$.
Classes of regression models
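• Proportional hazards: $\alpha(t|\beta' x, \theta) = \exp(\beta' x)\,\alpha_0(t|\theta)$
• Additive hazards: $\alpha(t|\beta' x, \theta) = \beta' x + \alpha_0(t|\theta)$
• Accelerated failure time: $\alpha(t|\beta' x, \theta) = \exp(\beta' x)\,\alpha_0(\exp(\beta' x)\,t|\theta)$
• Translation models: $\alpha(t|\beta' x, \theta) = \alpha_0(t + \beta' x|\theta)$
• and many more!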
Proportional hazards model
1. Semi-parametric method is the most common
(Cox-regression)
2. Poisson-regression is a numerically simpler variation
3. Via accelerated failure time models if the baseline is
Weibull! (or exponential)
4. In general by likelihood-optimization.
Have discussed 1. thoroughly, will consider 2. and 3.
Additive hazard models
$$\alpha(t|\beta' x, \theta) = \beta' x + \alpha_0(t|\theta)$$
1. Semi-parametric: the Lin & Ying model
2. Special case of Aalen's additive model, $\alpha(t|\beta' x, \theta) = \beta(t)' x + \beta_0(t|\theta)$
3. Parametric models are possible, but not programmed?
4. Poisson regression is possible (Breslow, 1987) with piecewise constant baseline
We have discussed 2. and mentioned 1. Here we will briefly consider 4.
Poisson-regression
Assume $Y_i \sim \mathrm{Po}(n_i \exp(\psi + \beta' x_i))$. This generates the likelihood
$$L = \prod_{i=1}^n \left[ \frac{n_i^{Y_i} \exp(Y_i(\psi + \beta' x_i))}{Y_i!} \exp(-n_i \exp(\psi + \beta' x_i)) \right] \propto \prod_{i=1}^n \left[ \exp(Y_i(\psi + \beta' x_i)) \exp(-n_i \exp(\psi + \beta' x_i)) \right],$$
which may be maximized with a program for Poisson regression, in R as
• Generalized linear model: glm
• with Poisson family: family=poisson
• Need an "offset" for $\log(n_i)$ (or alternatively use weighting)
• May also fit other link functions than the default log link, e.g. the additive model $E[Y_i] = n_i(\psi + \beta' x_i)$
Exponentially distributed lifetimes
$$T_i \sim \alpha(t|x_i) = \exp(\psi + \beta' x_i)$$
with cumulative hazard $A(t|x_i) = t \exp(\psi + \beta' x_i)$.
With observations $\tilde T_i$ = right-censored lifetimes and $D_i$ = indicators for events, the likelihood is
$$L = \prod_{i=1}^n \alpha(\tilde T_i|x_i)^{D_i} \exp(-A(\tilde T_i|x_i)) = \prod_{i=1}^n \exp(D_i(\psi + \beta' x_i)) \exp(-\tilde T_i \exp(\psi + \beta' x_i)),$$
which is proportional to a Poisson likelihood under the assumption $D_i \sim \mathrm{Po}(\tilde T_i \exp(\psi + \beta' x_i))$.
We may thus fit this model by Poisson regression!
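The equivalence can be checked on simulated data by fitting the same exponential model twice, once with glm and once with survreg from the survival package. This is a sketch under assumed parameter values; note that survreg parametrizes $\log T$, so its coefficients carry the opposite sign.

```r
library(survival)

# Simulated exponential regression data (parameter values are illustrative).
set.seed(3)
n        <- 400
x        <- rbinom(n, 1, 0.5)
psi      <- -2                        # assumed intercept on the log-hazard scale
beta     <- 0.7                       # assumed log-hazard ratio
lifetime <- rexp(n, rate = exp(psi + beta * x))
cens     <- runif(n, 0, 15)
tobs     <- pmin(lifetime, cens)
d        <- as.numeric(lifetime <= cens)

# Poisson regression with log(exposure) as offset
pois_fit <- glm(d ~ x + offset(log(tobs)), family = poisson)

# The same likelihood through the accelerated failure time parametrization
aft_fit <- survreg(Surv(tobs, d) ~ x, dist = "exponential")

print(rbind(poisson = coef(pois_fit), minus_survreg = -coef(aft_fit)))
```

The two rows should agree up to the convergence tolerance of the two fitting routines.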
Example: Melanoma data
> summary(glm(dead~offset(log(lifetime))+ulcer+logthick+age+sex,
      family=poisson))
Deviance Residuals:
       Min         1Q     Median        3Q     Max
 -1.757198 -0.7833086 -0.4186261 0.5735247 2.35767
Coefficients:
                  Value  Std. Error   z value
(Intercept) -3.27166724 0.793644496 -4.122333
ulcer       -0.96045856 0.324053150 -2.963892
logthick     0.51104055 0.177549752  2.878295
age          0.01301791 0.007918443  1.643999
sex          0.34101496 0.270580076  1.260311
(Dispersion Parameter for Poisson family taken to be 1)
Null Deviance: 232.0768 on 204 degrees of freedom
Residual Deviance: 188.029 on 200 degrees of freedom
Number of Fisher Scoring Iterations: 5
Alternative to offset: Weighting
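Assume $Y_i \sim \mathrm{Po}(n_i \mu_i)$ so that $E[Y_i/n_i] = \mu_i$ and
$$\mathrm{Var}\left[\frac{Y_i}{n_i}\right] = \frac{n_i \mu_i}{n_i^2} = \frac{\mu_i}{n_i}.$$
We may alternatively fit the model by
• Responses $Y_i/n_i$
• Weights $w_i = n_i$ (in R: glm)
• Link function $\log(\mu_i) = \psi + \beta' x_i$
The glm routine in R does not require integer responses!
It will also work for "quasi-survival responses" $D_i/\tilde T_i$ with weights $\tilde T_i$.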
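That the offset and the weighting formulations give the same estimates can be verified on simulated counts. This is a sketch with assumed exposures and coefficients; glm warns about non-integer responses in the weighted fit, so the warnings are suppressed here.

```r
# Poisson regression: offset versus weighting (simulated, illustrative values).
set.seed(4)
m   <- 200
x   <- rnorm(m)
n_i <- runif(m, 1, 20)                       # assumed exposures n_i
y   <- rpois(m, n_i * exp(-1 + 0.5 * x))     # counts with rate exp(-1 + 0.5 x)

# Formulation 1: counts as response, log(n_i) as offset
fit_offset <- glm(y ~ x + offset(log(n_i)), family = poisson)

# Formulation 2: rates y / n_i as response, n_i as weights
fit_weight <- suppressWarnings(
  glm(I(y / n_i) ~ x, family = poisson, weights = n_i)
)

print(rbind(offset = coef(fit_offset), weighted = coef(fit_weight)))
```

Both fits solve the same score equations, $\sum_i (Y_i - n_i\mu_i) x_i = 0$, which is why the coefficients coincide.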
Example: Melanoma data, weighting (compare with the offset fit)
> summary(glm(I(dead/lifetime)~ulcer+logthick+age+sex,
      family=poisson,weights=lifetime))
Deviance Residuals:
       Min         1Q     Median        3Q      Max
 -1.757189 -0.7833315 -0.4186496 0.5735243 2.357629
Coefficients:
                  Value  Std. Error   t value
(Intercept) -3.27167476 0.792273252 -4.129478
ulcer       -0.96041516 0.323111908 -2.972392
logthick     0.51101978 0.177128703  2.885020
age          0.01301783 0.007912787  1.645164
sex          0.34101511 0.270233537  1.261927
(Dispersion Parameter for Poisson family taken to be 1)
Null Deviance: 232.0768 on 204 degrees of freedom
Residual Deviance: 188.029 on 200 degrees of freedom
Number of Fisher Scoring Iterations: 5
Additive exponential regression model:
$$T_i \sim \alpha(t|x_i) = \psi + \beta' x_i$$
With censored data $(\tilde T_i, D_i)$ we get a likelihood corresponding to
• $D_i \sim \mathrm{Po}(\tilde T_i \mu_i)$
• where $\mu_i = \psi + \beta' x_i$
• and such that $E[D_i/\tilde T_i] = \mu_i$ and $\mathrm{Var}[D_i/\tilde T_i] = \tilde T_i \mu_i / \tilde T_i^2 = \mu_i / \tilde T_i$
It is thus possible to fit this survival model by glm, with weighting and the identity link.
However: with the identity link R needs a modification of the responses $D_i/\tilde T_i = 0$. We need to make new responses $D_i' = D_i + \varepsilon_i$ for small $\varepsilon_i$, e.g. 0.0001.
This may still be unstable. In the example that follows I needed to use grthick instead of logthick and omit age.
Example: Melanoma data, additive exponential model
> deadx<-dead+0.00001
> summary(glm(I(deadx/lifetime)~ulcer+factor(grthick)+sex,
      family=poisson(link=identity),weights=lifetime))
Deviance Residuals:
       Min         1Q     Median        3Q      Max
 -1.605446 -0.8411342 -0.3830547 0.6907776 2.426813
Coefficients:
                       Value Std. Error   t value
(Intercept)       0.07537012 0.03424506  2.200905
ulcer            -0.04302607 0.01545833 -2.783358
factor(grthick)2  0.03898369 0.01583326  2.462139
factor(grthick)3  0.03337742 0.02378281  1.403426
sex               0.01950260 0.01178936  1.654254
(Dispersion Parameter for Poisson family taken to be 1)
Null Deviance: 232.0459 on 204 degrees of freedom
Residual Deviance: 192.2642 on 200 degrees of freedom
Number of Fisher Scoring Iterations: 5
While we are at it: Exponential model, square root link!
> deadx<-dead+0.00001
> summary(glm(I(deadx/lifetime)~ulcer+logthick+age+sex,
      family=poisson(link=sqrt),weights=lifetime))
Deviance Residuals:
       Min         1Q     Median        3Q      Max
 -1.678628 -0.7994057 -0.3830467 0.5879871 2.578291
Coefficients:
                   Value   Std. Error   t value
(Intercept)  0.174134346 0.0856882213  2.032185
ulcer       -0.101541779 0.0336639635 -3.016335
logthick     0.053093298 0.0168588388  3.149286
age          0.001679106 0.0008935945  1.879047
sex          0.054402522 0.0305622598  1.780056
(Dispersion Parameter for Poisson family taken to be 1)
Null Deviance: 232.0459 on 204 degrees of freedom
Residual Deviance: 187.0307 on 200 degrees of freedom
Number of Fisher Scoring Iterations: 6
Piecewise constant hazards
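We may also use Poisson regression under the assumption of piecewise constant hazards
$$\alpha(t|x) = \exp(\beta' x + \theta_j) \quad \text{when } t_{j-1} < t \le t_j,$$
where $0 = t_0 < t_1 < \dots < t_J$ is a partition of the positive real numbers.
With $\tilde T_i$ the observed (possibly censored) time, let
• $T_{ij} = t_j - t_{j-1}$ when $\tilde T_i > t_j$; $T_{ij} = \tilde T_i - t_{j-1}$ when $t_{j-1} < \tilde T_i \le t_j$; $T_{ij} = 0$ when $\tilde T_i \le t_{j-1}$
• $D_{ij} = D_i\, I(t_{j-1} < \tilde T_i \le t_j)$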
Piecewise constant hazards, contd.
Thus $T_{ij}$ = "exposure time" for individual $i$ in the interval $(t_{j-1}, t_j]$ and $D_{ij}$ = indicator for an event for individual $i$ in the interval $(t_{j-1}, t_j]$.
The likelihood for the data becomes, with $\alpha_j = \exp(\theta_j)$ and $A(t|x)$ = cumulative hazard with covariate $x$,
$$L = \prod_{i=1}^n \alpha(\tilde T_i|x_i)^{D_i} \exp(-A(\tilde T_i|x_i)) = \prod_{i=1}^n \prod_{j=1}^J \left[ \exp(\theta_j + \beta' x_i)^{D_{ij}} \exp(-T_{ij} \exp(\theta_j + \beta' x_i)) \right].$$
This is proportional to a likelihood for
$$D_{ij} \sim \mathrm{Po}(T_{ij} \exp(\theta_j + \beta' x_i)),$$
and we may again use Poisson regression to fit the model as long as we include a factor variable for interval.
Melanoma data, 2 intervals
> dead2int<-c(dead*(lifetime<=3),dead[lifetime>3])
> lifetime2<-c(pmin(lifetime,3),lifetime[lifetime>3]-3)
> intervall<-c(rep(1,length(lifetime)),rep(2,sum(lifetime>3)))
> ulcer2<-c(ulcer,ulcer[lifetime>3])
> logthick2<-c(logthick,logthick[lifetime>3])
> sex2<-c(sex,sex[lifetime>3])
> age2<-c(age,age[lifetime>3])
> summary(glm(dead2int~offset(log(lifetime2))+intervall+ulcer2
      +logthick2+age2+sex2,family=poisson),cor=F)
Coefficients:
                  Value  Std. Error    t value
(Intercept) -3.22116888 0.918476470 -3.5070783
intervall   -0.02941870 0.270372123 -0.1088082
ulcer2      -0.95843701 0.324402947 -2.9544646
logthick2    0.51036993 0.177491307  2.8754644
age2         0.01287056 0.008027254  1.6033572
sex2         0.34053830 0.270448130  1.2591631
(Dispersion Parameter for Poisson family taken to be 1)
Null Deviance: 293.0395 on 371 degrees of freedom
Residual Deviance: 248.9799 on 366 degrees of freedom
Number of Fisher Scoring Iterations: 5
More piecewise constant hazard
When we let the interval lengths $t_j - t_{j-1}$ become small we get a very flexible, almost semi-parametric, model.
In fact Cox regression is the limit when all $t_j - t_{j-1} \to 0$ (Breslow, 1972).
But since Poisson regression allows for more link functions, we also get alternatives to Cox regression.
Aggregated data and piecewise constant hazard
Assume that the covariate vector $x$ may only attain a finite number of values $z_1, z_2, \dots, z_K$ (e.g. only categorical covariates).
Then we may aggregate the data to
• Total exposure time in interval $(t_{j-1}, t_j]$ with covariate $z_k$: $T_{\bullet j, z_k} = \sum_{i: x_i = z_k} T_{ij}$
• Total number of events in interval $(t_{j-1}, t_j]$ with covariate $z_k$: $D_{\bullet j, z_k} = \sum_{i: x_i = z_k} D_{ij}$
We may also express the likelihood as
$$L = \prod_{z_k} \prod_{j=1}^J \left[ \exp(\theta_j + \beta' z_k)^{D_{\bullet j, z_k}} \exp(-T_{\bullet j, z_k} \exp(\theta_j + \beta' z_k)) \right],$$
thus as proportional to the likelihood for Poisson data $D_{\bullet j, z_k} \sim \mathrm{Po}(T_{\bullet j, z_k} \exp(\theta_j + \beta' z_k))$.
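The effect of aggregation can be illustrated on simulated data: the individual-level and the aggregated Poisson fits give the same coefficients. This is a sketch with one binary covariate and two intervals; the cut point at 3 and all parameter values are assumptions for illustration.

```r
# Piecewise exponential model: individual records versus aggregated cells.
set.seed(5)
n        <- 600
x        <- rbinom(n, 1, 0.4)
lifetime <- rexp(n, rate = exp(-2 + 0.6 * x))   # assumed true model
cens     <- runif(n, 0, 12)
tobs     <- pmin(lifetime, cens)
d        <- as.numeric(lifetime <= cens)
t_cut    <- 3                                   # assumed interval boundary

# One record per individual and interval at risk
late <- tobs > t_cut
long <- data.frame(
  events   = c(d * (tobs <= t_cut), d[late]),
  expo     = c(pmin(tobs, t_cut), tobs[late] - t_cut),
  interval = factor(c(rep(1, n), rep(2, sum(late)))),
  x        = c(x, x[late])
)
fit_ind <- glm(events ~ interval + x + offset(log(expo)),
               family = poisson, data = long)

# Aggregate events and exposure within each (interval, covariate) cell
agg <- aggregate(cbind(events, expo) ~ interval + x, data = long, FUN = sum)
fit_agg <- glm(events ~ interval + x + offset(log(expo)),
               family = poisson, data = agg)

print(agg)
print(rbind(individual = coef(fit_ind), aggregated = coef(fit_agg)))
```

Here more than 800 individual records collapse to four cells without changing the estimates, which is what makes the approach attractive for very large data sets.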
Use for Poisson-regression
Previously it was not possible to use Cox regression on large data sets (e.g. n > 500,000).
With a lot of time-dependent covariates this continues to be a problem.
However, by first using programs for aggregating the data, one may instead use Poisson regression.
The future use of Poisson regression for survival data lies in the flexibility these models offer with respect to link functions, random component models (frailty), smoothing techniques and multiple time scales.
Ex. Traditional use of Poisson-regression
Samuelsen, Magnus & Bakketeig, 1998, "Birth weight and mortality in childhood in Norway", AJE:
• ca. 1,250,000 children born in Norway, 1967-1990.
• Follow-up from 1 to 15 years of age, or until 1992.
• Covariates:
• Birth weight (≤ 2500 g, > 2500 g)
• Length of pregnancy (< 37 or ≥ 37 weeks)
• Sex
• Maternal age (grouped)
• Previous births (yes/no, parity)
• Birth cohort: 1967-1975, 1976-1984, 1985-1990.
• Age (one-year intervals) = time variable
Results for cancer mortality
                       Value Std. Error      t value
(Intercept)     -9.487010362 0.14513937 -65.36483242
vektkat < 5     -1.296055900 0.39594733  -3.27330379
factor(kjonn)   -0.297092337 0.07739885  -3.83845908
factor(koho)2   -0.309147054 0.08575119  -3.60516350
factor(koho)3   -0.864394863 0.20015213  -4.31868928
factor(pari)    -0.039848282 0.08261364  -0.48234509
factor(mald)2    0.013457555 0.15636110   0.08606716
factor(mald)3    0.274168093 0.12765561   2.14771672
factor(gest)2    0.038185223 0.21835460   0.17487712
factor(gest)3   -0.230881244 0.19316189  -1.19527326
factor(alder)2   0.157311796 0.17058730   0.92217766
factor(alder)3   0.234812145 0.16865818   1.39223694
factor(alder)4   0.165260296 0.17258180   0.95757660
factor(alder)5   0.188845360 0.17291289   1.09214161
factor(alder)6  -0.007195062 0.18305818  -0.03930478
factor(alder)7  -0.234062467 0.19694164  -1.18848644
factor(alder)8  -0.338458601 0.20586398  -1.64408849
factor(alder)9  -0.712831213 0.23678818  -3.01041726
factor(alder)10 -0.358587244 0.21339711  -1.68037532
factor(alder)11 -0.616976682 0.23701617  -2.60309957
factor(alder)12 -0.350074062 0.22032205  -1.58891980
Lexis diagrams, time scales
Born from 1967, with time until event or censoring following an exponential distribution.
Every person is represented as a line through the diagram, with age on the x-axis and calendar time on the y-axis:
[Figure: Lexis diagram. x-axis: age (0 to 25 years); y-axis: calendar time (1970 to 2000).]
We may read off from the figure the exposure time in 5-year age intervals and 5-year calendar periods.
Age-period-cohort (APC) problem
For one point in the Lexis-diagram
• x = age
• y = calendar time
• z = year of birth
Then $z + x = y$, thus a perfect linear dependency.
It would be tempting to include all of $x$, $y$ and $z$ as covariates. However, we are not able to use all of them in one model without further restrictions.
The APC problem is thus non-identifiable.
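The linear dependency shows up directly in the design matrix. The following sketch uses a small made-up grid of ages and birth years and an arbitrary response, purely to illustrate the rank deficiency.

```r
# Age, period and cohort are linearly dependent: period = birth + age.
age    <- rep(0:4, times = 5)            # x: age in years
birth  <- rep(1967:1971, each = 5)       # z: year of birth
period <- birth + age                    # y: calendar time

X <- cbind(intercept = 1, age, period, birth)
print(qr(X)$rank)                        # rank of a 4-column design matrix

# A linear model cannot estimate all three effects; lm marks one as NA.
set.seed(7)
response <- rnorm(length(age))           # arbitrary response for illustration
fit <- lm(response ~ age + period + birth)
print(coef(fit))
```

The design matrix has four columns but only rank three, so one of the three effects has to be dropped or constrained before the model can be fitted.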
Accelerated failure time models
$$\alpha(t|\beta' x, \theta) = \exp(\beta' x)\,\alpha_0(\exp(\beta' x)\,t|\theta)$$
1. Will only discuss fully parametric models $\alpha_0(t|\theta)$
2. Semi-parametric methods exist
Translation models: $\alpha(t|\beta' x, \theta) = \alpha_0(t + \beta' x|\theta)$
1. Possible in R
2. Semi-parametric methods exist
Characterizations of accelerated failure time models
Characterization 1. Uncensored survival time $T$:
$$Y = \log(T) = \mu + \gamma' x + \sigma W$$
where
• $W$ is a (standardized) random variable with a specified distribution
• $\sigma$ is the degree of variation in $Y$ compared to $W$
• $\mu$ is the center of the distribution of $Y$ if $x = 0$ or $\gamma = 0$.
This is thus a log-linear regression model for $T$, but contrary to standard linear regression models we assume a particular distribution for the error term $\sigma W$.
Alternative characterization
With $S_0()$ the survival function of $\exp(\sigma W + \mu)$ (i.e. the survival function with $x = 0$) we get
$$P(T > t) = P(\exp(\sigma W + \mu + \gamma' x) > t) = S_0(t \exp(-\gamma' x)).$$
Thus the time scale for $T$ given $x$ equals the time scale for $T$ given $x = 0$ multiplied by $\exp(-\gamma' x)$.
We may call $\exp(-\gamma' x) = \exp(\beta' x)$ an acceleration factor (where $\beta = -\gamma$).
Characterization 3. of Acc. failure time models
With $A_0()$ the cumulative hazard for $\exp(\sigma W + \mu)$ we may alternatively write
$$P(T > t) = S_0(t \exp(\beta' x)) = \exp(-A_0(t \exp(\beta' x))),$$
which leads to the hazard for $T$ given $x$,
$$\alpha(t|x) = \exp(\beta' x)\,\alpha_0(t \exp(\beta' x)),$$
where $\alpha_0(t)$ is the hazard of $\exp(\sigma W + \mu)$.
Note: R will report $\gamma = -\beta$.
For accelerated failure time models specify
the distribution of the error term $\sigma W$. The following are implemented in R:
• $\sigma W$ extreme-value (Gumbel) distributed and $\exp(\sigma W)$ Weibull distributed
• Special case: $\exp(W)$ exponentially distributed, $\sigma = 1$.
• $\sigma W$ is logistic and $\exp(\sigma W)$ log-logistic
• $\sigma W$ is normal (Gaussian) and $\exp(\sigma W)$ log-normal
The Weibull is the default for these parametric survival models in R.
Acc. failure time mod. fitted by command survreg
> survreg(Surv(lifetime,dead)~ulcer+logthick+age+sex,data=mel)
Call:
survreg(formula = Surv(lifetime, dead) ~ ulcer + logthick + age + sex, data = mel)
Coefficients:
 (Intercept)     ulcer   logthick        age        sex
    3.119864 0.8198126 -0.4289668 -0.0129206 -0.2883985
Scale= 0.8231279
Loglik(model)= -207.8 Loglik(intercept only)= -230.9
Chisq= 46.23 on 4 degrees of freedom, p= 2.2e-09
n= 205
Note: opposite sign compared to Cox regression, $\gamma = -\beta$!
More info from summary: Weibull = default
> summary(survreg(Surv(lifetime,dead)~ulcer+logthick+age+sex,
      dist="weibull"))
              Value Std. Error     z        p
(Intercept)  3.1199    0.65819  4.74 2.14e-06
ulcer        0.8198    0.27659  2.96 3.04e-03
logthick    -0.4290    0.15343 -2.80 5.18e-03
age         -0.0129    0.00655 -1.97 4.84e-02
sex         -0.2884    0.22478 -1.28 1.99e-01
Log(scale)  -0.1946    0.11522 -1.69 9.12e-02
Scale= 0.823
Weibull distribution
Loglik(model)= -207.8 Loglik(intercept only)= -230.9
Chisq= 46.23 on 4 degrees of freedom, p= 2.2e-09
Number of Newton-Raphson Iterations: 6
n= 205
Exponential
> summary(survreg(Surv(lifetime,dead)~ulcer+logthick+age+sex,
      dist="exponential",data=mel))
             Value Std. Error     z        p
(Intercept)  3.272    0.79404  4.12 3.78e-05
ulcer        0.960    0.32419  2.96 3.05e-03
logthick    -0.511    0.17762 -2.88 4.02e-03
age         -0.013    0.00792 -1.64 1.00e-01
sex         -0.341    0.27070 -1.26 2.08e-01
Scale fixed at 1
Exponential distribution
Loglik(model)= -209.1 Loglik(intercept only)= -231.1
Chisq= 44.05 on 4 degrees of freedom, p= 6.3e-09
We may here test whether the exponential model is sufficient.
LRT = 2([Log-lik. Weibull] - [Log-lik. Exponential]) = 2(-207.8 - (-209.1)) = 2.6, i.e. insignificant (comparing with $\chi^2_1$).
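The likelihood ratio test can be reproduced from the two reported log-likelihoods. A minimal sketch; the variable names are made up for illustration, and the log-likelihood values are the rounded ones printed in the output.

```r
# Likelihood ratio test of exponential (scale fixed at 1) versus Weibull.
loglik_weibull     <- -207.8
loglik_exponential <- -209.1

lrt     <- 2 * (loglik_weibull - loglik_exponential)
p_value <- pchisq(lrt, df = 1, lower.tail = FALSE)   # one extra parameter

print(c(LRT = lrt, p.value = p_value))
```

The p-value is above 0.05, in line with the conclusion that the exponential model is not rejected.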
Accelerated failure times with Weibull distributed T
If $\exp(\sigma W + \mu)$ has a Weibull distribution with hazard $\alpha_0(t) = b t^{k-1}$, then the hazard for $T$ with covariate $x$ becomes
$$\alpha(t|x) = \exp(\beta' x)\,\alpha_0(t \exp(\beta' x)) = b \exp(k \beta' x)\, t^{k-1}$$
and so $T$ is also Weibull distributed. More importantly,
$$\alpha(t|x) = \exp(k \beta' x)\,\alpha_0(t),$$
i.e. we have a proportional hazards model for $T$ with regression parameter $k\beta$.
There is in fact an equivalence here: if the distribution of the $T$'s can be represented both by a proportional hazards and an accelerated failure time model, then $T$ is Weibull distributed (Cox & Oakes, 1984, Analysis of survival data).
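The step from the accelerated failure time form to the proportional hazards form for the Weibull baseline can be written out term by term, using only the definitions above:

```latex
\begin{align*}
\alpha(t|x) &= \exp(\beta' x)\,\alpha_0\bigl(t\exp(\beta' x)\bigr) \\
            &= \exp(\beta' x)\, b\,\bigl(t\exp(\beta' x)\bigr)^{k-1} \\
            &= b\,\exp(\beta' x)\exp\bigl((k-1)\beta' x\bigr)\, t^{k-1} \\
            &= b\,\exp(k\beta' x)\, t^{k-1}
             = \exp(k\beta' x)\,\alpha_0(t).
\end{align*}
```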
About parametrization.
After fitting an accelerated failure time model we may translate the results into log-hazard ratios by $\beta = -\gamma/\sigma$, where $\gamma$ are the estimated regression parameters in the accelerated failure time model and $\sigma$ is the scale estimate:
> survregfit<-survreg(Surv(lifetime,dead)~ulcer+logthick+age+sex)
> -survregfit$coef[2:5]/survregfit$scale
      ulcer  logthick        age      sex
 -0.9959723 0.5211423 0.01569695 0.350369
> coxfit<-coxph(Surv(lifetime,dead)~ulcer+logthick+age+sex)
> coxfit$coef
      ulcer  logthick        age      sex
 -0.9436516 0.5549527 0.01145059 0.363405
Thus there is a good correspondence between the semi-parametric and parametric estimates.
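The same translation can be tried on simulated Weibull data where the true log-hazard ratio is known. This is a sketch under assumed values (shape 1.5, log-hazard ratio 0.7, uniform censoring); it does not use the melanoma data.

```r
library(survival)

# Weibull proportional hazards data: hazard exp(beta_ph * x) * k * t^(k - 1).
set.seed(6)
n        <- 2000
x        <- rbinom(n, 1, 0.5)
k        <- 1.5                        # assumed Weibull shape
beta_ph  <- 0.7                        # assumed log-hazard ratio
# Inversion: cumulative hazard exp(beta_ph * x) * t^k equals an Exp(1) draw
lifetime <- (rexp(n) / exp(beta_ph * x))^(1 / k)
cens     <- runif(n, 0, 3)
tobs     <- pmin(lifetime, cens)
d        <- as.numeric(lifetime <= cens)

aft <- survreg(Surv(tobs, d) ~ x, dist = "weibull")
cox <- coxph(Surv(tobs, d) ~ x)

beta_from_aft <- unname(-coef(aft)["x"] / aft$scale)   # beta = -gamma / sigma
beta_from_cox <- unname(coef(cox)["x"])
print(c(true = beta_ph, weibull = beta_from_aft, cox = beta_from_cox))
```

Both estimates should be close to the assumed value 0.7, and the survreg scale should be close to 1/k.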
Likelihood and counting processes
We will now assume that the true survival times have hazard $\alpha_i(t; \theta)$, possibly for some regression model.
We may have left-truncated and right-censored observations with likelihood contributions
$$L_i(\theta) = \exp\left( \int \log(\alpha_i(t; \theta))\,dN_i(t) - \int Y_i(t)\,\alpha_i(t; \theta)\,dt \right)$$
(check that this corresponds to the formula for right-censored data $(\tilde T_i, D_i)$ on the first page of these handouts).
This gives a total log-likelihood
$$l(\theta) = \sum_{i=1}^n \log(L_i(\theta)) = \sum_{i=1}^n \left[ \int \log(\alpha_i(t; \theta))\,dN_i(t) - \int Y_i(t)\,\alpha_i(t; \theta)\,dt \right].$$
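The check can be sketched as follows, assuming right-censored data without truncation, so that the counting process and the at-risk indicator are $N_i(t) = I(\tilde T_i \le t, D_i = 1)$ and $Y_i(t) = I(\tilde T_i \ge t)$ (this explicit form is an assumption added here, with $A_i$ denoting the cumulative hazard of $\alpha_i$):

```latex
\begin{align*}
\int \log(\alpha_i(t;\theta))\,dN_i(t) &= D_i \log \alpha_i(\tilde T_i;\theta), \\
\int Y_i(t)\,\alpha_i(t;\theta)\,dt &= \int_0^{\tilde T_i} \alpha_i(t;\theta)\,dt = A_i(\tilde T_i;\theta), \\
L_i(\theta) &= \exp\bigl( D_i \log \alpha_i(\tilde T_i;\theta) - A_i(\tilde T_i;\theta) \bigr)
             = \alpha_i(\tilde T_i;\theta)^{D_i} \exp\bigl(-A_i(\tilde T_i;\theta)\bigr).
\end{align*}
```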
Score function and martingale
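To keep notation simple we will consider only a scalar $\theta$. The score function then becomes
$$U(\theta) = \frac{\partial l(\theta)}{\partial\theta} = \sum_{i=1}^n \left[ \int \frac{\alpha_i'(t; \theta)}{\alpha_i(t; \theta)}\,dN_i(t) - \int Y_i(t)\,\alpha_i'(t; \theta)\,dt \right],$$
where $\alpha_i'(t; \theta) = \partial\alpha_i(t; \theta)/\partial\theta$. Then, since
$$dN_i(t) = Y_i(t)\,\alpha_i(t; \theta)\,dt + dM_i(t),$$
where $M_i(t)$ is a martingale, we get
$$U(\theta) = \sum_{i=1}^n \int \frac{\alpha_i'(t; \theta)}{\alpha_i(t; \theta)}\,dM_i(t),$$
i.e. a sum of integrals with respect to martingales, and $E[U(\theta)] = 0$ (this expectation is no surprise, since $U(\theta)$ is a score function).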
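Var(U(θ)) and information $I(\theta) = -\partial U(\theta)/\partial\theta$
By standard martingale arguments we now get that
$$\mathrm{Var}(U(\theta)) = E\left[ \sum_{i=1}^n \int \left( \frac{\alpha_i'(t; \theta)}{\alpha_i(t; \theta)} \right)^2 Y_i(t)\,\alpha_i(t; \theta)\,dt \right].$$
But we also have that the observed information can be written
$$I(\theta) = -\sum_{i=1}^n \left[ \int \left\{ \frac{\alpha_i''(t; \theta)}{\alpha_i(t; \theta)} - \left( \frac{\alpha_i'(t; \theta)}{\alpha_i(t; \theta)} \right)^2 \right\} dN_i(t) - \int Y_i(t)\,\alpha_i''(t; \theta)\,dt \right].$$
Again inserting $dN_i(t) = Y_i(t)\,\alpha_i(t; \theta)\,dt + dM_i(t)$ we get
$$I(\theta) = \sum_{i=1}^n \int \left( \frac{\alpha_i'(t; \theta)}{\alpha_i(t; \theta)} \right)^2 Y_i(t)\,\alpha_i(t; \theta)\,dt + M,$$
where $M$ is a sum of integrals with respect to martingales and has expectation zero. Thus we obtain $\mathrm{Var}(U(\theta)) = E(I(\theta))$.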
Estimated expected information
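Since $dN_i(t) = Y_i(t)\,\alpha_i(t; \theta)\,dt + dM_i(t)$, where the $dM_i(t)$ are martingale increments, and since $\partial \log(\alpha_i(t; \theta))/\partial\theta = \alpha_i'(t; \theta)/\alpha_i(t; \theta)$, we may estimate the expected information by
$$\hat I(\hat\theta) = \widehat{\mathrm{Var}}(U(\hat\theta)) = \sum_{i=1}^n \int \left( \frac{\partial \log(\alpha_i(t; \hat\theta))}{\partial\theta} \right)^2 dN_i(t),$$
where the MLE $\hat\theta$ is inserted for $\theta$.
The matrix version of this formula, for a vector $\theta$ and with $v^{\otimes 2} = v v'$, is
$$\hat I(\hat\theta) = \widehat{\mathrm{Var}}(U(\hat\theta)) = \sum_{i=1}^n \int \left( \frac{\partial \log(\alpha_i(t; \hat\theta))}{\partial\theta} \right)^{\otimes 2} dN_i(t).$$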
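As a worked instance of the scalar formula, consider the exponential model without covariates, $\alpha_i(t; \nu) = \nu$. The following sketch shows that the estimated expected information reproduces the standard error from the exponential lifetimes example:

```latex
\begin{align*}
\frac{\partial \log(\alpha_i(t;\nu))}{\partial \nu} &= \frac{1}{\nu}, \\
\hat I(\hat\nu) &= \sum_{i=1}^n \int \frac{1}{\hat\nu^2}\,dN_i(t) = \frac{N_\bullet(\tau)}{\hat\nu^2}, \\
\mathrm{se}(\hat\nu) &= \frac{1}{\sqrt{\hat I(\hat\nu)}} = \frac{\hat\nu}{\sqrt{N_\bullet(\tau)}}.
\end{align*}
```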