DIY fractional polynomials Patrick Royston MRC Clinical Trials Unit, London 10 September 2010.

Page 1

DIY fractional polynomials

Patrick Royston, MRC Clinical Trials Unit, London

10 September 2010

Page 2

Overview

• Introduction to fractional polynomials
• Going off-piste: DIY fractional polynomials
• Examples

Page 4

Fractional polynomial models

• A fractional polynomial of degree 1 with power $p_1$ is defined as FP1 = $\beta_1 X^{p_1}$

• A fractional polynomial of degree 2 with powers $(p_1, p_2)$ is defined as FP2 = $\beta_1 X^{p_1} + \beta_2 X^{p_2}$

• Powers $(p_1, p_2)$ are taken from a predefined set S = {-2, -1, -0.5, 0, 0.5, 1, 2, 3}, where 0 means log X

• There are also 'repeated powers' FP2 models ($p_1 = p_2$), in which the second term is multiplied by ln X

Example: FP1 [power 0.5] = $\beta_1 X^{0.5}$

Example: FP2 [powers (0.5, 3)] = $\beta_1 X^{0.5} + \beta_2 X^{3}$

Example: FP2 [powers (3, 3)] = $\beta_1 X^{3} + \beta_2 X^{3} \ln X$
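The definitions above amount to a transformation of the covariate, after which the β's are estimated by ordinary regression. The following is a minimal sketch in Python (used here purely for illustration; the talk itself works in Stata). The function name `fp_terms` is hypothetical, and the sketch assumes X > 0, that power 0 means ln X, and that each repeat of a power multiplies the previous term by ln X.

```python
import math

def fp_terms(x, powers):
    """Return the FP basis terms of x (> 0) for the given powers.

    Power 0 means ln(x); each repetition of a power multiplies the
    previous term by ln(x), e.g. powers (3, 3) give x^3 and x^3 * ln(x).
    """
    if x <= 0:
        raise ValueError("fractional polynomials need x > 0")
    terms = []
    previous_power = None
    for p in sorted(powers):
        if p == previous_power:
            # Repeated power: previous term times ln(x)
            term = terms[-1] * math.log(x)
        else:
            term = math.log(x) if p == 0 else x ** p
        terms.append(term)
        previous_power = p
    return terms

print(fp_terms(4.0, [0.5]))      # FP1, power 0.5
print(fp_terms(4.0, [0.5, 3]))   # FP2, powers (0.5, 3)
print(fp_terms(4.0, [3, 3]))     # FP2, repeated powers (3, 3)
```

The returned terms are the columns that would be entered into the regression in place of X.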

Page 5

Some examples of fractional polynomial (FP2) curves

[Figure: FP2 curves for powers (-2, 1), (-2, 2), (-2, -2) and (-2, -1)]

Royston P, Altman DG (1994) Applied Statistics 43: 429-467.

Page 6

FP analysis for the prognostic effect of age in breast cancer

Page 7

FP function selection procedure

Simple functions are preferred. More complicated functions are accepted only if the fit is much better

Effect of age significant at 5% level?

Question                    Comparison                  χ2     df   P-value
Any effect?                 Best FP2 versus null        17.61   4   0.0015
Linear function suitable?   Best FP2 versus linear      17.03   3   0.0007
FP1 sufficient?             Best FP2 vs. best FP1       11.20   2   0.0037
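The three tests in the table are applied in sequence, stopping at the first simpler model that is not rejected. The sketch below, in Python for illustration only, assumes the P-values are upper-tail chi-square probabilities, an assumption that reproduces the three P-values shown. The function names `chi2_sf` and `select_function` are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))   # closed form for 1 df
    else:
        k, q = 2, math.exp(-half)              # closed form for 2 df
    while k < df:
        # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def select_function(tests, alpha=0.05):
    """tests: (simpler model, chi2, df) for best FP2 versus that model, in order."""
    for simpler, chi2, df in tests:
        if chi2_sf(chi2, df) >= alpha:
            return simpler   # simpler model not rejected: stop here
    return "FP2"

age_tests = [("null", 17.61, 4), ("linear", 17.03, 3), ("FP1", 11.20, 2)]
for name, chi2, df in age_tests:
    print(f"best FP2 vs {name}: P = {chi2_sf(chi2, df):.4f}")
print("selected:", select_function(age_tests))
```

With the values from the table, all three simpler models are rejected at the 5% level, so the procedure ends at the FP2 model.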

Page 8

Fractional polynomials in Stata

• fracpoly command
• Basic syntax:

. fracpoly [, fp_options]: regn_cmd [yvar] xvar1 [xvars] …

• xvar1 is a continuous predictor which may have a curved relationship with yvar

• xvars are other predictors, all modelled as linear
• Can use the fp_option compare to compare the fit of different FP models
  • uses the FP function selection procedure

Page 9

Example (auto data)

• fracpoly, compare: regress mpg displacement

Fractional polynomial model comparisons:
--------------------------------------------------------------------------
displacement      df    Deviance    Res. SD   Dev. dif.   P (*)   Powers
--------------------------------------------------------------------------
Not in model       0     468.789     5.7855      70.818   0.000
Linear             1     417.801    4.12779      19.830   0.000   1
m = 1              2     400.592    3.67467       2.621   0.284   -2
m = 2              4     397.971     3.6355          --      --   -2 3
--------------------------------------------------------------------------
(*) P-value from deviance difference comparing reported model with m = 2 model

• Show FP1 and FP2 models in Stata (+ fracplot)

Page 10

But what if fracpoly can’t fit my model … ?

• fracpoly supports only some of Stata’s rich set of regression-type commands

• Provided we know what the command we want to fit looks like with a transformed covariate, we can fit an FP model to the data

• We just create the necessary transformed covariate values, fit the model using them, and assess the fit

• A new, simple command fracpoly_powers helps by generating strings (local macros) with the required powers:

. fracpoly_powers [, degree(#) s(list_of_powers) ]
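To give a feel for what such a list of powers contains, here is a small Python sketch (illustration only; it is not the implementation of fracpoly_powers). It assumes the default power set S and that an FP model of degree m is identified by an unordered choice of m powers with repeats allowed. The name `fp_power_sets` is hypothetical.

```python
from itertools import combinations_with_replacement

DEFAULT_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_power_sets(degree=2, powers=DEFAULT_POWERS):
    """All power combinations for an FP model of the given degree.

    Order does not matter and repeats are allowed, so (3, 3) is included
    but (1, 2) and (2, 1) count only once.
    """
    return list(combinations_with_replacement(powers, degree))

fp1 = fp_power_sets(degree=1)
fp2 = fp_power_sets(degree=2)
print(len(fp1), len(fp2))   # 8 FP1 models, 36 FP2 models
```

Under these assumptions a DIY search over FP2 models means fitting 36 regressions, which is cheap for most commands.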

Page 11

Fitting an FP2 model in the auto example

// Store FP2 powers in local macros
fracpoly_powers, degree(2)
local np = r(np)
forvalues j = 1 / `np' {
    local p`j' `r(p`j')'
}
// Compute deviance for each model with covariate displacement
local x displacement
local y mpg
local devmin 1e30
quietly forvalues j = 1 / `np' {
    fracgen `x' `p`j'', replace
    regress `y' `r(names)'
    local dev = -2 * e(ll)
    if `dev' < `devmin' {
        local pbest `p`j''
        local devmin `dev'
    }
}
di "Best model has powers `pbest', deviance = " `devmin'
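The same search pattern (transform, fit, record the deviance, keep the minimum) can be sketched outside Stata. The Python example below is an illustration under stated assumptions, not a translation of the do-file: it searches FP1 rather than FP2 models to stay short, uses made-up data with a logarithmic trend, and computes the Gaussian deviance of a straight-line fit by hand. The names `fp1` and `gaussian_deviance` are hypothetical.

```python
import math

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp1(x, p):
    """FP1 transformation of x > 0; power 0 means ln(x)."""
    return math.log(x) if p == 0 else x ** p

def gaussian_deviance(xs, ys):
    """-2 * maximised Gaussian log likelihood of a straight-line fit of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    rss = syy - sxy ** 2 / sxx          # residual sum of squares
    return n * (1 + math.log(2 * math.pi * rss / n))

# Made-up data: logarithmic trend plus a small alternating error
x = [float(i) for i in range(1, 21)]
y = [3 + 2 * math.log(v) + 0.01 * (-1) ** i for i, v in enumerate(x)]

best_power, best_dev = None, float("inf")
for p in POWERS:
    dev = gaussian_deviance([fp1(v, p) for v in x], y)
    if dev < best_dev:
        best_power, best_dev = p, dev
print("best FP1 power:", best_power)
```

Because the data were generated from ln x, the search should settle on power 0. Swapping `gaussian_deviance` for any other model's -2 log likelihood is the whole idea behind the DIY approach.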

Page 12

A real example: modelling fetal growth

• Prospective longitudinal study of n = 50 pregnant women
• There are about 6 repeated measurements on each fetus at different gestational ages (gawks)
  • gawks = gestational age in weeks

• Wish to model how y = log fetal abdominal circumference changes with gestational age

• There is considerable curvature!

Page 13

The raw data

[Figure: log AC plotted against gestational age (weeks)]

Page 14

A mixed model for fetal growth

Multilevel (mixed) model to fit this relationship:

. xtmixed y FP(gawks) || id: FP(gawks), covariance(unstructured)

But how do we implement “FP(gawks)” here?

We want the best-fitting FP function of gawks, with random effects for the parameters (β’s) of the FP model
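Written out for an FP2 function with distinct powers, the model described above might look as follows. This is a sketch, not a formula taken from the talk: it assumes that xtmixed adds its default random intercept for each fetus, that $x_{ij}$ is gawks for measurement $j$ on fetus $i$, and the symbols $u_{0i}, u_{1i}, u_{2i}, \Sigma, \sigma^2$ are notation introduced here.

```latex
\[
y_{ij} = (\beta_0 + u_{0i}) + (\beta_1 + u_{1i})\, x_{ij}^{p_1}
       + (\beta_2 + u_{2i})\, x_{ij}^{p_2} + \varepsilon_{ij},
\qquad
(u_{0i}, u_{1i}, u_{2i})^{\top} \sim N(\mathbf{0}, \Sigma),
\quad
\varepsilon_{ij} \sim N(0, \sigma^2)
\]
```

Here $\Sigma$ is left unrestricted, matching covariance(unstructured). For a repeated power $p_1 = p_2$, the second term becomes $x_{ij}^{p_1} \ln x_{ij}$.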

Page 15

Fitting an FP2 mixed model to the fetal AC data

[First run fracpoly_powers to create local macros with powers]

// Compute deviance for each FP model with covariate gawks
gen x = gawks
gen y = ln(ac)
local devmin 1e30
forvalues j = 1 / `np' {
    qui fracgen x `p`j'', replace adjust(mean)
    qui xtmixed y `r(names)' || id: `r(names)', ///
        nostderr covariance(unstructured)
    local dev = -2 * e(ll)
    if `dev' < `devmin' {
        local p `p`j''
        local devmin `dev'
    }
    di "powers = `p`j''" _col(20) " deviance = " %9.3f `dev'
}
di _n "Best model has powers `p', deviance = " `devmin'

Page 16

Plots of some results

[Figure, three panels, each plotted against gestational age (weeks): fitted curves at the individual level (log AC); residuals at the individual level; residuals and fitted residuals]

Page 17

An “ignorant” example!

• I know almost nothing about “seemingly unrelated regression” (Stata’s sureg command)

• It fits a set of linear regression models which have correlated error terms

• The syntax therefore has a set of “equations”

. sureg (depvar1 varlist1) (depvar2 varlist2) ... (depvarN varlistN)

• There may be non-linearities lurking in these “equations”
• How can we fit FP models to varlist1, varlist2, … ?

Page 18

Example: modelling learning scores

Stata FAQ from UCLA (http://www.ats.ucla.edu/stat/stata/faq/sureg.htm):

What is seemingly unrelated regression and how can I perform it in Stata?

Example: High School and Beyond study

Page 19

Example: modelling learning scores

Contains data from hsb2.dta
  obs:           200                          highschool and beyond (200 cases)
 vars:            11                          5 Jul 2010 13:23
 size:         9,600 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g
female          float  %9.0g       fl
race            float  %12.0g      rl
ses             float  %9.0g       sl
schtyp          float  %9.0g       scl        type of school
prog            float  %9.0g       sel        type of program
read            float  %9.0g                  reading score
write           float  %9.0g                  writing score
math            float  %9.0g                  math score
science         float  %9.0g                  science score
socst           float  %9.0g                  social studies score
-------------------------------------------------------------------------------

• [It is unclear to me what “ses” (low, middle, high) is]

Page 20

Example (ctd.)

• As an example, suppose we wish to model 2 outcomes (read, math) as predicted by “socst female ses” and “science female ses” using sureg as follows:

. sureg (read socst female ses) (math science female ses)

• Are there non-linearities in read as a function of socst? In math as a function of science?

• For simplicity here, we will restrict ourselves to FP1 functions of socst and science
  • not necessary in principle

• We fit the 8 × 8 = 64 FP1 models and look for the best-fitting combination
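The count of 64 comes from pairing each of the 8 FP1 powers for socst with each of the 8 FP1 powers for science. A short Python sketch of that enumeration (illustration only, assuming the default power set S):

```python
from itertools import product

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

# One FP1 power for socst (equation 1) paired with one for science (equation 2)
power_pairs = list(product(POWERS, repeat=2))
print(len(power_pairs))   # 64

# The pair (1, 1) is the model that is linear in both equations
print((1, 1) in power_pairs)
```

Unlike the powers within a single FP2 function, the order matters here, because the two powers belong to different equations. That is why all 8 × 8 ordered pairs are fitted.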

Page 21

Stata

gen x1 = socst
gen x2 = science
gen y1 = read
gen y2 = math
local devmin 1e30
forvalues j = 1 / `np' {
    qui fracgen x1 `p`j'', replace adjust(mean)
    local x1vars `r(names)'
    forvalues k = 1 / `np' {
        qui fracgen x2 `p`k'', replace adjust(mean)
        local x2vars `r(names)'
        qui sureg (y1 `x1vars' female ses) (y2 `x2vars' female ses)
        local dev = -2 * e(ll)
        if `dev' < `devmin' {
            local px1 `p`j''
            local px2 `p`k''
            local devmin `dev'
        }
    }
}

[Run fpexample3.do in Stata]

Page 22

Comments

• The results suggest that there is indeed curvature in both relationships

• Can reject the null hypothesis of linearity at the 1% significance level
  • FP1 vs linear: χ2 = 10.08 (2 d.f.), P = 0.0065

• Shows the importance of considering non-linearity
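The quoted P-value can be checked by hand, since the upper tail of a chi-square distribution with 2 d.f. has a closed form. The 2 d.f. presumably reflect one estimated power in each of the two equations; that reading is an assumption here, not something stated on the slide.

```latex
\[
P = \Pr\left(\chi^2_2 > 10.08\right) = e^{-10.08/2} = e^{-5.04} \approx 0.0065
\]
```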

Page 23

read as a function of socst (adjusted for female, ses)

[Figure: partial predictor + residual of read plotted against social studies score. Fractional Polynomial (3), adjusted for covariates]

Page 24

math as a function of science (adjusted for female, ses)

[Figure: partial predictor + residual of math plotted against science score. Fractional Polynomial (2), adjusted for covariates]

Page 25

Conclusions

• Fractional polynomial models are a simple yet very useful extension of linear functions and ordinary polynomials

• If you are willing to do some straightforward do-file programming, you can apply them in a bespoke manner to a wide range of Stata regression-type commands and get useful results

• For (much) more, see Royston & Sauerbrei (2008) book