Evaluating Fit of Modeled Covariances - Byrnes...

E. K. Byrnes

  • 12/5/16

    Assessing Fit & Comparing SEMs with Likelihood

    Outline

    1. Assessing model fit: the χ²
       –  Related indices
    2. Evaluating Residuals for Normality
    3. Adjusting for non-normality
    4. Model comparison
    5. Testing mediation

    1. Assessing model fit: the χ²

    Evaluating Fit of Modeled Covariance Matrix

    The log likelihood ratio, F_ML, scaled by sample size, follows a χ² distribution:

    χ² = (n-1) F_ML

    •  Note scaling by sample size
    •  Large χ² implies LACK of fit

    The Classic Test using P Values

    As usual, p values decrease with higher n:

    χ² = 1.78 with 50 samples, p = 0.182
    χ² = 3.60 with 100 samples, p = 0.058
    χ² = 7.24 with 200 samples, p = 0.007

    p > 0.05 means no significant discrepancy between the observed and model-implied covariance matrices

    Kline (2012) recommends 4 measures of model fit:

    (1) Model Chi-Square with its df and p-value.
        - prefer p-value greater than 0.05
    (2) Root Mean Square Error of Approximation (RMSEA).
        - prefer lower 90% CI to be < 0.05
    (3) Comparative Fit Index (CFI).
        - prefer value greater than 0.90
    (4) Standardized Root Mean Square Residual (SRMR).
        - prefer value less than 0.10

    RMSEA for Our Example

    Samples  RMSEA  LO90  HI90  PCLOSE
    50       .126   .000  .426  .208
    100      .162   .000  .356  .089
    200      .177   .074  .307  .024

    We are still affected by sample size/power (which is reasonable).

    As our sample size increases, we can expect our data to support more and more complex models.

    Measures of Goodness of Fit that don't involve p-values

    CFI: uses the noncentrality of the model χ²
    50 samples = 0.96
    100 samples = 0.94
    200 samples = 0.94
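The p-values and RMSEA point estimates quoted on these slides can be checked with a few lines of base R. This is a minimal sketch, assuming df = 1 for every sample size and the common RMSEA point-estimate formula with n - 1 in the denominator (some programs use n instead); the variable names are illustrative only.

```r
# Chi-square values and sample sizes as reported on the slides (df = 1).
chisq <- c(1.78, 3.60, 7.24)
n     <- c(50, 100, 200)
df    <- 1

# Upper-tail p-value of the model chi-square test.
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)

# Implied fit-function value F_ML = chisq / (n - 1). It is roughly constant,
# so the chi-square grows here only because n grows.
f_ml <- chisq / (n - 1)

# RMSEA point estimate (assumed formula, n - 1 in the denominator).
rmsea <- sqrt(pmax(chisq - df, 0) / (df * (n - 1)))

print(data.frame(n, chisq,
                 p     = round(p_value, 3),
                 F_ML  = round(f_ml, 4),
                 RMSEA = round(rmsea, 3)))
```

The same misfit per observation therefore moves from "not rejected" at n = 50 to "rejected" at n = 200, which is the sample-size dependence the slides point out.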

    Evaluating Fit of A Model

    [Path diagram: x → y1 (0.40), y1 → y2 (0.50)]

    std. covariance matrix

          x     y1    y2
    x     1.0
    y1    0.4   1.0
    y2    0.35  0.5   1.0

    r_xy2 expected to be 0.2 = (0.40 x 0.50)

    standardized residual = 0.35 - 0.2 = 0.15

    issue: should there be a path from x to y2?

    Diagnosing Causes of Lack of Fit with Residuals (misspecification)

    Sample Covariance Matrix
          y1    y2    x
    ------------------------
    y1    1.00
    y2    0.50  1.00
    x     0.40  0.35  1.00

    Implied Covariance Matrix
          y1    y2    x
    ------------------------
    y1    1.00
    y2    0.50  1.00
    x     0.40  0.20  1.00

    residual = 0.15
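The residual above is simply the sample matrix minus the model-implied matrix. A minimal base-R sketch, using only the numbers on the slide and assuming the chain x → y1 → y2 with the two standardized coefficients shown:

```r
# Standardized sample covariance (correlation) matrix from the slide.
vars <- c("y1", "y2", "x")
S <- matrix(c(1.00, 0.50, 0.40,
              0.50, 1.00, 0.35,
              0.40, 0.35, 1.00),
            nrow = 3, dimnames = list(vars, vars))

# Standardized path coefficients of the chain x -> y1 -> y2.
b_y1_x  <- 0.40
b_y2_y1 <- 0.50

# Model-implied matrix: only the x-y2 entry differs from the sample,
# because the chain implies r(x, y2) = 0.40 * 0.50.
Sigma <- S
Sigma["x", "y2"] <- Sigma["y2", "x"] <- b_y1_x * b_y2_y1

# Residual matrix: non-zero only where the model constrains a moment.
residual <- S - Sigma
print(residual)
```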

    Modification Indices

    But how much will including a path increase model fit?

    •  Lagrange Multipliers: The amount that χ² would decrease due to including a path.
    •  Wald W statistic: How much χ² would increase if a path is trimmed.
       –  Dropping a path can increase parameter variability
    •  Be very careful here for data dredging.

    Fully Mediated Fire

    fullMedModel

    Fit of the Fully Mediated Model

    > summary(fullMedSEM)
    lavaan (0.5-17) converged normally after 19 iterations

      Number of observations                            90

      Estimator                                         ML
      Minimum Function Chi-square                    3.297
      Degrees of freedom                                 1
      P-value                                        0.069

    [Path diagram: age → firesev → cover]

    Fit-A-Palooza

    > summary(fullMedSEM, fit.measures=T)
    lavaan (0.4-12) converged normally after 21 iterations

      Number of observations                            90

      Estimator                                         ML
      Minimum Function Chi-square                    3.297
      Degrees of freedom                                 1
      P-value                                        0.069

    Chi-square test baseline model:

      Minimum Function Chi-square                   43.143
      Degrees of freedom                                 3
      P-value                                        0.000

    Full model versus baseline model:

      Comparative Fit Index (CFI)                    0.943
      Tucker-Lewis Index (TLI)                       0.828

    Loglikelihood and Information Criteria:

      Loglikelihood user model (H0)               -531.341
      Loglikelihood unrestricted model (H1)       -529.693

      Number of free parameters                          4
      Akaike (AIC)                                1070.683
      Bayesian (BIC)                              1080.682
      Sample-size adjusted Bayesian (BIC)         1068.057

    Root Mean Square Error of Approximation:

      RMSEA                                          0.160
      90 Percent Confidence Interval          0.000  0.365
      P-value RMSEA

    > fitMeasures(fullMedSEM)
                   npar                fmin               chisq                  df
                  4.000               0.018               3.297               1.000
                 pvalue      baseline.chisq         baseline.df     baseline.pvalue
                  0.069              43.143               3.000               0.000
                    cfi                 tli                nnfi                 rfi
                  0.943               0.828               0.828               0.771
                    nfi                pnfi                 ifi                 rni
                  0.924               0.308               0.945               0.943
                   logl   unrestricted.logl                 aic                 bic
               -531.341            -529.693            1070.683            1080.682
                 ntotal                bic2               rmsea      rmsea.ci.lower
                 90.000            1068.057               0.160               0.000
         rmsea.ci.upper        rmsea.pvalue                 rmr          rmr_nomean
                  0.365               0.101               0.245               0.245
                   srmr        srmr_bentler srmr_bentler_nomean         srmr_bollen
                  0.062               0.062               0.062               0.062
     srmr_bollen_nomean          srmr_mplus   srmr_mplus_nomean               cn_05
                  0.062               0.062               0.062             105.849
                  cn_01                 gfi                agfi                pgfi
                182.093               0.977               0.859               0.163
                    mfi                ecvi
                  0.987               0.126
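Several of the indices in this output can be recomputed by hand from the two chi-square statistics and the log-likelihood, which helps show what each one measures. The sketch below uses the textbook formulas for CFI, TLI, RMSEA, AIC and BIC; treating N (rather than N - 1) as the RMSEA denominator is an assumption chosen because it matches the printed 0.160.

```r
# Values copied from the lavaan output above.
chisq   <- 3.297;  df   <- 1    # user (fully mediated) model
chisq_b <- 43.143; df_b <- 3    # baseline model
logl    <- -531.341             # user-model log-likelihood
npar    <- 4                    # number of free parameters
n       <- 90                   # number of observations

# CFI compares the noncentrality (chisq - df) of the two models.
cfi <- 1 - max(chisq - df, 0) / max(chisq_b - df_b, chisq - df, 0)

# TLI compares the chisq/df ratios of the two models.
tli <- (chisq_b / df_b - chisq / df) / (chisq_b / df_b - 1)

# RMSEA point estimate (assumed denominator: df * N).
rmsea <- sqrt(max(chisq - df, 0) / (df * n))

# Information criteria from the log-likelihood.
aic <- -2 * logl + 2 * npar
bic <- -2 * logl + npar * log(n)

print(round(c(cfi = cfi, tli = tli, rmsea = rmsea, aic = aic, bic = bic), 3))
```

Small differences in the last digit against the printed output come from working with rounded inputs.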

    > inspect(fullMedSEM, "sample")
    $cov
            firesv   cover     age
    firesev   2.700
    cover    -0.227   0.100
    age       9.319  -1.381 156.157

    $mean
    firesev   cover     age
      4.565   0.691  25.567

    > fitted(fullMedSEM)
    $cov
            firesv   cover     age
    firesev   2.700
    cover    -0.227   0.100
    age       9.319  -0.782 156.157

    $mean
    firesev   cover     age
          0       0       0

    Residual Covariance

    > residuals(fullMedSEM)
    $cov
            firesv   cover     age
    firesev   0.000
    cover     0.000   0.000
    age       0.000  -0.599   0.000

    $mean
    firesev   cover     age
          0       0       0

    Residual Correlation

    > residuals(fullMedSEM, type="cor")
            firesv   cover     age
    firesev   0.000
    cover     0.000   0.000
    age       0.000  -0.152   0.000

    [Path diagram: age → firesev → cover]
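The single non-zero residual can be reproduced from the sample covariance matrix alone. This is a rough sketch, assuming the fully mediated chain age → firesev → cover, in which the only constrained moment is cov(age, cover); because it starts from rounded printed values, the results agree with lavaan only to about two decimals.

```r
vars <- c("firesev", "cover", "age")

# Sample covariance matrix as printed by inspect(fullMedSEM, "sample").
S <- matrix(c( 2.700, -0.227,   9.319,
              -0.227,  0.100,  -1.381,
               9.319, -1.381, 156.157),
            nrow = 3, dimnames = list(vars, vars))

# With no direct age -> cover path, the model implies
# cov(age, cover) = cov(age, firesev) * cov(firesev, cover) / var(firesev).
implied_age_cover <- S["age", "firesev"] * S["firesev", "cover"] /
  S["firesev", "firesev"]

# Residual covariance, and the same residual on the correlation scale.
resid_cov <- S["age", "cover"] - implied_age_cover
resid_cor <- resid_cov / sqrt(S["age", "age"] * S["cover", "cover"])

print(round(c(implied = implied_age_cover,
              resid_cov = resid_cov,
              resid_cor = resid_cor), 3))
```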

    Modification Indices

    > modificationIndices(fullMedSEM, standardized=F)
           lhs op     rhs    mi    epc
    1  firesev ~~ firesev 0.000  0.000
    2  firesev ~~   cover 3.238  0.174
    3  firesev ~~     age    NA     NA
    4    cover ~~   cover 0.000  0.000
    5    cover ~~     age 3.238 -0.755
    6      age ~~     age 0.000  0.000
    7  firesev  ~   cover 3.238  2.157
    8  firesev  ~     age 0.000  0.000
    9    cover  ~ firesev 0.000  0.000
    10   cover  ~     age 3.238 -0.005
    11     age  ~ firesev 0.000  0.000
    12     age  ~   cover 2.884 -8.351

    Exercise: Diagnosing Misspecification

    [Path diagram with nodes: distance, richness, hetero, abiotic]

    •  Fit and assess model
    •  Look at measures of misspecification

    Solution: Diagnosing Misspecification

    #Full Mediation
    distModel2

    Solution: Large Residual rich -> distance correlation

    #modification indices, with a trick to only see big ones
    > modI
    > modI[which(modI$mi>3),]
           lhs op      rhs     mi     epc
    1     rich ~~   hetero 15.181  -1.690
    2     rich ~~  abiotic 15.181 -76.202
    3     rich  ~ distance 15.181   0.662
    4  abiotic  ~     rich  3.811  -0.196
    5 distance  ~     rich 10.672   0.251
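One plausible reason for wrapping the filter in `which()` is that modification-index tables can contain `NA` entries, and plain logical indexing turns each `NA` into an empty row. The sketch below illustrates this on the fire-model table printed earlier in the deck, retyped by hand as a plain data frame (the name `mi_tab` is hypothetical); it is not the distance-model table.

```r
# Modification indices for the fire model, copied from the earlier slide.
mi_tab <- data.frame(
  lhs = c("firesev", "firesev", "firesev", "cover", "cover", "age",
          "firesev", "firesev", "cover", "cover", "age", "age"),
  op  = c("~~", "~~", "~~", "~~", "~~", "~~", "~", "~", "~", "~", "~", "~"),
  rhs = c("firesev", "cover", "age", "cover", "age", "age",
          "cover", "age", "firesev", "age", "firesev", "cover"),
  mi  = c(0, 3.238, NA, 0, 3.238, 0, 3.238, 0, 0, 3.238, 0, 2.884),
  epc = c(0, 0.174, NA, 0, -0.755, 0, 2.157, 0, 0, -0.005, 0, -8.351),
  stringsAsFactors = FALSE
)

# Plain logical indexing keeps an all-NA row for the NA comparison.
naive <- mi_tab[mi_tab$mi > 3, ]

# which() returns only the positions that are TRUE, so the NA row is dropped.
big <- mi_tab[which(mi_tab$mi > 3), ]

print(big)
```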

    Additional Points about Overall Model Fit

    1.  SEM focuses on assessing overall model fit
        •  Is your model adequate?
        •  Are you missing any paths?

    2.  When you are missing important paths, your parameter estimates may be incorrect
        •  your model is misspecified

    2. Evaluating Residuals for Normality

    Partial Mediation Model

    partialMedModel

    What is the distribution of our residuals?

    > source("./fitted_lavaan.R")
    > partialResid
    > head(partialResid)
         firesev      cover
    1 -1.9263673  0.4752431
    2 -0.4811819 -0.2186521
    3 -1.3343917  0.1642312
    4 -1.0343917  0.4101956
    5 -0.1118239  0.5842525
    6 -0.4715029  0.4683961

    QQ Plots Help

    par(mfrow=c(1,2))
    apply(partialResid, 2, function(x){
      qqnorm(x)
      qqline(x)
    })
    par(mfrow=c(1,1))

    Multivariate Shapiro-Wilk Test

    > library(mvnormtest)
    > mshapiro.test(t(partialResid))

            Shapiro-Wilk normality test

    data:  Z
    W = 0.96889, p-value = 0.02954

    Often too sensitive of a test

    Formal Tests from MVN

    library(MVN)
    mt
    mvnPlot(mt, type="persp", default=T)

    3. Adjusting for non-normality

    Our Model for Correction

    #Full Mediation
    distModel2

    Multivariate Shapiro-Wilk Test

    > library(mvnormtest)
    > mshapiro.test(t(res))

            Shapiro-Wilk normality test

    data:  Z
    W = 0.98579, p-value = 0.4367

    These residuals are fine

    •  Can be overly sensitive
    •  Skew most important

    Mardia's Multivariate Skew

    > library(semTools)
    > mardiaSkew(res)
           b1d        chi         df          p
     0.5693772  8.5406580 10.0000000  0.5761788

    This is fine
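The four numbers in the `mardiaSkew()` output are linked by the standard Mardia formulas: the test statistic is n·b1d/6, referred to a χ² distribution with p(p+1)(p+2)/6 degrees of freedom for p variables. The base-R function below is a minimal sketch of that textbook definition; `mardia_skew` is a hypothetical name, and it is only assumed, not verified, to match the semTools implementation.

```r
# Mardia's multivariate skewness from first principles (base R only).
mardia_skew <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  centered <- scale(x, center = TRUE, scale = FALSE)
  S <- crossprod(centered) / n                       # ML covariance (divide by n)
  G <- centered %*% solve(S) %*% t(centered)         # pairwise Mahalanobis products
  b1d <- sum(G^3) / n^2
  chi <- n * b1d / 6
  df  <- p * (p + 1) * (p + 2) / 6
  c(b1d = b1d, chi = chi, df = df,
    p = pchisq(chi, df, lower.tail = FALSE))
}

# Check the arithmetic against the slide: n = 90 observations, 3 variables.
n_obs     <- 90
b1d_slide <- 0.5693772
chi_slide <- n_obs * b1d_slide / 6
df_slide  <- 3 * 4 * 5 / 6
p_slide   <- pchisq(chi_slide, df_slide, lower.tail = FALSE)
print(c(chi = chi_slide, df = df_slide, p = p_slide))

# A perfectly symmetric toy data set has zero multivariate skew.
toy <- cbind(c(-1, 1, -1, 1), c(-1, -1, 1, 1))
print(mardia_skew(toy))
```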

    Correcting for Violation of Normality: The Satorra-Bentler Chi Square

    [Diagram: multivariate skew and sensitivity to parameter change (weights derived from the covariance matrix) give a correction coefficient for χ² and standard errors]

    distFitSB
    > summary(distFit2SB)
    lavaan (0.5-17) converged normally after 44 iterations

      Number of observations                            90

      Estimator                                         ML      Robust
      Minimum Function Test Statistic               17.831      17.854
      Degrees of freedom                                 2           2
      P-value (Chi-square)                           0.000       0.000
      Scaling correction factor                                  0.999

    [Path diagram with nodes: distance, richness, hetero, abiotic]
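The "Robust" column is, in essence, the ML statistic divided by the scaling correction factor. A small sketch with the printed values follows; since the factor is shown rounded to three decimals, the result matches the printed 17.854 only approximately.

```r
# Values from the Satorra-Bentler output above.
t_ml <- 17.831   # ML test statistic
c_sb <- 0.999    # scaling correction factor (rounded in the printout)
df   <- 2

# Scaled test statistic and its p-value.
t_sb <- t_ml / c_sb
p_sb <- pchisq(t_sb, df, lower.tail = FALSE)

print(c(scaled_chisq = round(t_sb, 3), p = signif(p_sb, 2)))
```

A correction factor this close to 1 indicates that the non-normality adjustment changes almost nothing for this model.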

    Violation of Multivariate Normality: The Bollen-Stine Bootstrap

    [Figure: value of χ² under the χ² distribution, the naïve bootstrap, and the transformed bootstrap]

    To get an accurate bootstrap, you can calculate a bootstrapped χ² on transformed data
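The transformation in question rotates and rescales the centered data so that its sample covariance equals the model-implied covariance exactly, which makes the null hypothesis true in the population being resampled. The sketch below shows only that transformation step, Z = Y S^(-1/2) Σ^(1/2), in base R. It uses simulated stand-in data (not the data from the deck) and, as a convenient target, the model-implied matrix printed earlier for the fire model; `mat_power` is a hypothetical helper. The bootstrap itself would then resample rows of Z and refit the model each time.

```r
set.seed(1)  # simulated stand-in data, not the data used in the deck

vars <- c("firesev", "cover", "age")

# Model-implied covariance matrix as printed by fitted(fullMedSEM).
Sigma <- matrix(c( 2.700, -0.227,   9.319,
                  -0.227,  0.100,  -0.782,
                   9.319, -0.782, 156.157),
                nrow = 3, dimnames = list(vars, vars))

# Stand-in raw data: 90 rows of independent normal noise.
Y <- matrix(rnorm(90 * 3), nrow = 90, dimnames = list(NULL, vars))

# Power of a symmetric positive-definite matrix via its eigendecomposition.
mat_power <- function(m, power) {
  e <- eigen(m, symmetric = TRUE)
  e$vectors %*% diag(e$values^power) %*% t(e$vectors)
}

# Bollen-Stine transformation of the centered data.
Yc <- scale(Y, center = TRUE, scale = FALSE)
S  <- cov(Y)
Z  <- Yc %*% mat_power(S, -0.5) %*% mat_power(Sigma, 0.5)

# The transformed data reproduce the model-implied covariance matrix.
print(round(cov(Z), 3))
```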

    Bollen-Stine Bootstrap Output

    > distFitBoot
    > summary(distFitBoot)
    lavaan (0.5-17) converged normally after 37 iterations

      Number of observations                            90

      Estimator                                         ML
      Minimum Function Test Statistic                1.810
      Degrees of freedom                                 1
      P-value (Chi-square)                           0.178
      P-value (Bollen-Stine Bootstrap)               0.140

    [Path diagram with nodes: distance, richness, hetero, abiotic]

    Questions?

    4. Model comparison

    The Likelihood Ratio Test Revisited for Mediation

    [Path diagrams: Model 1 and Model 2, each with nodes x, y1, y2]

    for n = 50 samples,

             χ²     DF   p
    Model 1  1.78   1
    Model 2  0.00   0
    diff     1.78   1    0.18

    Suggests Model 1 fits as well as Model 2 with fewer paths: parsimony wins!

    •  Previously, we used a LRT to compare a saturated model to a non-saturated model.
    •  We can use LRTs to compare any set of nested models that differ in DF

    AIC Comparisons: Because You Will Only Ever Know Your Sampled Population

    Models Provide Varying Degrees of Information about Reality

    f(x) = "True" value at point x

    G_i(x|θ) = estimate of model i at point x given parameters θ

    Discrepancy between fit model and f(x) conveys information loss

    Kullback-Leibler Information

    $$ I(f, g) = \int f(x) \log\left( \frac{f(x)}{g(x \mid \theta)} \right) dx $$

    I(f,g) = information loss when g is used to approximate f, integrated over all values of x

    Note: f(x) can be pulled out as a constant when comparing multiple models! No need to know the true value of f(x)
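To see why the unknown truth f drops out, it may help to split the logarithm. The short derivation below is a sketch of the usual argument, with g_1 and g_2 standing for any two candidate models:

```latex
\begin{aligned}
I(f, g) &= \int f(x) \log f(x)\, dx \;-\; \int f(x) \log g(x \mid \theta)\, dx \\
        &= C \;-\; \mathrm{E}_f\!\left[ \log g(x \mid \theta) \right],
\qquad C = \int f(x) \log f(x)\, dx , \\[4pt]
I(f, g_1) - I(f, g_2) &= \mathrm{E}_f\!\left[ \log g_2(x \mid \theta_2) \right]
                       - \mathrm{E}_f\!\left[ \log g_1(x \mid \theta_1) \right].
\end{aligned}
```

The constant C depends only on f, so it cancels in any comparison between models; only the expected log-likelihood terms remain, and those are what the data can estimate.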

    Likelihood and Information

    For likelihood, information loss is conveyed by the following with K = # of parameters:

    $$ \log L(\hat{\theta} \mid \text{data}) - K = \text{constant} - I(f, \hat{g}) $$

    This gives rise to Akaike's Information Criterion: lower AIC means less information is lost by a model

    AIC = -2 log(L(θ̂ | data)) + 2K

    Principle of Parsimony: How many parameters does it take to fit an elephant?

    Correcting for Sample Size: the AICc

    $$ AICc = AIC + \frac{2t(t+1)}{n - t - 1} $$

    where t = number of estimated parameters in the model and n = the number of samples

    Note, this is not the "consistent AIC" reported as CAIC by many pieces of software

    Model Weights to Compare Models

    •  In a set of models, the difference between model i and the model with the best fit is Δ_i = AIC_i - AIC_min

    •  We can then define the relative support for a model as a model weight

    $$ w_i = \frac{\exp\left(-\frac{1}{2}\Delta_i\right)}{\sum_{r=1}^{R} \exp\left(-\frac{1}{2}\Delta_r\right)} $$

    •  N.B. model weights summed together = 1

    AIC and SEM

    •  AIC: most predictive model
       AIC = χ² + 2K

    •  Small Sample-Size Adjusted AIC
       AICc = χ² + 2K + 2K(K+1)/(N-K-1)

    •  Bayesian Information Criterion: most 'correct' model
       BIC = χ² - DF*log(N)

    AIC difference criteria

    AIC diff    support for equivalency of models
    0-2         substantial
    4-7         weak
    > 10        none

    Burnham, K.P. and Anderson, D.R. 2002. Model Selection and Multimodel Inference. Springer-Verlag (second edition), p. 70.

    Note: Models are not required to be nested, unlike when using LRTs

    LR Testing v. AIC

    1. SEM provides a framework that aids the application of scientific judgment to selecting an appropriate model of the world

    2. Growing interest in an information-based approach that focuses on model selection and effect sizes.

    3. Many viewpoints on utility of Neyman-Pearson hypothesis testing

    4. The two can be used complementarily, however!

    5. Testing mediation

    [Path diagrams with nodes age, firesev, cover: Partially Mediated, the saturated (full) model, and Fully Mediated, the unsaturated model]

    Fully Mediated Model

    fullMedModel

    Partially Mediated Model

    partialMedModel

                  Df    AIC    BIC  Chisq Chisq diff Df diff Pr(>Chisq)
    partialMedSEM  0 1069.4 1081.9 0.0000
    fullMedSEM     1 1070.7 1080.7 3.2974     3.2974       1    0.06939 .
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    [Path diagrams with nodes age, firesev, cover: partially mediated and fully mediated models]

    The partially mediated (saturated) model reproduces the same covariance matrix
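The χ² difference in this table is a likelihood ratio test, and it can be recovered from the two log-likelihoods reported elsewhere in the deck. A minimal sketch, assuming the parameter counts of 5 (partial) and 4 (full) given by the model-selection table:

```r
# Log-likelihoods reported by lavaan for the two nested models.
ll_partial <- -529.693   # partially mediated (saturated) model
ll_full    <- -531.341   # fully mediated model
k_partial  <- 5          # free parameters, partial mediation
k_full     <- 4          # free parameters, full mediation

# Likelihood ratio statistic and its degrees of freedom.
lr      <- 2 * (ll_partial - ll_full)
df_diff <- k_partial - k_full

# p-value from the chi-square distribution.
p_lrt <- pchisq(lr, df = df_diff, lower.tail = FALSE)

print(c(LR = lr, df = df_diff, p = round(p_lrt, 4)))
```

The statistic differs from the printed 3.2974 only in the third decimal because the log-likelihoods are rounded.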

    Comparing Models with AICc

    > source("./lavaan.modavg.R")
    > aictab.lavaan(list(fullMedSEM, partialMedSEM),
                    c("Full", "Partial"))

    Model selection based on AICc :

            K    AICc Delta_AICc AICcWt Cum.Wt      LL
    Partial 5 1069.66       0.00   0.64   0.64 -529.69
    Full    4 1070.82       1.16   0.36   1.00 -531.34

    [Path diagrams with nodes age, firesev, cover: partially mediated and fully mediated models]

    ΔAICc = 1.16, small
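The `Delta_AICc` and `AICcWt` columns follow directly from the model-weight formula given earlier in the deck. A short base-R sketch, taking the two AICc values from the table as given:

```r
# AICc values as printed in the model-selection table.
aicc <- c(Partial = 1069.66, Full = 1070.82)

# Difference from the best (lowest AICc) model.
delta <- aicc - min(aicc)

# Relative likelihood of each model, normalized to sum to 1.
rel_lik <- exp(-0.5 * delta)
weights <- rel_lik / sum(rel_lik)

print(round(cbind(AICc = aicc, Delta_AICc = delta, AICcWt = weights), 2))
```

With a difference this small, neither model is clearly preferred; the weights say only that the partially mediated model has somewhat more support.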

    Exercises

    Perform a test of mediation for the following model

    Bonus: Calculate summed direct and indirect effects

    [Path diagram with nodes: distance, rich, hetero, abiotic]

    Solution: The Models

    #Partial Mediation
    distModel

    We Should Not have Used the Fully Mediated Model for AIC Analyses

    lavaan (0.5-17) converged normally after 36 iterations

      Number of observations                            90

      Estimator                                         ML
      Minimum Function Test Statistic               17.831
      Degrees of freedom                                 2
      P-value (Chi-square)                           0.000

    [Path diagram with nodes: distance, richness, hetero, abiotic]

    Questions?