Besag, York and Mollié model: O_i is the number of cases in area i, ρ is the prevalence of the...

A Comparison of INLA and MCMC for the Estimation of Smoothed Risk Maps in Epidemiology

Rebeca Ramis (1,3), Virgilio Gómez-Rubio (2), Peter J. Diggle (1) & Gonzalo López-Abente (3)
(1) Division of Medicine, Lancaster University, UK. {r.ramis,p.diggle}@lancaster.ac.uk
(2) Departamento de Matemáticas, Universidad de Castilla-La Mancha, Spain.
(3) Centro Nacional de Epidemiología, Instituto de Salud Carlos III, Madrid, Spain.

Background

Besag, York and Mollié model:

$$O_i \sim \mathrm{Po}(\mu_i)$$
$$\log(\mu_i) = \rho + h_i + b_i$$
$$h_i \sim N(0, v_h)$$
$$b_i \sim \mathrm{CAR}(v_b)$$

where $O_i$ is the number of cases in area $i$, $\rho$ is the prevalence of the disease, $h_i$ is a heterogeneity random effect and $b_i$ is a spatially correlated random effect.

[Figure: two maps for oesophageal cancer. The first map's label is truncated in the source ("... es por esofago"); the second is labelled "RME. Esófago" (standardised mortality ratio, oesophagus). Number of areas in each legend class:]

Class           First map    RME. Esófago
1.5 to 200          161          1378
1.3 to 1.5          262           249
1.1 to 1.3          892           304
1 to 1.1            404            72
0.95 to 1.05       1135           157
0.91 to 0.95        470            60
0.77 to 0.91       1853           228
0.67 to 0.77       1579           171
0 to 0.67          1436          5573

As in many other Bayesian models, MCMC is often used to estimate the posterior distribution of the parameters of interest. Its main disadvantage is that it is computationally intensive, especially when the number of areas is high (we work with the 8068 Spanish towns). In contrast, INLA (Rue, Martino and Chopin, 2009, JRSS-B, 71:319–392) has recently offered an alternative with a negligible computational burden.

Our aim is to compare the performance and accuracy of both techniques using a factorial experiment over the real geographical distribution of the 8068 small areas into which Spain is divided.

Factorial experiment, where:
• i = 1, 2, 3: levels of the spatial autocorrelation term variance ($v_b$)
• j = 1, 2, 3: levels of the heterogeneity term variance ($v_h$)
• k = 1, 2, 3: levels of prevalence ($\rho$)

We repeat the experiment with INLA and MCMC (WinBUGS) to assess and compare their performance.
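The simulation step of the factorial experiment can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the level values for $v_b$, $v_h$ and $\rho$ are hypothetical (the poster does not report them), $\rho$ is assumed to enter on the log scale, a 6 x 6 lattice stands in for the map of 8068 Spanish towns, and the spatial effect is drawn from a proper CAR prior (with a hypothetical parameter `alpha` below 1) by Gibbs sweeps, which may differ from the CAR specification used on the poster.

```python
import itertools
import math
import random

# Hypothetical level values; the poster does not report the actual ones.
V_B = {1: 0.05, 2: 0.2, 3: 0.8}   # spatial variance v_b, indexed by i
V_H = {1: 0.05, 2: 0.2, 3: 0.8}   # heterogeneity variance v_h, indexed by j
RHO = {1: -1.0, 2: 0.0, 3: 1.0}   # rho (assumed on the log scale), indexed by k


def grid_neighbours(nrow, ncol):
    """Rook-adjacency neighbour lists for an nrow x ncol lattice (toy map)."""
    nb = []
    for r in range(nrow):
        for c in range(ncol):
            cells = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nb.append([rr * ncol + cc for rr, cc in cells
                       if 0 <= rr < nrow and 0 <= cc < ncol])
    return nb


def sample_car(nb, v_b, rng, alpha=0.95, sweeps=200):
    """Approximate draw from a proper CAR prior using Gibbs sweeps over
    b_i | b_-i ~ N(alpha * mean(neighbouring b), v_b / n_i)."""
    b = [0.0] * len(nb)
    for _ in range(sweeps):
        for i, neigh in enumerate(nb):
            mean = alpha * sum(b[j] for j in neigh) / len(neigh)
            b[i] = rng.gauss(mean, math.sqrt(v_b / len(neigh)))
    return b


def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for the moderate means here."""
    limit, k, p = math.exp(-mu), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_scenario(i, j, k, nb, rng):
    """One dataset: O_i ~ Po(mu_i) with log(mu_i) = rho + h_i + b_i."""
    b = sample_car(nb, V_B[i], rng)
    h = [rng.gauss(0.0, math.sqrt(V_H[j])) for _ in nb]
    mu = [math.exp(RHO[k] + h_a + b_a) for h_a, b_a in zip(h, b)]
    return mu, [sample_poisson(m, rng) for m in mu]


rng = random.Random(2024)
nb = grid_neighbours(6, 6)
scenarios = list(itertools.product((1, 2, 3), repeat=3))  # 27 (i, j, k) triples
datasets = {s: simulate_scenario(*s, nb, rng) for s in scenarios}
```

Each entry of `datasets` holds the true means and the simulated counts for one scenario. Repeating the call 25 times per scenario would mirror the replication described in the poster.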
In epidemiology, spatial disease mapping models are commonly used to estimate smoothed risk maps, and the Besag, York and Mollié model has been widely used in this context.

We carry out a factorial experiment with 3 factors: $\rho$ (prevalence), $v_h$ (variance of the heterogeneity term) and $v_b$ (variance of the spatial autocorrelation term). We define 3 levels for each factor: low (1), medium (2) and high (3). These combinations of factors try to reproduce various real scenarios of chronic disease outcomes in Spain. We simulate 25 datasets for each of the 27 scenarios and compute the 90% CI empirical coverage for each dataset. We then take the mean over the 25 replications in each scenario:

$$y_{ijk} = \operatorname{mean}(\text{90\% CI empirical coverage for scenario } ijk)$$

Results

90% CI Empirical Coverage

$y_{ijk}$, INLA
        k     i=1     i=2     i=3
j=1     1    0.899   0.913   0.912
        2    0.918   0.920   0.914
        3    0.929   0.928   0.924
j=2     1    0.919   0.937   0.911
        2    0.929   0.940   0.915
        3    0.932   0.940   0.920
j=3     1    0.933   0.966   0.923
        2    0.943   0.965   0.941
        3    0.943   0.962   0.945

$y_{ijk}$, MCMC
        k     i=1     i=2     i=3
j=1     1    0.895   0.905   0.904
        2    0.901   0.899   0.904
        3    0.901   0.899   0.902
j=2     1    0.895   0.908   0.894
        2    0.895   0.906   0.897
        3    0.897   0.901   0.899
j=3     1    0.945   0.925   0.873
        2    0.939   0.910   0.905
        3    0.932   0.911   0.909

Scenarios are labelled i.j.k.
• For INLA, the 90% CI empirical coverage exceeds 90% for all combinations except scenario 1.1.1.
• For MCMC, coverage is close to 90% for scenarios with j=1 and j=2; for j=3 it exceeds 90%, except in scenario 3.3.1.
• INLA results vary with the factor levels: increases in the heterogeneity term variance (j) and in prevalence (k) increase the 90% CI empirical coverage, whereas increases in the spatial autocorrelation term variance ($v_b$, i) do not have the same effect.
• MCMC results are not affected by changes in the factor levels.

Our results show that both techniques produce similar estimates. Some examples for the simulated data are shown in the figure: $\hat{\mu}$ from INLA and $\hat{\mu}$ from MCMC are very similar, but the standard errors (se) behave differently. For INLA, $\hat{\mu}$ and se appear unrelated, whereas for MCMC se increases with $\hat{\mu}$.

[Figure: for Scenario 1.1.1, Scenario 2.2.2 and Scenario 3.3.3 (simulation 1 in each case), three plots: $\hat{\mu}$ vs se for INLA and $\hat{\mu}$ vs se for MCMC (axes: Mean, Standard Error), and $\hat{\mu}$ INLA vs $\hat{\mu}$ MCMC (axes: INLA, MCMC).]

Concluding Remarks

For situations with a high number of small areas, the following remarks should be taken into account when choosing a technique to estimate risk maps:
• INLA and MCMC estimate similar smoothed risk maps.
• INLA standard errors are larger.
• For scenarios with higher heterogeneity, both techniques produce wider intervals.
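The 90% CI empirical coverage summarised in the Results could be computed along the following lines. This is a minimal sketch, assuming that coverage is the proportion of areas whose true simulated value falls inside the fitted 90% interval; the poster does not state which quantity the intervals refer to, and the function names and toy numbers are hypothetical.

```python
from statistics import mean


def empirical_coverage(truth, lower, upper):
    """Proportion of areas whose true value lies inside [lower, upper]."""
    inside = [lo <= t <= hi for t, lo, hi in zip(truth, lower, upper)]
    return sum(inside) / len(inside)


def scenario_coverage(replicates):
    """y_ijk: mean empirical coverage over the replicates of one scenario.

    Each replicate is a (truth, lower, upper) triple of equal-length lists,
    where lower and upper are the 5% and 95% posterior limits from a fit.
    """
    return mean(empirical_coverage(*rep) for rep in replicates)


# Toy input: 2 replicates of 4 areas (the poster uses 25 replicates per scenario).
replicates = [
    ([1.0, 2.0, 3.0, 4.0], [0.5, 1.5, 3.2, 3.0], [1.5, 2.5, 4.0, 5.0]),  # 3 of 4 covered
    ([1.0, 2.0, 3.0, 4.0], [0.8, 1.0, 2.0, 3.5], [1.2, 3.0, 3.5, 4.5]),  # 4 of 4 covered
]
y = scenario_coverage(replicates)  # (0.75 + 1.0) / 2 = 0.875
```

A well-calibrated 90% interval should give values near 0.90. Values clearly above that, as in several INLA scenarios in the tables, indicate intervals that are wider than necessary.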
