On Model Selection Criteria for Climate Change Impact Studies · Motivation Empirical Practice:...
-
Upload
trinhkhanh -
Category
Documents
-
view
213 -
download
0
Transcript of On Model Selection Criteria for Climate Change Impact Studies · Motivation Empirical Practice:...
On Model Selection Criteria for Climate Change Impact Studies
Xiaomeng Cui Dalia Ghanem Todd KuffnerUC Davis UC Davis WUSTL
December 8, 2017University of Southern California
1 / 47
Motivation
Empirical Literature
Setting: For i = 1, 2, . . . , n, t = 1, 2, . . . ,T , we observe a scalar outcome Yit ; inaddition, for each i and t, we observe a regressor Witτ at a higher frequencyτ = 1, 2, . . . ,H.
I Weather and Climate Change Impacts (Review: Dell, Jones and Olken, JEL2014)Climate Change and Agriculture: Deschenes and Greenstone (2007, AER),Fisher et al (2012, AER), Schlenker and Roberts (2009, PNAS), Schlenkerand Lobell (2010, ERL), Feng, Krueger and Oppenheimer (2010, PNAS),Lobell, Schlenker and Costa-Roberts (2011, Science), Cui (2017), ...Climate Change and Health: Deschenes, Greenstone and Guryan (2009,AER), Deschenes and Greenstone (2011, AEJ: Applied), ...
I Pollution ImpactsBoone et al (2013), Burney and Ramanathan (2014, PNAS), Carter et al(2016, Scientific Reports)
2 / 47
Motivation
Empirical Practice: CHOOSE summary statistics of T kit ({Witτ}Hτ=1) as
regressors in a linear fixed effects model
yit =K∑
k=1
βkT kit ({Witτ}Hτ=1) + ai + uit (1)
Note: straightforward to accommodate year fixed effects and other control variables
Annual Mortality Rate Daily Temperature
1970 1975 1980 1985
0.000
0.005
0.010
0.015
0.020
Year
Annua
l Mort
ality R
ate
1970 1975 1980 1985
020
4060
80
Year
Daily
Mean
Temper
ature
Examples of Summary Statistics
- temperature bins
- various degree day measures
- annual or seasonal means
- quadratic function of annual mean
- ...
3 / 47
Motivation
Empirical Practice: CHOOSE summary statistics of T kit ({Witτ}Hτ=1) as
regressors in a linear fixed effects model
yit =K∑
k=1
βkT kit ({Witτ}Hτ=1) + ai + uit (1)
Note: straightforward to accommodate year fixed effects and other control variables
Annual Mortality Rate Daily Temperature
1970 1975 1980 1985
0.000
0.005
0.010
0.015
0.020
Year
Annua
l Mort
ality R
ate
1970 1975 1980 1985
020
4060
80
Year
Daily
Mean
Temper
ature
Examples of Summary Statistics
- temperature bins
- various degree day measures
- annual or seasonal means
- quadratic function of annual mean
- ...
3 / 47
Motivation
Goals of the Literature
1. Causal Inferenceidentify the damage/response function that governs the relationshipbetween Yit and the high-frequency regressor {Witτ}Hτ=1
2. Predictionestimate the impact of future climate change on outcomes of interest
This paper
I importance of model selection in this problem
I special features of the model selection problem faced by empirical researchersI properties of cross-validation (CV) and information criteria (IC) as model
selection tools for climate change impact studies- importance of guarding against over-fitting- mean squared error as a loss function
Disclaimer: the paper will focus on linear fixed effects models only.
4 / 47
Outline of the Talk
Why Model Selection in Climate Change Impact Studies?
Model Selection Problem
Overview of Model Selection Criteria
Simulation Study
Empirical Illustration
5 / 47
Why Model Selection in Climate Change Impact Studies?An Economic Reason
Consider the following models:
I Annual Mean (A): Yit = −0.5Wit + ai + uitI Quadratic in Annual Mean (QinA): Yit = 0.2Wit − 0.02W 2
it + ai + uit
I Quarterly Mean (Q): Yit = 0.5WQ1it − 0.75WQ2
it + ai + uit
Different damages functions have different policy implications!
I Differences in responses across space and seasons- A: an additional degree in annual mean temperature will affect all areas in
the same way.- QinA: an additional degree in annual mean temperature will be more/less
harmful in warmer regions- Q: additional warming in the summer, but beneficial in the winter
I Different adaptation mechanismsE.g., suppose the outcome is agricultural yield,
- QinA: agriculture will move to historically cooler areas as they get warmer- Q: growing season can be shifted within the year in regions with warming
patterns in the winter
I Different predictions for future damages due to climate change
6 / 47
Why Model Selection in Climate Change Impact Studies?An Economic Reason
Consider the following models:
I Annual Mean (A): Yit = −0.5Wit + ai + uitI Quadratic in Annual Mean (QinA): Yit = 0.2Wit − 0.02W 2
it + ai + uit
I Quarterly Mean (Q): Yit = 0.5WQ1it − 0.75WQ2
it + ai + uit
Different damages functions have different policy implications!
I Differences in responses across space and seasons- A: an additional degree in annual mean temperature will affect all areas in
the same way.- QinA: an additional degree in annual mean temperature will be more/less
harmful in warmer regions- Q: additional warming in the summer, but beneficial in the winter
I Different adaptation mechanismsE.g., suppose the outcome is agricultural yield,
- QinA: agriculture will move to historically cooler areas as they get warmer- Q: growing season can be shifted within the year in regions with warming
patterns in the winter
I Different predictions for future damages due to climate change
6 / 47
Why Model Selection in Climate Change Impact Studies?A Statistical Reason
Simulation IllustrationDesignn = 100, T = 5, S = 500
I Wit is a random sample of counties from the NCDC temperature dataset forthe years 1968-1972
I “Fixed Effects”: ai |Wi ∼ N(0.5Wi , 1)
I Idiosyncratic Shocks: uit |Wi , ai are i.i.d. across i , mixture normal, mean 0,heteroskedastic and correlated across time
Design Details
I Outcome is generated using the following models:(i) Annual Mean: Yit = Wit + ai + uit
(ii) Quadratic in Annual Mean: Yit = 0.2Wit − 0.05W 2it + ai + uit
(iii) Quarterly Mean: Yit = −0.5WQ1it + 0.75W
Q2it + ai + uit
7 / 47
Why Model Selection in Climate Change Impact Studies?A Statistical Reason
Models Considered
- “Seasonal” Mean ModelsA Annual Mean ModelB Bi-Annual Mean ModelQ Quarterly Mean ModelM Monthly Mean Model
- Other ModelsQinA Quadratic in Annual Mean ModelBins 10◦ F bins
Simulation Summary StatisticsFor each model considered, we report:
- simulation Mean and S .D. of linear FE estimator
- probability of rejecting a 5% significance test (p0.05)
8 / 47
Why Model Selection in Climate Change Impact Studies?A Statistical Reason
DGP (Y ) Annual Mean Quadratic inAnnual Mean
Quarterly Mean
Model Mean S.D. p0.05 Mean S.D. p0.05 Mean S.D. p0.05
A A Mean 1 0.05 1 -5.50 0.05 1 -0.27 0.05 0.86
B B1 Mean 0.50 0.04 1 -2.81 0.04 1 -0.82 0.04 1B2 Mean 0.50 0.03 1 -2.69 0.03 1 0.30 0.03 1
Q Q1 Mean 0.25 0.02 1 -1.42 0.02 1 -0.50 0.02 1Q2 Mean 0.25 0.04 1 -1.27 0.04 1 0 0.04 0.06Q3 Mean 0.24 0.04 1 -1.31 0.04 1 0.75 0.04 1Q4 Mean 0.25 0.02 1 -1.36 0.02 1 0 0.02 0.04
M M1 Mean 0.08 0.02 1 -0.48 0.02 1 -0.17 0.02 1M2 Mean 0.08 0.02 0.92 -0.48 0.02 1 -0.17 0.02 1M3 Mean 0.08 0.01 1 -0.49 0.01 1 -0.17 0.01 1M4 Mean 0.08 0.03 0.89 -0.32 0.03 1 0 0.03 0.06M5 Mean 0.09 0.03 0.94 -0.54 0.03 1 0 0.03 0.07M6 Mean 0.08 0.03 0.78 -0.49 0.03 1 0 0.03 0.06M7 Mean 0.08 0.04 0.67 -0.34 0.04 1 0.25 0.04 1M8 Mean 0.08 0.03 0.82 -0.43 0.03 1 0.25 0.03 1M9 Mean 0.08 0.03 0.69 -0.46 0.03 1 0.25 0.03 1M10 Mean 0.08 0.03 0.84 -0.36 0.03 1 0 0.03 0.09M11 Mean 0.08 0.03 0.81 -0.53 0.03 1 0 0.03 0.05M12 Mean 0.09 0.02 0.99 -0.45 0.02 1 0 0.02 0.05
Notes: Qj denotes the jth quarter, Mj denotes the jth month.
9 / 47
Why Model Selection in Climate Change Impact Studies?A Statistical Reason
DGP(Y ) Annual Mean Quadratic inAnnual Mean
Quarterly Mean
Model Mean S.D. p0.05 Mean S.D. p0.05 Mean S.D. p0.05
QinA A Mean 1.01 0.32 0.89 0.21 0.32 0.14 1.91 0.32 0.95
A Mean2 0 0 0.06 -0.05 0 1 -0.02 0 0.99
Bins Bin (−∞, 10) -0.15 0.02 1 0.74 0.02 1 0.13 0.02 1Bin [10, 20) -0.11 0.01 1 0.55 0.01 1 0.14 0.01 1Bin [20, 30) -0.08 0.01 1 0.43 0.01 1 0.08 0.01 1Bin [30, 40) -0.06 0.01 1 0.29 0.01 1 0.02 0.01 0.80Bin [40, 50) -0.03 0.01 0.99 0.17 0.01 1 0.01 0.01 0.16Bin [60, 70) 0.03 0.01 0.96 -0.18 0.01 1 -0.04 0.01 1Bin [70, 80) 0.05 0.01 1 -0.31 0.01 1 0.01 0.01 0.01Bin [80, 90) 0.07 0.01 1 -0.42 0.01 1 0.04 0.01 0.75Bin [90,+∞) 0.24 0.18 0.37 -1.62 0.18 1 0.97 0.18 1
10 / 47
Why Model Selection in Climate Change Impact Studies?A Statistical Reason
DGP(Y ) Quadratic in Annual Mean
n =100 n=500 n=1000
A A Mean -5.50 0.05 1 -5.26 0.02 1 -5.25 0.02 1
QinA A Mean 0.21 0.32 0.14 0.19 0.15 0.26 0.20 0.11 0.46
A Mean2 -0.05 0 1 -0.05 0 1 -0.05 0 1
Q Q Mean 1 -1.42 0.02 1 -1.33 0.01 1 -1.33 0.01 1Q Mean 2 -1.27 0.04 1 -1.20 0.02 1 -1.22 0.01 1Q Mean 3 -1.31 0.04 1 -1.32 0.02 1 -1.32 0.01 1Q Mean 4 -1.36 0.02 1 -1.31 0.01 1 -1.31 0.01 1
Model Selection based on “Stars” (p-hacking)
I if one model is true, all other models will likely have significant coefficients
I less likely to choose QinA model when it is true
11 / 47
Model Selection Problem
Set of Models Considered: {Mj}Jj=1 where J <∞For each j , Mj
Yit =X jitβ
j + aji + ujit
where
- kj ≡ dim((X jit)′) = dim(βj )
- Wit ≡ {Witτ}Hτ=1
- X jit = fj (Wit), where X j
it is a 1× kj vector.
Remarks
I All models considered are linear in the parameters and separable in aji and ujitI We assume that the true model M∗ =Mj for some j .
I The key difference between the models is the included regressors.
I Additional regressors and fixed effects can be readily accommodated.
Special feature of this problem is that all models considered consist of differentsummary statistics of the same underlying high-frequency variables.
12 / 47
Model Selection Problem: Definitions
For two models M1 and M2, assume wlog k1 < k2.
I nested: if M1 is a restricted version of M2.Example: annual and quarterly mean model
I overlapping: if restricted versions of M1 and M2 yield the same model.Example: quadratic in annual mean and quarterly mean model
I strictly non-nested : are neither nested nor overlapping
13 / 47
Overview of Model Selection Criteria
I Cross ValidationMCCV (Monte Carlo Cross-Validation)
- Create b random subsamples (training sets) of the original sample- For each subsample, estimate each model under consideration (training)- Use the complement of each subsample (test data) to predict each model
out-of-sample- Calculate the mean squared error (MSE) across subsamples- Decision Rule: Choose model that minimizes the MSE of the MCCV
procedure
Source: https://stats.stackexchange.com/questions/51416/k-fold-vs-monte-carlo-cross-validation
14 / 47
Overview of Model Selection Criteria
I Information CriteriaFor linear models
GICλn = n log(MSEn) + λnkj (2)
- εi = yi − y ji
- MSEn =∑n
i=1 ε2i /n is the MSE of the original sample
- λn is the penalty term for the dimension of the model
AIC: λn = 2BIC: λn = log(n)
- kj is the number of parameters in Mj
- Decision Rule: Choose model that minimizes the information criterion.
15 / 47
Overview of Model Selection Criteria: Consistent Model Selection
Definition: P(choosing the true model)→ 1 as n→∞
I MC Cross-validation (Shao 1993)
- MCCV can deliver consistent model selection if training sample size (nc) isproportionately smaller with larger samples and proportionately more to thetesting sample.nc/n → 0
- Formal results only available/possible for homoskedastic and seriallyuncorrelated errors.
I Information Criteria (Vuong 1989, Shao 1997, Sin and White 1996, Hong andPreston 2012)
- AIC tends to overfit.- BIC is consistent, but not for non-nested model selection.
- GICλn with λn =√
n log log(n) is consistent for all cases.- Formal results can accommodate heteroskedasticity and serial correlation.
In practice, we never know the true model, so what?
I if a model that nests the true model is considered, then a consistentprocedure will choose that model.
I if no model nests the true model is considered, then a consistent procedurewill choose the model with the smallest mean squared error asymptotically.
16 / 47
Overview of Model Selection Criteria: Consistent Model Selection
Definition: P(choosing the true model)→ 1 as n→∞
I MC Cross-validation (Shao 1993)
- MCCV can deliver consistent model selection if training sample size (nc) isproportionately smaller with larger samples and proportionately more to thetesting sample.nc/n → 0
- Formal results only available/possible for homoskedastic and seriallyuncorrelated errors.
I Information Criteria (Vuong 1989, Shao 1997, Sin and White 1996, Hong andPreston 2012)
- AIC tends to overfit.- BIC is consistent, but not for non-nested model selection.
- GICλn with λn =√
n log log(n) is consistent for all cases.- Formal results can accommodate heteroskedasticity and serial correlation.
In practice, we never know the true model, so what?
I if a model that nests the true model is considered, then a consistentprocedure will choose that model.
I if no model nests the true model is considered, then a consistent procedurewill choose the model with the smallest mean squared error asymptotically.
16 / 47
Simulation Study
PreviewI When the true model is considered or the models considered nest it.
- AIC and MCCV-p tend to overfit, with the former fairing worse in thatregard than the latter.
- BIC, MCCV-Shao, SW1 and SW2 choose the true model/the mostparsimonious model that nests the true model.
I When the true model is not considered and the models considered do not nestit.
- AIC, MCCV-p, BIC and MCCV-Shao tend to overfit.- SW1 and SW2 choose the most parsimonious model with the smallest MSE.
However, in this case, minimizing MSE may be misleading in terms of whatare important aspects of the damage function.
What is interesting here is that the relationship between the considered modelsand the true model is what matters for consistency, not how models relate to eachother as in “classical model selection problems”.
17 / 47
Simulation Study
Back to the Previous Design
Model Selection ProblemsNested Models
A Annual Mean Model
B Bi-Annual Mean Model
Q Quarterly Mean Model
M Monthly Mean Model(X j
itβj =
∑12s=1 β
jsW
Msit , where W
Msit =
∑τ∈Ms
Witτ/|Ms |)
Overlapping/Non-nested Models
A Annual Mean Model
QinA Quadratic in Annual Mean Model
Q Quarterly Mean Model
Bins 10◦ F bins Model
I We will consider the choice between 2, 3 and all models in each problem.
18 / 47
Simulation Study
Model Selection CriteriaI MCCV (nc/n)
- p = 0.75 (MCCV-p)
- pn = n−1/4 (MCCV-Shao)
- λnT = 2 (AIC)- λnT = log(nT ) (BIC)
- λnT =√
nT log(log(nT )) (SW1)
- λnT =√
nT log(nT ) (SW2)
0 2000 4000 6000 8000 10000
050
100
150
N
Pena
lty
BIC
SW1SW2
19 / 47
Simulation Study
DGP=A (Annual Mean): Yit = Wit + ai + uit
Nested ModelsA B Q M
Sim MSE 0.583 0.584 0.584 0.583
n = 100 n = 500MCCV GICλnT
MCCV GICλnT
Models p Shao AIC BIC SW1 SW2 p Shao AIC BIC SW1 SW2
A 0.99 1 0.88 1 1 1 0.99 1 0.89 1 1 1M 0.01 0 0.12 0 0 0 0.01 0 0.11 0 0 0
A 0.88 0.95 0.78 0.98 1 1 0.88 0.98 0.80 0.98 1 1B 0.12 0.05 0.22 0.02 0 0 0.12 0.02 0.20 0.02 0 0
A 0.94 1 0.79 1 1 1 0.93 1 0.82 1 1 1Q 0.06 0 0.21 0 0 0 0.07 0 0.18 0 0 0
A 0.87 0.95 0.71 0.98 1 1 0.87 0.98 0.74 0.98 1 1B 0.12 0.05 0.19 0.02 0 0 0.12 0.02 0.17 0.02 0 0M 0.01 0 0.10 0 0 0 0.01 0 0.09 0 0 0
A 0.93 1 0.75 1 1 1 0.92 1 0.77 1 1 1Q 0.06 0 0.17 0 0 0 0.07 0 0.16 0 0 0M 0.01 0 0.07 0 0 0 0.01 0 0.08 0 0 0
A 0.85 0.95 0.68 0.97 1 1 0.84 0.98 0.72 0.98 1 1B 0.11 0.05 0.17 0.02 0 0 0.11 0.02 0.15 0.02 0 0Q 0.03 0 0.15 0 0 0 0.05 0 0.13 0 0 0
B 0.92 0.99 0.71 0.99 1 1 0.87 1 0.69 1 1 1Q 0.07 0.01 0.20 0.01 0 0 0.11 0 0.21 0 0 0M 0.01 0 0.09 0 0 0 0.02 0 0.10 0 0 0
A 0.84 0.95 0.64 0.97 1 1 0.83 0.98 0.68 0.98 1 1B 0.11 0.05 0.15 0.02 0 0 0.11 0.02 0.13 0.02 0 0Q 0.03 0 0.13 0 0 0 0.05 0 0.11 0 0 0M 0.01 0 0.07 0 0 0 0.01 0 0.08 0 0 0
I MCCV-p and AIC is not vanishing as n increases.
Definition. Sim MSE is the mean squared error for the full sample of 3074 counties using the
simulation “pseudo-true”, i.e. simulation mean of the estimated parameters of each model.20 / 47
Simulation Study
DGP=A (Annual Mean): Yit = Wit + ai + uit
Non-Nested ModelsA QinA Q Bins
Sim MSE 0.583 0.583 0.584 0.610
n = 100 n = 500MCCV GICλnT
MCCV GICλnT
Models p Shao AIC BIC SW1 SW2 p Shao AIC BIC SW1 SW2
A 0.87 0.95 0.77 0.95 1 1 0.88 0.98 0.76 0.97 1 1QinA 0.13 0.05 0.23 0.05 0 0 0.12 0.02 0.24 0.03 0 0
A 1 1 0.98 1 1 1 1 1 1 1 1 1Bins 0 0 0.02 0 0 0 0 0 0 0 0 0
QinA 0.99 1 0.98 1 1 1 1 1 1 1 1 1Bins 0.01 0 0.02 0 0 0 0 0 0 0 0 0
A 0.87 0.95 0.75 0.95 1 1 0.98 0.81 0.76 0.97 1 1QinA 0.13 0.05 0.23 0.05 0 0 0.02 0.19 0.24 0.03 0 0Bins 0 0 0.02 0 0 0 0 0 0 0 0 0
A 0.94 1 0.77 1 1 1 0.93 1 0.81 1 1 1Q 0.06 0 0.20 0 0 0 0.07 0 0.19 0 0 0
Bins 0 0 0.02 0 0 0 0 0 0 0 0 0
A 0.82 0.95 0.62 0.95 1 1 0.82 0.98 0.62 0.97 1 1QinA 0.13 0.05 0.20 0.05 0 0 0.11 0.02 0.22 0.03 0 0
Q 0.05 0 0.18 0 0 0 0.07 0 0.16 0 0 0
QinA 0.82 0.93 0.72 0.99 1 1 0.83 0.99 0.77 1 1 1Q 0.17 0.07 0.26 0.01 0 0 0.17 0.01 0.23 0 0 0
Bins 0.01 0 0.02 0 0 0 0 0 0 0 0 0
A 0.82 0.95 0.60 0.95 1 1 0.82 0.98 0.62 0.97 1 1QinA 0.13 0.05 0.20 0.05 0 0 0.11 0.02 0.22 0.03 0 0
Q 0.05 0 0.18 0 0 0 0.07 0 0.16 0 0 0Bins 0 0 0.02 0 0 0 0 0 0 0 0 0
I All models considered nest the true model, except the Bins, which has a larger MSE.
I MCCV-Shao, BIC, SW1 and SW2 exhibit consistency.
21 / 47
Simulation Study
DGP=Q (Quarterly Mean): Yit = −0.25WQ1it + 0.75WQ2
it + ai + uit
Nested ModelsA B Q M
Sim. MSE 1.57 1.22 0.58 0.58
n = 100 n = 500MCCV GICλnT
MCCV GICλnT
Models p Shao AIC BIC SW1 SW2 p Shao AIC BIC SW1 SW2
A 0 0 0 0 0 1 0 0 0 0 0 0M 1 1 1 1 1 0 1 1 1 1 1 1
A 0 0 0 0 0 0 0 0 0 0 0 0B 1 1 1 1 1 1 1 1 1 1 1 1
A 0 0 0 0 0 0 0 0 0 0 0 0Q 1 1 1 1 1 1 1 1 1 1 1 1
A 0 0 0 0 0 0 0 0 0 0 0 0B 0 0 0 0 0.01 1 0 0 0 0 0 0M 1 1 1 1 0.99 0 1 1 1 1 1 1
A 0 0 0 0 0 0 0 0 0 0 0 0Q 0.98 1 0.86 1 1 1 0.98 1 0.90 1 1 1M 0.02 0 0.14 0 0 0 0.02 0 0.10 0 0 0
A 0 0 0 0 0 0 0 0 0 0 0 0B 0 0 0 0 0 0 0 0 0 0 0 0Q 1 1 1 1 1 1 1 1 1 1 1 1
B 0 0 0 0 0 0 0 0 0 0 0 0Q 0.98 1 0.86 1 1 1 0.98 1 0.90 1 1 1M 0.02 0 0.14 0 0 0 0.02 0 0.10 0 0 0
A 0 0 0 0 0 0 0 0 0 0 0 0B 0 0 0 0 0 0 0 0 0 0 0 0Q 0.98 1 0.86 1 1 1 0.98 1 0.90 1 1 1M 0.02 0 0.14 0 0 0 0.02 0 0.10 0 0 0
I SW2 has a tendency to underfit for smaller sample sizes, but problem eventually subsides.
22 / 47
Simulation Study
DGP=Q (Quarterly Mean): Yit = −0.25WQ1it + 0.75WQ2
it + ai + uit
Non-nested ModelsA QinA Q Bins
Sim. MSE 1.57 1.57 0.58 1.14
n = 100 n = 500MCCV GICλnT
MCCV GICλnT
Models p Shao AIC BIC SW1 SW2 p Shao AIC BIC SW1 SW2
A 0.03 0.16 0.01 0.07 1 1 0.09 0.52 0.03 0.47 1 1QinA 0.97 0.84 0.99 0.93 0 0 0.91 0.48 0.97 0.53 0 0
A 0 0 0 0 1 1 0 0 0 0 0 1Bins 1 1 1 1 0 0 1 1 1 1 1 0
QinA 0 0 0 0 1 1 0 0 0 0 0 1Bins 1 1 1 1 0 0 1 1 1 1 1 0
A 0 0 0 0 1 1 0 0 0 0 0 1QinA 0 0 0 0 0 0 0 0 0 0 0 0Bins 1 1 1 1 0 0 1 1 1 1 1 0
A 0 0 0 0 0 0 0 0 0 0 0 0Q 1 1 1 1 1 1 1 1 1 1 1 1
Bins 0 0 0 0 0 0 0 0 0 0 0 0
A 0 0 0 0 0 0 0 0 0 0 0 0QinA 0 0 0 0 0 0 0 0 0 0 0 0
Q 1 1 1 1 1 1 1 1 1 1 1 1
QinA 0 0 0 0 0 0 0 0 0 0 0 0Q 1 1 1 1 1 1 1 1 1 1 1 1
Bins 0 0 0 0 0 0 0 0 0 0 0 0
A 0 0 0 0 0 0 0 0 0 0 0 0QinA 0 0 0 0 0 0 0 0 0 0 0 0
Q 1 1 1 1 1 1 1 1 1 1 1 1B 0 0 0 0 0 0 0 0 0 0 0 0
I BIC and MCCV-Shao exhibit inconsistency issues in (A vs. QinA).
I SW1 and SW2 underfit for smaller samples (A vs. Bins).
23 / 47
Simulation Study
DGP=Q (Quarterly Mean): Yit = −0.25WQ1it + 0.75WQ2
it + ai + uit
Non-nested ModelsA QinA Q Bins
Sim. MSE 1.57 1.57 0.58 1.14
n = 1000 n = 3000
Models AIC BIC SW1 SW2 AIC BIC SW1 SW2
A 0.02 0.40 1 1 0 0 1 1QinA 0.98 0.60 0 0 1 1 0 0
A 0 0 0 0.62 0 0 0 0Bins 1 1 1 0.38 1 1 1 1
QinA 0 0 0 0 0 0 0 0Bins 1 1 1 1 1 1 1 1
A 0 0 0 0.62 0 0 0 0QinA 0 0 0 0 0 0 0 0Bins 1 1 1 0.38 1 1 1 1
A 0 0 0 0 0 0 0 0Q 1 1 1 1 1 1 1 1
Bins 0 0 0 0 0 0 0 0
A 0 0 0 0 0 0 0 0QinA 0 0 0 0 0 0 0 0
Q 1 1 1 1 1 1 1 1
QinA 0 0 0 0 0 0 0 0Q 1 1 1 1 1 1 1 1
Bins 0 0 0 0 0 0 0 0
A 0 0 0 0 0 0 0 0QinA 0 0 0 0 0 0 0 0
Q 1 1 1 1 1 1 1 1Bins 0 0 0 0 0 0 0 0
I BIC and MCC-Shao exhibit remain inconsistent for larger samples (A vs. QinA).
I Underfitting problem of SW1 and SW2 subsides (A vs. Bins).
24 / 47
Simulation Study
DGP=QinA (Quadratic in Annual Mean): Yit = 0.2Wit − 0.05W 2it + ai + uit
Nested ModelsA B Q M
Sim MSE 1.05 1.06 1.04 0.97
n = 1000 n = 3000
Models AIC BIC SW1 SW2 AIC BIC SW1 SW2
A 0 0 1 1 0 0 1 1M 1 1 0 0 1 1 0 0
A 1 1 1 1 1 1 1 1B 0 0 0 0 0 0 0 0
A 0.05 0.44 1 1 0 0.01 1 1Q 0.95 0.56 0 0 1 0.99 0 0
A 0 0 1 1 0 0 1 1B 0 0 0 0 0 0 0 0M 1 1 0 0 1 1 0 0
A 0 0 1 1 0 0 1 1Q 0 0 0 0 0 0 0 0M 1 1 0 0 1 1 0 0
A 0.05 0.44 1 1 0 0.01 1 1B 0 0 0 0 0 0 0 0Q 0.95 0.56 0 0 1 0.99 0 0
B 0 0 1 1 0 0 1 1Q 0 0 0 0 0 0 0 0M 1 1 0 0 1 1 0 0
A 0 0 1 1 0 0 1 1B 0 0 0 0 0 0 0 0Q 0 0 0 0 0 0 0 0M 1 1 0 0 1 1 0 0
I BIC is not consistent in the comparison between A and Q, which is interesting becausethey are nested models, but neither one of them nests the true model.
I MSE criterion may not be ideal for this problem, note that choosing monthly mean modelis misleading from a policy perspective.
25 / 47
Simulation Study
DGP=QinA (Quadratic in Annual Mean): Yit = 0.2Wit − 0.05W 2it + ai + uit
Non-nested ModelsA Q inA Q Bins
Sim MSE 1.05 0.58 1.04 1.59
n = 1000 n = 3000
AIC BIC SW1 SW2 AIC BIC SW1 SW2
A 0 0 0 0 0 0 0 0QinA 1 1 1 1 1 1 1 1
A 1 1 1 1 1 1 1 1Bins 0 0 0 0 0 0 0 0
QinA 1 1 1 1 1 1 1 1Bins 0 0 0 0 0 0 0 0
A 0 0 0 0 0 0 0 0QinA 1 1 1 1 1 1 1 1Bins 0 0 0 0 0 0 0 0
A 0.05 0.44 1 1 0 0.01 1 1Q 0.95 0.56 0 0 1 0.99 0 0
Bins 0 0 0 0 0 0 0 0
A 0 0 0 0 0 0 0 0QinA 1 1 1 1 1 1 1 1
Q 0 0 0 0 0 0 0 0
QinA 1 1 1 1 1 1 1 1Q 0 0 0 0 0 0 0 0
Bins 0 0 0 0 0 0 0 0
A 0 0 0 0 0 0 0 0QinA 1 1 1 1 1 1 1 1
Q 0 0 0 0 0 0 0 0Bins 0 0 0 0 0 0 0 0
I BIC is inconsistent in the choice between A and Q.
26 / 47
Simulation Study
Summary of Recommendations
I Information criteria are consistent under the most general conditions.
I AIC (MCCV-p) will tend to overfit.
I BIC (MCCV-Shao) leads to consistent model selection, unless no modelconsidered nests the true model.Important to consider the physical relationship to ensure that we are close tothe true model.
I SW1/SW2 leads to consistent model selection in general, however it mayunderfit in some settings.
27 / 47
Empirical Illustration
Data: a subsample of US counties in Deschenes & Greenstone (2011)(3074 counties in 49 states from 1968-1988)
Regression:
yit = X jitβ
j + θPit + δs t + ai + γtεit
I yit- overall annual mortality rate (number of deaths per 100,000, not age-adjusted)- age-specific annual mortality rate (number of deaths per 100,000) for < 1,
1− 45, 45− 65, > 65
I X jit = fj (Tit): temperature variables constructed based on daily average
temperature Tit ≡ {Titτ}Hτ=1
I Pit : annual total precipitation
I δs : state-level linear trends
I γt : year fixed effects
Note: This is NOT a replication of Deschenes and Greenstone (2011). The dataset in Deschenes
and Greenstone (2011) spans 1968-2002, use precipitation bins and age-adjusted mortality rate.
28 / 47
Empirical Illustration
Temperature specifications:
- No temperature variables (N)
- Annual mean (A)
- Quarterly mean (Q)
- Monthly mean (M)
- Quadratic in annual mean (QinA)
- Annual 20◦F bins (10◦Bins)reference bin: 40-60◦F
- Annual 10◦F bins (20◦Bins)reference bin: 40-50◦F
29 / 47
Empirical Illustration: Estimation Results
A Q M
Annual Mean -0.82(0.55)
Q1 Mean -0.85***(0.22)
Q2 Mean 0.09(0.43)
Q3 Mean 0.92*(0.45)
Q4 Mean -0.08(0.29)
M1 Mean -0.50**(0.18)
M2 Mean -0.21(0.18)
M3 Mean -0.34(0.22)
M4 Mean 0.50(0.27)
M5 Mean -0.12(0.26)
M6 Mean -0.70**(0.26)
M7 Mean 0.78*(0.33)
M8 Mean 0.43(0.32)
M9 Mean 0.01(0.29)
M10 Mean 0.53*(0.25)
M11 Mean 0.37(0.22)
M12 Mean -0.46*(0.19)
Observations 64,554 64,554 64,554
Notes: All regressions control for total precipitation, state linear trends, county and year fixed effects.Standard errors (in parentheses) are clustered at county-level. Significance: * (p < 0.05), ** (p < 0.01),*** (p < 01). 30 / 47
Empirical Illustration: Estimation Results
A QinA 20◦FBins
10◦FBins
Annual Mean -0.82 -2.52(0.55) (3.29)
Annual Mean2 0.02(0.03)
[−∞,20] 0.21*(0.09)
[20,40] 0.16**(0.06)
[60,80] -0.01(0.06)
[80,+∞] 0.11(0.08)
[−∞,10] 0.29*(0.13)
[10,20] 0.19(0.13)
[20,30] 0.17*(0.08)
[30,40] 0.20**(0.07)
[50,60] 0.05(0.07)
[60,70] 0.01(0.08)
[70,80] 0.03(0.08)
[80,+∞] 0.15(0.09)
Observations 64,554 64,554 64,554 64,554
Notes: All regressions control for total precipitation, state linear trends, county and year fixed effects.
Standard errors (in parentheses) are clustered at county-level. Significance: * (p < 0.05), ** (p < 0.01),
*** (p < 01).
31 / 47
Empirical Illustration: Model Selection
Nested ModelsN A Q M
MSE 13,317 13,316 13,313 13,307
AIC 613,193 613,192 613,182 613,171
BIC 613,819 613,827 613,844 613,906
SW1 640,241 640,632 641,798 644,923
SW2 671,398 672,240 674,761 681,499
MCCV 13423 13423 13422 13421
(p=0.75,b=1000)
MCCV-Shao 11958 11959 11962 11971
(p=0.13,b=6148)
Non-Nested ModelsN A Q in A Bin 20F Bin 10F
MSE 13,317 13,316 13,316 13,313 13,313
AIC 613,193 613,192 613,194 613,184 613,192
BIC 613,819 613,827 613,838 613,847 613,891
SW1 640,241 640,632 641,025 641,800 643,376
SW2 671,398 672,240 673,086 674,763 678,145
MCCV 13423 13423 13424 13422 13424
(p=0.75,b=1000)
MCCV-Shao 11958 11959 11960 11961 11967
(p=0.13,b=6148)
N: No temperature variables, A: Annual mean, Q: Quarterly mean, M: Monthly mean,QinA: Quadratic in annual mean, Bin 10F: Annual 10◦F bins (reference bin: 40-50◦F),Bin 20F: Annual 20◦F bins (reference bin: 40-60◦F) 32 / 47
Empirical Illustration: Estimation Results (Age: >65)
A Q M
Annual Mean -16.84***(3.48)
Q1 Mean -10.27***(1.56)
Q2 Mean -2.79(2.58)
Q3 Mean 7.47**(2.86)
Q4 Mean -2.64(1.85)
M1 Mean -3.30**(1.13)
M2 Mean -4.14***(1.17)
M3 Mean -3.87**(1.43)
M4 Mean 0.81(1.72)
M5 Mean -0.25(1.59)
M6 Mean -3.60*(1.72)
M7 Mean 2.85(2.04)
M8 Mean 3.13(1.93)
M9 Mean 1.21(1.85)
M10 Mean -0.19(1.69)
M11 Mean 1.59(1.23)
M12 Mean -2.40*(1.17)
Observations 64,554 64,554 64,554
Notes: All regressions control for total precipitation, state linear trends, county and year fixed effects.Standard errors (in parentheses) are clustered at county-level. Significance: * (p < 0.05), ** (p < 0.01),*** (p < 01). 33 / 47
Empirical Illustration: Estimation Results (Age: >65)
A QinA 20◦FBins
10◦FBins
Annual Mean -16.84*** -42.68*(3.48) (18.79)
Annual Mean2 0.25(0.17)
[−∞,20] 2.74***(0.62)
[20,40] 0.84*(0.35)
[60,80] -0.95*(0.37)
[80,+∞] -0.31(0.49)
[−∞,10] 3.25***-0.78
[10,20] 1.91*-0.88
[20,30] 0.57(0.52)
[30,40] 0.65(0.45)
[50,60] -0.65(0.45)
[60,70] -1.46**(0.50)
[70,80] -1.16*(0.49)
[80,+∞] -0.61(0.57)
Observations 64,554 64,554 64,554 64,554
Notes: All regressions control for total precipitation, state linear trends, county and year fixed effects.Standard errors (in parentheses) are clustered at county-level. Significance: * (p < 0.05), ** (p < 0.01),*** (p < 01).
34 / 47
Empirical Illustration: Model Selection (Age: >65)
Nested ModelsN A Q M
MSE 469,982 469,728 469,403 469,321
AIC 843,242 843,209 843,171 843,175
BIC 843,869 843,845 843,833 843,911
SW1 870,290 870,649 871,787 874,927
SW2 901,447 902,258 904,750 911,503
MCCV 472,738 472,506 472,246 472,333
(p=0.75,b=1000)
MCCV-Shao 479,201 479,033 478,954 479,489
(p=0.13,b=6148)
Non-Nested ModelsN A Q in A Bin 20F Bin 10F
MSE 469,982 469,728 469,712 469,579 469,579
AIC 843,242 843,209 843,209 843,195 843,203
BIC 843,869 843,845 843,853 843,857 843,902
SW1 870,290 870,649 871,041 871,811 873,387
SW2 901,447 902,258 903,101 904,774 908,156
MCCV 472,738 472,506 472,508 472,458 472,491
(p=0.75,b=1000)
MCCV-Shao 479,201 479,033 479,085 479,141 479,376
(p=0.13,b=6148)
N: No temperature variables, A: Annual mean, Q: Quarterly mean, M: Monthly mean,QinA: Quadratic in annual mean, Bin 10F: Annual 10◦F bins (reference bin: 40-50◦F),Bin 20F: Annual 20◦F bins (reference bin: 40-60◦F)
35 / 47
Empirical Illustration: Summary
I For the mortality rate > 65, BIC/MCCV-Shao choose the quarterly meanmodel.The result is intuitive, since it shows warming in the winter reduces themortality rate but warming in the summer increases it.
I For overall mortality rate and for all other age groups, all consistent modelselection criteria (BIC, etc.) choose a model with no temperaturevariables/annual mean model (which is not significant).
I The results are qualitatively consistent with Deschenes and Greenstone(2011), even though we use a different sample size and specification.
36 / 47
Concluding Remarks
Summary
I The model selection problem examined here is relevant whether we areinterested in examining climate change impacts or adaptation to climatechange.
I Practical recommendations- BIC behaves similarly to MCCV-Shao, and it is less computationally
intensive.- BIC is inconsistent when at least one of the models under consideration nests
the true model. (This is different from the typical model selection setting.)- SW1 can guard against overfitting, but it can underfit in some cases.
Further Questions/Caveats
I Prediction of impacts of future climate change- Is MSE the right criterion?
I Post-selection inference- Data splitting?
I Nonseparability in unobservables- e.g. random coefficient models
I Infinite-dimensional true model
Thank you!
37 / 47
Concluding Remarks
Summary
I The model selection problem examined here is relevant whether we areinterested in examining climate change impacts or adaptation to climatechange.
I Practical recommendations- BIC behaves similarly to MCCV-Shao, and it is less computationally
intensive.- BIC is inconsistent when at least one of the models under consideration nests
the true model. (This is different from the typical model selection setting.)- SW1 can guard against overfitting, but it can underfit in some cases.
Further Questions/Caveats
I Prediction of impacts of future climate change- Is MSE the right criterion?
I Post-selection inference- Data splitting?
I Nonseparability in unobservables- e.g. random coefficient models
I Infinite-dimensional true model
Thank you!37 / 47