
Identification of Static and Dynamic Models of Strategic Interactions

Lecture notes by Han Hong

Department of Economics, Stanford University

5th June 2007

Identifying Dynamic Discrete Decision Processes: two periods

• Thierry Magnac and David Thesmar (2002)

• Two period model.

• i ∈ I = {1, . . . , K}.

• State variable h = (x, ε).

• ε = (ε_1, . . . , ε_K).

• Period 1 utility: u_i(x, ε).

• Next period state variables: h′ = (x′, ε′), drawn conditional on h = (x, ε).

Assumptions

• Additive separability

∀ i ∈ I, u_i(x, ε) = u_i^*(x) + ε_i,

where ε is independent of x.

• Conditional independence: the random preference shocks ε′ and ε at the two periods are independent of each other, and independent of x and d = i.

• Discrete support: the support of the first period state variable x (resp. second period x′) is X (resp. X′). The joint support X = X ∪ X′ is discrete and finite, i.e.,

X = {x_1, . . . , x_{#X}}

• Transition matrix of the (x, ε) process:

P(h′ | h, d) = P(x′, ε′ | x, d) = G(ε′) P(x′ | x, d)

• Bellman equation:

v_i(x, ε) = u_i^*(x) + ε_i + β E[ max_j v_j(x′, ε′) | x, d = i ]

• Decompose

v_i(x, ε) = v_i^*(x) + ε_i

where

v_i^*(x) = u_i^*(x) + β E[ max_j ( v_j^*(x′) + ε_j′ ) | x, d = i ]

• What can be recovered from the data:

∀ (d, d′) ∈ I², ∀ (x, x′) ∈ X × X′:

P(d′, x′, d | x) = P(d′ | x′) P(x′ | x, d) P(d | x)

• P(x′ | x, d) is nonparametrically specified and is exactly identified from the data.

• Can P(d′ | x′) and P(d | x) be used to identify the structural parameters

b = ( u_1^*(X), . . . , u_K^*(X), v_1^*(X′), . . . , v_K^*(X′), G, β ),

where f(X) is a short-cut for {f(x), ∀ x ∈ X}? For example,

u_1^*(X) = {u_1^*(x), ∀ x ∈ X}.

• The probability that the agent chooses d = i given the structure b and observable state variable x: ∀ (x, i) ∈ X × I,

p_i(x; b) = P( v_i^*(x; b) + ε_i = max_j ( v_j^*(x; b) + ε_j ) | x, b ).

• Definition of identification

• Observable vector of choice probabilities:

p(x) = (p_1(x), . . . , p_K(x))

• Number of observable choice probabilities versus number of parameters that can be identified.

• Mapping between the observable choice probabilities and the vector of value functions:

∀ (x, i) ∈ X × I, v_i^*(x) = v_K^*(x) + q_i( p(x); G )

• There is no loss of generality in setting v_K^*(X′) = 0. Then for i = 1, . . . , K − 1:

v_i^*(X′) = q_i( p(X′); G )

• There is also no loss of generality in setting u_K^*(x) = 0.

• Expected second period value function: for

v^*(x′) = ( v_1^*(x′), . . . , v_K^*(x′) ),

define

R( v^*(x′); G ) = E_G max_{i ∈ I} ( v_i^*(x′) + ε_i ).

• Decompose the first period total utility function:

v_i^*(x) = u_i^*(x) + β E[ R( v^*(x′); G ) | x, d = i ]

• Recall

q_i(x) = v_i^*(x) − v_K^*(x).

Now

v_i^*(x) = u_i^*(x) + β E[ R( v^*(x′); G ) | x, d = i ]

and since u_K^*(x) = 0:

v_K^*(x) = β E[ R( v^*(x′); G ) | x, d = K ]

• Combine these relations:

q_i(x) = v_i^*(x) − v_K^*(x)
       = u_i^*(x) + β E[ R( v^*(x′); G ) | x, d = i ] − β E[ R( v^*(x′); G ) | x, d = K ]

• Therefore u_i^*(x) can be recovered by

u_i^*(x) = q_i(x) − β E[ R( v^*(x′); G ) | x, d = i ] + β E[ R( v^*(x′); G ) | x, d = K ]

• So given β, G, u_K^*(x) = 0 and v_K^*(x′) = 0, the remaining utility functions u_i^*(x), i = 1, . . . , K − 1, and the remaining second period value functions v_i^*(x′), i = 1, . . . , K − 1, can be identified, as the sketch below illustrates.

• Estimation follows from identification.
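For concreteness, here is a minimal numerical sketch of this recipe, assuming logit shocks (G is the i.i.d. extreme value distribution), a known β, and, for simplicity, that the first and second period supports coincide. The arrays `p1`, `p2` and `F` are hypothetical objects that would be estimated from the data.

```python
import numpy as np

# Hypothetical estimated inputs (taking X = X' for simplicity):
#   p1[x, i]    : first period choice probabilities P(d = i | x)
#   p2[x, i]    : second period choice probabilities P(d' = i | x')
#   F[i, x, x'] : transition probabilities P(x' | x, d = i)
EULER = 0.5772156649015329  # mean of the extreme value shock

def recover_utilities(p1, p2, F, beta):
    # Logit inversion with v*_K(X') = 0: v*_i(x') = log p_i(x') - log p_K(x')
    v2 = np.log(p2) - np.log(p2[:, [-1]])
    # R(v*(x'); G) = E_G max_i (v*_i(x') + eps_i) = Euler + log-sum-exp under logit
    R = EULER + np.log(np.exp(v2).sum(axis=1))
    # q_i(x) = log p_i(x) - log p_K(x) from first period choices
    q = np.log(p1) - np.log(p1[:, [-1]])
    # E[R(v*(x'); G) | x, d = i] through the transition matrices
    ER = np.stack([F[i] @ R for i in range(p1.shape[1])], axis=1)
    # u*_i(x) = q_i(x) - beta E[R | x, d = i] + beta E[R | x, d = K]
    return q - beta * ER + beta * ER[:, [-1]]
```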

• Exclusion restrictions might identify β: if ∃ (x_1, x_2) ∈ X², ∃ i ∈ I, such that x_1 ≠ x_2 and u_i^*(x_1) = u_i^*(x_2), but x_1 and x_2 still generate different transition probabilities P(x′ | x, d), then q_i(x_1) should differ from q_i(x_2).

• Identifying β from the resulting moment condition (a sketch of the solution follows below):

q_i(x_1) − β E[ R( v^*(x′); G ) | x_1, d = i ] + β E[ R( v^*(x′); G ) | x_1, d = K ]
− { q_i(x_2) − β E[ R( v^*(x′); G ) | x_2, d = i ] + β E[ R( v^*(x′); G ) | x_2, d = K ] } = 0.

• Parametric restriction: they claim it can also be used to identify G. Is this true?
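Continuing the hypothetical sketch above (with `q` and `ER` as computed inside `recover_utilities`): in the two period model neither q nor E[R | x, d] depends on β, so the moment condition is linear in β and solves in closed form.

```python
def identify_beta(q, ER, i, x1, x2):
    # Exclusion restriction u*_i(x1) = u*_i(x2) gives
    #   q_i(x1) - q_i(x2)
    #     = beta * [(E[R|x1,i] - E[R|x1,K]) - (E[R|x2,i] - E[R|x2,K])],
    # which is linear in beta.
    num = q[x1, i] - q[x2, i]
    den = (ER[x1, i] - ER[x1, -1]) - (ER[x2, i] - ER[x2, -1])
    return num / den  # requires den != 0: transitions must differ at x1, x2
```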

Single agent dynamic discrete choice model: infinite horizon

• Players are forward looking.

• Infinite Horizon, Stationary, Markov Transition

• Now players maximize expected discounted utility using discount factor β:

W_i(s, ε_i; σ) = max_{a_i ∈ A_i} { Π_i(a_i, s) + ε_i(a_i) + β ∫ Σ_{a_−i} W_i(s′, ε_i′; σ) g(s′ | s, a_i, a_−i) σ_−i(a_−i | s) f(ε_i′) dε_i′ }

• Definition: A Markov Perfect Equilibrium is a collection of δ_i(s, ε_i), i = 1, . . . , n, such that for all i, all s and all ε_i, δ_i(s, ε_i) maximizes W_i(s, ε_i; σ_i, σ_−i).

• Conditional independence:

• ε distributed i.i.d. over time.

• State variables evolve according to g(s′ | s, a_i, a_−i).

• Define the choice specific value function

V_i(a_i, s) = Π_i(a_i, s) + β E[ V_i(s′) | s, a_i ].

• Players choose a_i to maximize V_i(a_i, s) + ε_i(a_i).

• Ex ante value function (social surplus function):

V_i(s) = E_{ε_i} max_{a_i} [ V_i(a_i, s) + ε_i(a_i) ]
       = G( V_i(a_i, s), ∀ a_i = 0, . . . , K )
       = G( V_i(a_i, s) − V_i(0, s), ∀ a_i = 1, . . . , K ) + V_i(0, s)

• When the error terms are extreme value distributed (a numerically stable implementation is sketched below):

V_i(s) = log Σ_{k=0}^{K} exp( V_i(k, s) )
       = log Σ_{k=0}^{K} exp( V_i(k, s) − V_i(0, s) ) + V_i(0, s).
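A small sketch of this identity under logit errors: recentering by V_i(0, s) is the standard log-sum-exp stabilization, and it makes explicit that the surplus depends on the identified differences plus the level V_i(0, s).

```python
import numpy as np

def social_surplus(V):
    # V[k] = V_i(k, s) for k = 0, ..., K at a fixed s.
    # log sum_k exp(V_k) = log sum_k exp(V_k - V_0) + V_0
    d = V - V[0]                           # identified differences
    return np.log(np.exp(d).sum()) + V[0]  # numerically stable surplus
```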

• Relationship between Π_i(a_i, s) and V_i(a_i, s):

V_i(a_i, s) = Π_i(a_i, s) + β E[ G( V_i(a_i, s′), ∀ a_i = 0, . . . , K ) | s, a_i ]
            = Π_i(a_i, s) + β E[ G( V_i(k, s′) − V_i(0, s′), ∀ k = 1, . . . , K ) | s, a_i ] + β E[ V_i(0, s′) | s, a_i ]

• With extreme value distributed error terms:

V_i(a_i, s) = Π_i(a_i, s) + β E[ log Σ_{k=0}^{K} exp( V_i(k, s′) − V_i(0, s′) ) | s, a_i ] + β E[ V_i(0, s′) | s, a_i ]

• Hotz and Miller (1993): one to one mapping between σ_i(a_i | s) and differences in choice specific value functions:

( V_i(1, s) − V_i(0, s), . . . , V_i(K, s) − V_i(0, s) ) = Ω_i( σ_i(0 | s), . . . , σ_i(K | s) )

• Example: i.i.d. extreme value f(ε_i):

σ_i(a_i | s) = exp( V_i(a_i, s) − V_i(0, s) ) / Σ_{k=0}^{K} exp( V_i(k, s) − V_i(0, s) )

• Inverse mapping:

log σ_i(k | s) − log σ_i(0 | s) = V_i(k, s) − V_i(0, s)

• Since we can recover V_i(k, s) − V_i(0, s), we only need to know V_i(0, s) to recover all V_i(k, s).

• If we know V_i(0, s), the mapping between V_i(a_i, s) and Π_i(a_i, s) is one to one.

• Identify V_i(0, s) first. Set a_i = 0:

V_i(0, s) = Π_i(0, s) + β E[ log Σ_{k=0}^{K} exp( V_i(k, s′) − V_i(0, s′) ) | s, 0 ] + β E[ V_i(0, s′) | s, 0 ]

• This is a contraction mapping with a unique fixed point, computable by iteration.

• Add V_i(0, s) to V_i(k, s) − V_i(0, s) to identify all V_i(k, s).

• Then all Π_i(k, s) are calculated from V_i(k, s) through

Π_i(k, s) = V_i(k, s) − β E[ V_i(s′) | s, k ].

• Why normalize Πi (0, s) = 0?

• Why not Vi (0, s) = 0?

• If a firm stays out of the market in period t, its current profit is 0, but the option value of future entry might depend on market size, the number of other firms, etc.

• These state variables might evolve stochastically.

• Rest of the identification arguments: identical to the static model.

• Nonparametric and Semiparametric Estimation

• Hotz-Miller inversion recovers V_i(k, s) − V_i(0, s) instead of Π_i(k, s) − Π_i(0, s).

• Nonparametrically compute V_i(0, s) using

V_i(0, s) = β E[ log Σ_{k=0}^{K} exp( V_i(k, s′) − V_i(0, s′) ) | s, 0 ] + β E[ V_i(0, s′) | s, 0 ]

• Obtain V_i(k, s) and forward compute Π_i(k, s), as in the sketch below.

• The rest is identical to the static model.
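A minimal sketch of this two step recipe, assuming logit errors and the normalization Π_i(0, s) = 0 on a finite state space; `sigma` and `F` are hypothetical estimated objects.

```python
import numpy as np

# Hypothetical estimated inputs:
#   sigma[s, k] : conditional choice probabilities sigma_i(k | s), k = 0, ..., K
#   F[k, s, s'] : transition probabilities g(s' | s, a_i = k)
def recover_payoffs(sigma, F, beta, tol=1e-12):
    # Hotz-Miller inversion: V_i(k, s) - V_i(0, s) = log sigma(k|s) - log sigma(0|s)
    dV = np.log(sigma) - np.log(sigma[:, [0]])
    # log sum_k exp(V_i(k, s) - V_i(0, s)), entering the fixed point below
    lse = np.log(np.exp(dV).sum(axis=1))
    # Contraction for V_i(0, s) under Pi_i(0, s) = 0:
    #   V_i(0, s) = beta E[lse(s') + V_i(0, s') | s, 0]
    V0 = np.zeros(sigma.shape[0])
    while True:
        V0_new = beta * F[0] @ (lse + V0)
        if np.max(np.abs(V0_new - V0)) < tol:
            break
        V0 = V0_new
    V = dV + V0[:, None]     # all choice specific values V_i(k, s)
    EV = F @ (lse + V0)      # E[V_i(s') | s, k], one row per choice k
    Pi = V - beta * EV.T     # Pi_i(k, s) = V_i(k, s) - beta E[V_i(s') | s, k]
    return Pi, V
```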

• In semiparametric models, the estimator of θ converges at a T^{1/2} rate and has normal asymptotics.

• Apply the results of Newey (1994) to derive appropriate “influence functions”.

• The asymptotic distribution is invariant to the choice of method used to estimate the first stage.

• With a proper weighting function (which needs to be estimated nonparametrically), one can achieve the same efficiency as full information maximum likelihood.

• These results hold for both static and dynamic models.

Discrete Games

• Dynamic and static discrete games.

• Private information assumption.

• No unobserved state variables.

• No distinction for identification purposes.

• Therefore it suffices to study static games.

• Results translate immediately into dynamic results.

Notations

• Players, i = 1, ..., n.

• Actions a_i ∈ {0, 1, . . . , K}.

• A = {0, 1, . . . , K}^n and a = (a_1, . . . , a_n).

• s_i ∈ S_i: state for player i.

• S = Π_i S_i and s = (s_1, . . . , s_n) ∈ S.

• s is common knowledge and also observed by the econometrician.

• For each agent i, K + 1 state variables ε_i(a_i).

• ε_i(a_i): private information to each agent.

• ε_i = (ε_i(0), . . . , ε_i(K)).

• Density f(ε_i), i.i.d. across i = 1, . . . , n.

• Period utility for player i with action profile a:

u_i(a, s, ε_i; θ) = Π_i(a_i, a_−i, s; θ) + ε_i(a_i)

• Example: the period profit of firm i for entering the market.

• Generalizes a standard discrete choice model.

• Agents act in isolation in standard discrete choice models.

• Unlike a standard discrete choice model, a−i enters utility.

• Player i ’s decision rule is a function ai = δi (s, εi ).

• Note that ε−i does not enter.

• ε−i is private information of other players.

• Conditional choice probability σ_i(a_i | s) for player i:

σ_i(a_i = k | s) = ∫ 1{ δ_i(s, ε_i) = k } f(ε_i) dε_i.

• The choice probability is conditional on s: public information.

• Choice specific expected payoff for player i:

Π_i(a_i, s; θ) = Σ_{a_−i} Π_i(a_i, a_−i, s; θ) σ_−i(a_−i | s).

• Expected utility from choosing a_i, excluding the preference shock.

• The optimal action for player i satisfies:

σ_i(a_i | s) = Prob{ ε_i : Π_i(a_i, s; θ) + ε_i(a_i) > Π_i(a_j, s; θ) + ε_i(a_j) for all a_j ≠ a_i }.

• Π_i(a_i, a_−i, s; θ) is often a linear function, e.g.:

Π_i(a_i, a_−i, s) = s′β + δ Σ_{j ≠ i} 1{ a_j = 1 }  if a_i = 1, and 0 if a_i = 0.

• Mean utility from not entering normalized to zero.

• δ measures the influence of j ’s entry choice on i ’s profit.

• If firms compete with each other: δ < 0.

• β measures the impact of the state variables on profits.

• εi (ai ) capture shocks to the profitability of entry.

• Often the ε_i(a_i) are assumed to be i.i.d. extreme value distributed:

f( ε_i(k) ) = e^{−ε_i(k)} e^{−e^{−ε_i(k)}}.

Nonparametric Identification

A1 Assume that the error terms ε_i(a_i) are distributed i.i.d. across actions a_i and agents i, and come from a known parametric family.

• It is not possible to allow nonparametric mean utility and error terms at once, even in simple single agent problems (e.g. a probit).

• In Bajari, Hong and Ryan (2005), even a single agent model is not identified without an independence assumption.

• Well known that Π_i(0, s) are not identified.

• σ_i(a_i | s) are only functions of Π_i(a_i, s) − Π_i(0, s).

• Suppose ε_i(a_i) is extreme value; then

σ_i(a_i | s) = exp( Π_i(a_i, s) − Π_i(0, s) ) / Σ_{k=0}^{K} exp( Π_i(k, s) − Π_i(0, s) )

A2 For all i and all a_−i and s, Π_i(a_i = 0, a_−i, s) = 0.

• We can only learn choice specific value functions up to a first difference, so a normalization is needed.

• Similar to “outside good” assumption in single agent model.

• Entry: the utility from not entering is normalized to zero.

• Hotz and Miller (1993) inversion, for any k, k′:

log σ_i(k | s) − log σ_i(k′ | s) = Π_i(k, s) − Π_i(k′, s).

• More generally, let Γ_i : {0, . . . , K} × S → [0, 1]:

( σ_i(0 | s), . . . , σ_i(K | s) ) = Γ_i( Π_i(1, s) − Π_i(0, s), . . . , Π_i(K, s) − Π_i(0, s) )

• And the inverse Γ_i^{−1}:

( Π_i(1, s) − Π_i(0, s), . . . , Π_i(K, s) − Π_i(0, s) ) = Γ_i^{−1}( σ_i(0 | s), . . . , σ_i(K | s) )

• Invert equilibrium choice probabilities to nonparametrically recover Π_i(1, s) − Π_i(0, s), . . . , Π_i(K, s) − Π_i(0, s).

• Π_i(a_i, s) is known from our inversion, and the probabilities σ_i can be observed by the econometrician.

• Next step: how to recover Π_i(a_i, a_−i, s) from Π_i(a_i, s).

• Requires inversion of the following system:

Π_i(a_i, s) = Σ_{a_−i} σ_−i(a_−i | s) Π_i(a_i, a_−i, s), ∀ i = 1, . . . , n, a_i = 1, . . . , K.

• Given s, there are n × K × (K + 1)^{n−1} unknown utilities of all agents.

• Only n × K known expected utilities.

• Obvious solution: impose exclusion restrictions.

• Partition s = (s_i, s_−i), and suppose

Π_i(a_i, a_−i, s) = Π_i(a_i, a_−i, s_i)

depends only on the subvector s_i. Then

Π_i(a_i, s_−i, s_i) = Σ_{a_−i} σ_−i(a_−i | s_−i, s_i) Π_i(a_i, a_−i, s_i).

• Identification: given each s_i, the second moment matrix of the “regressors” σ_−i(a_−i | s_−i, s_i),

E[ σ_−i(a_−i | s_−i, s_i) σ_−i(a_−i | s_−i, s_i)′ ],

is nonsingular.

• Needs at least (K + 1)^{n−1} points in the support of the conditional distribution of s_−i given s_i; the sketch below casts this step as a least squares problem.
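A minimal sketch of this step: for a fixed (i, a_i, s_i), the inverted expected payoffs at different support points of s_−i are linear in the unknown structural payoffs, so least squares recovers them. `Pi_bar` and `Sigma` are hypothetical objects built from the inversion and the estimated choice probabilities.

```python
import numpy as np

#   Pi_bar[m]   : inverted expected payoff Pi_i(a_i, s_{-i,m}, s_i)
#   Sigma[m, j] : sigma_{-i}(a_{-i}^j | s_{-i,m}, s_i), one column per
#                 opponent action profile a_{-i}^j, j = 1, ..., (K+1)^(n-1)
def recover_structural_payoffs(Pi_bar, Sigma):
    # Identified iff the second moment matrix Sigma'Sigma is nonsingular,
    # which needs at least (K+1)^(n-1) support points for s_{-i}.
    coef, *_ = np.linalg.lstsq(Sigma, Pi_bar, rcond=None)
    return coef  # Pi_i(a_i, a_{-i}^j, s_i) for each opponent profile j
```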

• Nonparametric estimation

• Semiparametric estimation

• Linear probability model

• fixed effect panel data

• Multiple equilibria computation.

• Unobserved heterogeneity

• Consider single agent dynamic model first.

• Random coefficient model:

Dan Ackerberg, “A New Use of Importance Sampling to Reduce Computational Burden in Simulation Estimation,” working paper, 2001.

• Bayesian methods:

Imai et al., “Bayesian Estimation of Dynamic Discrete Choice Models,” working paper, 2005.

Andriy Norets, “Dynamic Discrete Choice Models with Serially Correlated Unobservables,” working paper, 2006.

Ackerberg 2001

• Static utility: x_i′ u + ε_i, with ε_i extreme value.

• Forward looking agent with discounting

• Random coefficient: u ∼ g (xi , θ).

• Moments to match:

(1/n) Σ_{i=1}^{n} [ 1(y_i = a) − P(y_i = a | x_i, θ) ] t(x_i).

• Conditional choice probabilities:

P(y_i = a | x_i, θ) = ∫ [ e^{V(a, x_i, u)} / Σ_{a′ ∈ A} e^{V(a′, x_i, u)} ] g(u | x_i, θ) du

= ∫ [ e^{V(a, x_i, u)} / Σ_{a′ ∈ A} e^{V(a′, x_i, u)} ] [ g(u | x_i, θ) / q(u | x_i) ] q(u | x_i) du

• Using S draws u_{is}, s = 1, . . . , S, from q(u | x_i), the conditional choice probability can be simulated as:

P(y_i = a | x_i, θ) = (1/S) Σ_{s=1}^{S} [ e^{V(a, x_i, u_{is})} / Σ_{a′ ∈ A} e^{V(a′, x_i, u_{is})} ] [ g(u_{is} | x_i, θ) / q(u_{is} | x_i) ]

• Simulated moment conditions:

(1/n) Σ_{i=1}^{n} [ 1(y_i = a) − P(y_i = a | x_i, θ) ] t(x_i),

which is

(1/n) Σ_{i=1}^{n} [ 1(y_i = a) − (1/S) Σ_{s=1}^{S} ( e^{V(a, x_i, u_{is})} / Σ_{a′ ∈ A} e^{V(a′, x_i, u_{is})} ) ( g(u_{is} | x_i, θ) / q(u_{is} | x_i) ) ] t(x_i).

• Separation of simulation and value function computation from the estimation step.

• Draw u_{is}, i = 1, . . . , n, s = 1, . . . , S, before estimation starts.

• Compute all V(a, x_i, u_{is}) for all i = 1, . . . , n and s = 1, . . . , S beforehand.

• No need to recompute V(a, x_i, u_{is}) when estimating θ.

• θ only reweights the density

g(u_{is} | x_i, θ) / q(u_{is} | x_i)

during the optimization in estimation; see the sketch after this list.

• The same logic applies to simulated MLE and to Bayesian analysis.
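A minimal sketch of the reweighting logic, with hypothetical `g` and `q` density functions supplied by the user: the draws and value functions are computed once, and only the importance weights are recomputed as θ changes.

```python
import numpy as np

def simulated_ccp(V, u_draws, x, theta, g, q):
    # V[a, i, s] = V(a, x_i, u_is): precomputed once, never recomputed in theta.
    # u_draws[i, s] = u_is ~ q(u | x_i): drawn once before estimation.
    w = g(u_draws, x, theta) / q(u_draws, x)   # cheap reweighting in theta
    eV = np.exp(V - V.max(axis=0))             # numerically stable logit shares
    ccp = eV / eV.sum(axis=0)                  # shape (A, n, S)
    return (ccp * w).mean(axis=2)              # (1/S) sum_s share * weight

# Inside the optimizer, only w changes with theta; V and u_draws are reused.
```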

Norets 2006

• Bayesian method: serially correlated unobserved state variables.

• s_t = (y_t, x_t); x_t observed, y_t not observed.

• Use the Gibbs sampler:

• Given θ, d_t, x_t, draw y_t.

• Given d_t, x_t, y_t, draw θ.

• Joint likelihood:

π(θ) p( y_{T,i}, x_{T,i}, d_{T,i}, . . . , y_{1,i}, x_{1,i}, d_{1,i} | θ )
= π(θ) Π_{t=1}^{T} p( d_{t,i} | y_{t,i}, x_{t,i}, θ ) f( x_{t,i}, y_{t,i} | x_{t−1,i}, y_{t−1,i}, d_{t−1,i}; θ ).

• The conditional choice probability can be an indicator:

p( d_{i,t} | y_{i,t}, x_{t,i}, θ ) = 1( V( y_{t,i}, x_{t,i}, d_{t,i}; θ ) ≥ V( y_{t,i}, x_{t,i}, d; θ ), ∀ d ∈ D ).

• Break the unobservables into serially independent and serially dependent components: y_t = (ν_t, ε_t),

f( x_{t+1}, ν_{t+1}, ε_{t+1} | x_t, ν_t, ε_t, d; θ ) = p( ν_{t+1} | x_{t+1}, ε_{t+1}; θ ) p( x_{t+1}, ε_{t+1} | x_t, ε_t, d; θ ).

• The joint likelihood becomes

π(θ) Π_{i,t} p( d_{t,i} | V_{t,i} ) p( V_{i,t} | x_{i,t}, ε_{i,t}; θ ) p( x_{t,i}, ε_{t,i} | x_{t−1,i}, ε_{t−1,i}, d_{t−1,i}; θ )

• V_{i,t} can be drawn “analytically” conditional on (x_{i,t}, ε_{i,t}; θ), subject to the constraints specified by the p(d_{t,i} | V_{t,i}) indicators.

• θ and ε_{i,t} are drawn using Metropolis-Hastings steps.

• “Analytic” drawing of V_{t,i} = V_{t,d,i} = V(s_{t,i}, d; θ), d ∈ D, where s = (x, ε, ν), requires value function updating.

• For example:

u( s_{t,i}, d; θ ) = u( x_{t,i}, d; θ ) + ν_{t,d,i} + ε_{t,d,i}.

• Then

V( s_{t,i}, d; θ ) = u( x_{t,i}, d; θ ) + ν_{t,d,i} + ε_{t,d,i} + β E[ V( s_{t+1}; θ ) | ε_{t,i}, x_{t,i}, d; θ ].

• At every step θ^m, the expected value function

E[ V( s_{t+1}; θ^m ) | ε_{t,i}^m, x_{t,i}^m, d; θ^m ]

is updated by averaging over the near history of θ points on the MCMC chain and over the importance sampling draws of the ε's.

• How to update the approximate value function V^m( s^{m,j}; θ^m ):

V^m( s; θ^m ) = max_{d ∈ D} { u( s, d; θ^m ) + β E^{(m)}[ V( s′; θ^m ) | s, d; θ^m ] }.

• At each iteration m, draw random states s^{m,j}, j = 1, . . . , N(m), from an i.i.d. density g(·) > 0.

• At each iteration m, only keep track of the history of length N(m):

{ θ^k ; s^{k,j}, V^k( s^{k,j}; θ^k ), j = 1, . . . , N(k) }_{k = m−N(m)}^{m−1}.

• In this history, find the i = 1, . . . , N(m) parameter draws θ^{k_i} closest to θ.

• Only the value functions in the importance sampling that correspond to these nearest neighbors are used in the approximation by averaging.

• Update the value function as (see the sketch below):

E^{(m)}[ V( s′; θ ) | s, d; θ ]
= Σ_{i=1}^{N(m)} Σ_{j=1}^{N(k_i)} V^{k_i}( s^{k_i,j}; θ^{k_i} ) [ f( s^{k_i,j} | s, d; θ ) / g( s^{k_i,j} ) ] / [ Σ_{r=1}^{N(m)} Σ_{q=1}^{N(k_r)} f( s^{k_r,q} | s, d; θ ) / g( s^{k_r,q} ) ]
= Σ_{i=1}^{N(m)} Σ_{j=1}^{N(k_i)} V^{k_i}( s^{k_i,j}; θ^{k_i} ) W_{k_i,j,m}( s, d, θ )
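A minimal sketch of this self-normalized average, assuming the stored states and value function evaluations have already been restricted to the nearest-neighbor parameter draws; `f` and `g` are the (hypothetical) transition and proposal densities.

```python
import numpy as np

def expected_value(S_hist, V_hist, s, d, theta, f, g):
    # S_hist[i, j] : stored random states s^{k_i, j} (nearest neighbors only)
    # V_hist[i, j] : stored evaluations V^{k_i}(s^{k_i, j}; theta^{k_i})
    w = f(S_hist, s, d, theta) / g(S_hist)   # importance weights f/g
    return (V_hist * w).sum() / w.sum()      # weighted average of stored values
```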

• The weights simplify with i.i.d. known unobservable components.

• The expected max value function can be integrated out with extreme value errors.

Particle filtering

• Used in dynamic macro models by Fernández-Villaverde and Rubio-Ramírez.

• Particle filtering is an importance sampling method.

• x is the latent state; y is the observable random variable.

• We are interested in the posterior distribution of x given y .

• Want to compute E( t(X) | y ):

∫ t(x) p(x | y) dx = ∫ t(x) [ p(y | x) p(x) / p(y) ] dx

= [ ∫ t(x) ( p(y | x) p(x) / g(x) ) g(x) dx ] / [ ∫ ( p(y | x) p(x) / g(x) ) g(x) dx ]

• If we have r = 1, . . . , R draws from the density g(x), then we can approximate

E( t(X) | y ) ≈ [ (1/R) Σ_{r=1}^{R} t(x_r) w(x_r | y) ] / [ (1/R) Σ_{r=1}^{R} w(x_r | y) ],

where

w(x_r | y) = p(y | x_r) p(x_r) / g(x_r).

• Or one can write

E( t(X) | y ) ≈ Σ_{r=1}^{R} t(x_r) w̄(x_r | y)

where

w̄(x_r | y) = w(x_r | y) / Σ_{r=1}^{R} w(x_r | y).

• In other words, given R draws from g(x), t(X) is integrated against a discrete distribution that places weight w̄(x_r | y) on each of the r = 1, . . . , R points of the discrete support.

• Alternatively, one can compute

E( t(X) | y ) ≈ (1/R) Σ_{r=1}^{R} t( x̃_r )

where x̃_r, r = 1, . . . , R, are R draws from the weighted empirical distribution on the x_r, r = 1, . . . , R, with weights w̄(x_r | y).

• When g(x) = p(x), w(x | y) = p(y | x), and

w̄(x_r | y) = p(y | x_r) / Σ_{r=1}^{R} p(y | x_r).

A short sketch of this self-normalized scheme follows.
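A minimal sketch of the self-normalized importance sampling estimator above, with hypothetical density functions as inputs.

```python
import numpy as np

def posterior_mean(t, p_y_given_x, p_x, g_density, draws):
    # draws: x_r, r = 1, ..., R, sampled from g
    w = p_y_given_x(draws) * p_x(draws) / g_density(draws)  # w(x_r | y)
    w_bar = w / w.sum()                                     # normalized weights
    return (t(draws) * w_bar).sum()                         # E[t(X) | y]
```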

• In a Bayesian setup, the coefficients θ are the latent state variables x; p(x) is then the prior distribution of θ.

• A random coefficient model is just like a hierarchical Bayesian model where the µ are the hyper-parameters in the prior distribution, and the hyperparameters are to be estimated. The prior for θ can be made dependent on both µ and covariates (state variables) s.

• Computing the weights p(y | θ) involves value function iteration and is computationally difficult.

• If we use a GMM objective function instead of the likelihood to form p(y | θ), there might be some computational savings because only one simulation r is needed for each observation.

• A more interesting case is when there are dynamic unobservable state variables.

• Suppose s = (x, v), such that x is observed but v is not observed.

• We might be interested in the posterior distribution of the entire sample path of v, in addition to those of θ, the “parameters”.

• Particle filtering can be used in this case.

• How can computation be maximally sped up?

• Do particles lend themselves to parallel processing?

p( v_{0:t} | y_{1:t}, x_{0:t} ) = p( v_{0:t}, y_{1:t}, x_{0:t} ) / p( y_{1:t}, x_{0:t} )

= [ p( v_{0:t−1}, y_{1:t−1}, x_{0:t−1} ) p( v_t, y_t, x_t | v_{0:t−1}, y_{1:t−1}, x_{0:t−1} ) ] / [ p( y_{1:t−1}, x_{0:t−1} ) p( y_t, x_t | y_{1:t−1}, x_{0:t−1} ) ]

= p( v_{0:t−1} | y_{1:t−1}, x_{0:t−1} ) p( v_t, y_t, x_t | v_{0:t−1}, y_{1:t−1}, x_{0:t−1} ) / p( y_t, x_t | y_{1:t−1}, x_{0:t−1} )

= p( v_{0:t−1} | y_{1:t−1}, x_{0:t−1} ) p( y_t | x_t, v_t ) p( v_t, x_t | v_{0:t−1}, y_{1:t−1}, x_{0:t−1} ) / p( y_t, x_t | y_{1:t−1}, x_{0:t−1} )

= p( v_{0:t−1} | y_{1:t−1}, x_{0:t−1} ) p( y_t | x_t, v_t ) p( v_t, x_t | v_{t−1}, y_{t−1}, x_{t−1} ) / p( y_t, x_t | y_{1:t−1}, x_{0:t−1} )

The fourth equality follows from the conditional independence assumption and the Markovian structure, and the fifth equality follows from the Markovian structure of the state variable transitions.

• Therefore we only need to make the following small modifications to the filtering algorithm:

• First, for each particle in period t − 1, with its associated value of v_{t−1}, simulate from

p( v_t | x_t, v_{t−1}, y_{t−1}, x_{t−1} ) = p( v_t, x_t | v_{t−1}, y_{t−1}, x_{t−1} ) / p( x_t | v_{t−1}, y_{t−1}, x_{t−1} ).

• If v_t and x_t are conditionally independent given v_{t−1}, y_{t−1}, x_{t−1}, then we can directly simulate from

p( v_t | v_{t−1}, y_{t−1}, x_{t−1} ).

• Then reweight by p( y_t | x_t, v_t ).

• The weights are not easy to compute because they involve either value function iteration or backward recursion.

• In summary, the three components of the recursive particle filtering method (a one-step sketch follows this list):

p( v_{0:t} | x_{1:t}, y_{1:t} ) ∝ p( v_{0:t−1} | y_{1:t−1}, x_{0:t−1} ) p( v_t | x_t, v_{t−1}, y_{t−1}, x_{t−1} ) p( y_t | x_t, v_t )

• The first part, p( v_{0:t−1} | y_{1:t−1}, x_{0:t−1} ), defines the recursion.

• The second part, p( v_t | x_t, v_{t−1}, y_{t−1}, x_{t−1} ), defines the importance sampling density.

• The third part, p( y_t | x_t, v_t ), defines the weights on the particles.
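A minimal one-step sketch of this filter, with hypothetical `propose` (the importance sampling density) and `like` (the measurement weight p(y_t | x_t, v_t)) supplied by the model; resampling implements the recursion with an equally weighted particle set.

```python
import numpy as np

def particle_step(v_prev, x_t, y_t, x_prev, y_prev, propose, like, rng):
    # 1. Propagate: draw v_t from p(v_t | x_t, v_{t-1}, y_{t-1}, x_{t-1}).
    v_t = propose(v_prev, x_t, y_prev, x_prev)
    # 2. Reweight by the measurement density p(y_t | x_t, v_t).
    w = like(y_t, x_t, v_t)
    w = w / w.sum()
    # 3. Resample to equal weights; the resampled swarm starts the recursion.
    idx = rng.choice(len(v_t), size=len(v_t), p=w)
    return v_t[idx]
```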

• Relation to the macro model of Jesús Fernández-Villaverde and Juan Rubio-Ramírez.

• Their paper: “Estimating Macroeconomic Models: A Likelihood Approach,” forthcoming, Review of Economic Studies.

• Basically, very similar.

• They also allow feedback from y_t to the latent state variables x_{t+1}, v_{t+1}:

p( v_t, x_t | v_{0:t−1}, y_{1:t−1}, x_{0:t−1} ) = p( v_t, x_t | v_{t−1}, y_{t−1}, x_{t−1} ),

unlike stochastic volatility models, where the transitions of v_t, x_t are autonomous.

• They introduce this dependence by allowing for singularity in the measurement equation:

Y_t = g( S_t, V_t; γ )

• The states in their transition equation are not necessarily all latent:

S_t = f( S_{t−1}, W_t; γ )

• Feedback from Y_t to latent states is allowed when Y_t is a subcomponent of S_t and is directly observed, so that that particular component of g(·) has no noise V_t.

• In their Assumption 2, they assume that both Y and V are continuously distributed and that g(·) (and f(·) as well) is invertible.

• Then the conditional density of Y_t (which is basically used to calculate the weights) can be determined from the density of V_t and the Jacobian of the transformation.

• They do mention, though, in one brief sentence, that this assumption can possibly be relaxed as long as the weights can be computed.

• Our discrete Y_t model falls into this extension, where there is no invertibility.

• The weights can still be computed through the logit probability form and the value function iterations (or backward recursion).

• Calculating value functions by backward induction (sketched below):

• Assume a binary choice model.

• At time T:

V_T(s) = log[ exp( Π(s, 1) ) + exp( Π(s, 0) ) ]

• Suppose V_{t+1}(s) is known; then at time t:

V_t(s, 1) = Π(s, 1) + β E[ V_{t+1}(s′) | s, 1 ]

V_t(s, 0) = Π(s, 0) + β E[ V_{t+1}(s′) | s, 0 ]

V_t(s) = log[ exp( V_t(s, 1) ) + exp( V_t(s, 0) ) ].
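A minimal sketch of this backward recursion on a finite state space, with hypothetical flow payoffs `Pi[s, a]` and transitions `F[a, s, s']`.

```python
import numpy as np

def backward_values(Pi, F, beta, T):
    # Terminal period: V_T(s) = log[exp(Pi(s, 1)) + exp(Pi(s, 0))]
    V = np.log(np.exp(Pi[:, 1]) + np.exp(Pi[:, 0]))
    for t in range(T - 1, 0, -1):
        # Choice specific values V_t(s, a) = Pi(s, a) + beta E[V_{t+1}(s') | s, a]
        V1 = Pi[:, 1] + beta * F[1] @ V
        V0 = Pi[:, 0] + beta * F[0] @ V
        # Ex ante value with extreme value errors
        V = np.log(np.exp(V1) + np.exp(V0))
    return V  # ex ante value function at t = 1
```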