Lecture IV

Value Function Iteration with Discretization

Gianluca Violante

New York University

Quantitative Macroeconomics


Stochastic growth model with leisure

• Consider the planner’s problem:

V(k, z) = max_{c, k′, h}  u(c) + v(1 − h) + β ∑_{z′∈Z} π(z, z′) V(k′, z′)

s.t.

c + k′ = e^z f(k, h) + (1 − δ) k

c ≥ 0

k′ ≥ 0

h ∈ [0, 1]

• We want to solve for the policy functions g_c(z, k), g_k(z, k), g_h(z, k)

• We do it by VFI + discretization of the state space (Z,K)

• V is a function defined only on grid points (a two-dimensional array of size Nk × Nz)


Setting up the grids

• Grid of Nz points z_1, ..., z_{Nz} for the productivity shock (e.g., Tauchen's method)

• Evenly spaced grid of Nk points for capital:

k_i = k_1 + (i − 1) η ,  for i = 1, ..., Nk,  where η = (k_{Nk} − k_1) / (Nk − 1)

but better: more points where decision rules have curvature

• How do we choose the bounds k_1, k_{Nk}? Solve the steady-state (SS) system:

f_k(k*, h*) = 1/β − 1 + δ

c* + δ k* = f(k*, h*)

u_c(c*) f_h(k*, h*) = −v_h(1 − h*)

• k_1, k_{Nk} must be below and above k*. The grid size depends on the size and persistence of z. Keep k_1 > 0.
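A minimal Python/NumPy sketch of the grid construction, assuming a Cobb–Douglas f(k, h) = k^α h^{1−α}; the parameter values and the steady-state hours below are illustrative assumptions, not from the slides:

```python
import numpy as np

# Illustrative calibration (not from the slides -- use your own)
alpha, beta, delta = 0.36, 0.96, 0.08

# With f(k,h) = k^alpha h^(1-alpha), the Euler equation
# f_k(k*, h*) = 1/beta - 1 + delta pins down the SS capital/labor ratio
h_star = 1.0 / 3.0                                   # assumed SS hours
k_star = h_star * (alpha / (1.0 / beta - 1.0 + delta)) ** (1.0 / (1.0 - alpha))

# Evenly spaced grid k_i = k_1 + (i-1)*eta, bracketing k* and keeping k_1 > 0
Nk = 200
k1, kNk = 0.5 * k_star, 1.5 * k_star                 # bounds below/above k*
k_grid = np.linspace(k1, kNk, Nk)                    # eta = (kNk - k1)/(Nk - 1)

# Alternative: cluster points near the lower bound, where decision rules
# typically have more curvature
k_grid_curved = k1 + (kNk - k1) * np.linspace(0.0, 1.0, Nk) ** 1.5
```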


Dealing with choice of leisure

• At every (k_i, k_j, z_s) corresponding to a grid point for the variables (k, k′, z) we have the intratemporal FOC:

u_c(e^{z_s} f(k_i, h) + (1 − δ) k_i − k_j) · e^{z_s} f_h(k_i, h) = −v_h(1 − h)

• Note that the LHS is decreasing in h and the RHS is increasing in h

• h = 0 is ruled out by the Inada condition on f and h = 1 by the Inada condition on v. The solution is interior, in (0, 1).

• Use a bisection method to solve for h (a sketch follows below)

• Call the solution g_h(k_i, k_j, z_s); it is not yet a policy function because it depends on k′
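A sketch of the bisection step in Python, under assumed functional forms (log consumption utility, v(1 − h) = ψ log(1 − h), Cobb–Douglas production); the parameter values are illustrative:

```python
import numpy as np

# Assumed forms: u(c) = log(c), v(1-h) = psi*log(1-h), f(k,h) = k^alpha*h^(1-alpha)
alpha, delta, psi = 0.36, 0.08, 2.0

def foc_residual(h, k, kprime, z):
    """u'(c)*e^z*f_h(k,h) - psi/(1-h); positive when h is too low."""
    c = np.exp(z) * k**alpha * h**(1 - alpha) + (1 - delta) * k - kprime
    if c <= 0.0:
        return np.inf            # infeasible c: treat marginal utility as infinite
    fh = (1 - alpha) * np.exp(z) * k**alpha * h**(-alpha)
    return fh / c - psi / (1 - h)

def solve_hours(k, kprime, z, tol=1e-10, maxit=200):
    """Bisection on h in (0,1): the residual goes from +inf (h -> 0) to -inf (h -> 1).

    Combinations (k, k', z) that are infeasible for every h are caught later by
    the feasibility check on the return array.
    """
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(maxit):
        mid = 0.5 * (lo + hi)
        if foc_residual(mid, k, kprime, z) > 0.0:
            lo = mid             # working more still has positive net marginal value
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Example: g_h(k_i, k_j, z_s) at one grid point
print(solve_hours(k=3.0, kprime=3.0, z=0.0))
```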


Algorithm

1. Define a three-dimensional array R of size (Nk, Nk, Nz), with typical element (k_i, k_j, z_s), containing the return function

R(k_i, k_j, z_s) = u(e^{z_s} f(k_i, g_h(k_i, k_j, z_s)) + (1 − δ) k_i − k_j) + v(1 − g_h(k_i, k_j, z_s))

Check whether the argument of u at point (i, j, s) is negative: if so, set R(k_i, k_j, z_s) to a very large negative number.

2. Start with a guess of the (Nk, Nz) matrix V^0, say the null array, so that the first iteration is a static problem. Or use

V^0(k_i, z_s) = [u(e^{z_s} f(k_i, h*) − δ k_i) + v(1 − h*)] / (1 − β)

Denote the iteration number by t

3. You enter each iteration t with an (Nk, Nz) matrix V^t with generic element V^t(k_j, z_s)


Algorithm

4. Compute the (Nk, Nz) matrix V̄^t that represents the conditional expected value, with generic element

V̄^t(k_j, z_s) = ∑_{s′=1}^{Nz} π(z_s, z_{s′}) V^t(k_j, z_{s′})

5. Update the value function by selecting

V^{t+1}(k_i, z_s) = max_j { R(k_i, k_j, z_s) + β V̄^t(k_j, z_s) }

Careful here! With the max operator you are squeezing a 3-dim array into a 2-dim array. In Matlab, use the command squeeze.

Exploit monotonicity of the policy function k_j(k_i, z_s) when searching for the max over successive grid points!


Algorithm

6. Store the argmax, i.e., the decision rules:

g_k^{t+1}(k_i, z_s) = argmax_j { R(k_i, k_j, z_s) + β V̄^t(k_j, z_s) }

g_h^{t+1}(k_i, z_s) = g_h(k_i, g_k^{t+1}(k_i, z_s), z_s)

g_c^{t+1}(k_i, z_s) = e^{z_s} f(k_i, g_h^{t+1}(k_i, z_s)) + (1 − δ) k_i − g_k^{t+1}(k_i, z_s)

7. Howard improvement step here (see later)

8. Check convergence. If ‖V^{t+1} − V^t‖ < ε, stop and report success; otherwise go back to step 3 and update V^t with V^{t+1}
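A compact Python/NumPy sketch of steps 1–8. To keep it short, labor is fixed (h = 1, so the g_h step is skipped), and the calibration and the two-state Markov chain for z are illustrative assumptions:

```python
import numpy as np

# Illustrative primitives: log utility, Cobb-Douglas output, inelastic labor
alpha, beta, delta = 0.36, 0.96, 0.08
z_grid = np.array([-0.03, 0.03])                     # Nz = 2 productivity states
Pi = np.array([[0.9, 0.1],                           # pi(z_s, z_s')
               [0.1, 0.9]])
Nz = z_grid.size

k_star = (alpha / (1 / beta - 1 + delta)) ** (1 / (1 - alpha))
Nk = 200
k_grid = np.linspace(0.5 * k_star, 1.5 * k_star, Nk)

# Step 1: return array R(i, j, s) = u(c), with infeasible c penalized
c = (np.exp(z_grid)[None, None, :] * k_grid[:, None, None] ** alpha
     + (1 - delta) * k_grid[:, None, None] - k_grid[None, :, None])
R = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -1e10)   # (Nk, Nk, Nz)

# Step 2: initial guess
V = np.zeros((Nk, Nz))
tol, maxit = 1e-7, 2000

for t in range(maxit):
    # Step 4: conditional expectation EV(j, s) = sum_s' pi(s, s') V(j, s')
    EV = V @ Pi.T                                    # (Nk, Nz)
    # Step 5: max over j collapses the 3-dim object into 2 dims
    # (the analogue of Matlab's squeeze)
    obj = R + beta * EV[None, :, :]                  # (Nk, Nk, Nz)
    V_new = obj.max(axis=1)
    # Step 6: store the argmax, i.e. the capital decision rule (as a grid index)
    g_k = obj.argmax(axis=1)
    # Step 8: convergence check in the sup norm
    if np.max(np.abs(V_new - V)) < tol:
        V = V_new
        break
    V = V_new

# Consumption decision rule implied by g_k
g_c = (np.exp(z_grid)[None, :] * k_grid[:, None] ** alpha
       + (1 - delta) * k_grid[:, None] - k_grid[g_k])
```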


Additional checks

1. Check that the policy function is not constrained by the discrete state space. If g_k is equal to the highest or lowest value of capital in the grid for some i, relax the bounds of the grid for k and redo the value function iteration. This is especially likely at the upper bound.

2. Check that the error tolerance is small enough. If a small reduction in ε results in large changes in the value or policy functions, the tolerance is too high. Reduce ε until the value and policy functions are insensitive to further reductions.

3. Check that the grid is dense enough. If an increase in the number of grid points results in a substantially different solution, the grid might be too sparse. Keep increasing the grid size until the value and policy functions are insensitive to further increases.

Discretization is considerably slower than other methods you will learn (curse of dimensionality), but it is the most robust method
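A small helper for check 1, assuming the index-valued decision rule g_k produced by the VFI sketch above:

```python
import numpy as np

def check_grid_bounds(g_k, Nk):
    """Flag states where the capital decision rule is pinned at the grid bounds."""
    at_lower = np.any(g_k == 0)
    at_upper = np.any(g_k == Nk - 1)
    if at_upper:
        print("g_k hits the upper bound of the grid: raise kNk and re-solve")
    if at_lower:
        print("g_k hits the lower bound of the grid: lower k1 (keep k1 > 0)")
    return not (at_lower or at_upper)
```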


Howard Improvement Step: idea

• Selecting the maximizer in step 5 to compute the policy functions is the most time-consuming step in the value function iteration

• Howard's improvement reduces the number of times we update the policy function relative to the number of times we update the value function

• Idea: on some iterations we simply use the current approximation to the policy function to update the value function, i.e., we do not update the policy function at every iteration

• Updating the value function using the current estimate of the policy function brings the value function closer to the true one, since the policy function tends to converge faster than the value function


Why VFI slower than PFI?

• The convergence rate of VFI (in infinite-horizon problems) is approximately β, and convergence (from any initial condition) is assured if the Bellman operator is a contraction mapping

• Den Haan example: consider the problem

max_{x_1, x_2}  x_1^{1−γ} + x_2^{1−γ}

s.t.  x_1 + x_2 = 2

with solution x_1* = x_2* = 1

• At x_1* = x_2* = 1, MRS* = 1 and V* = 2

• Assume γ = 0.9 and consider a huge deviation: x_1 = 2, x_2 = 0

• M̂RS = ∞ but V̂ = 1.87, so V is comparatively much flatter


Howard Improvement Step: implementation

• Let V_0^{t+1}(k_i, z_s) = V^{t+1}(k_i, z_s) and iterate h = 1, ..., H times:

V_{h+1}^{t+1}(k_i, z_s) = R(k_i, g_k^{t+1}(k_i, z_s), z_s) + β ∑_{s′=1}^{Nz} π(z_s, z_{s′}) V_h^{t+1}(g_k^{t+1}(k_i, z_s), z_{s′})

Note: g_k stays the same at each iteration h

• Need to choose H. Too high an H may result in a value function moving further from the true one, since the policy function is not yet the optimal policy.

1. A good idea is to increase H after each iteration

2. Or, use the HIS only after a few steps of the VFI

3. Or, stop when the improvement ‖V_{h+1} − V_h‖ is small enough
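A sketch of the improvement step, reusing the array conventions of the VFI sketch above (index-valued g_k, return array R, transition matrix Pi):

```python
import numpy as np

def howard_step(V, g_k, R, Pi, beta, H=50):
    """Apply H policy-evaluation sweeps keeping the decision rule g_k fixed.

    V    : (Nk, Nz) current value function
    g_k  : (Nk, Nz) index-valued capital policy from the last maximization
    R    : (Nk, Nk, Nz) return array R(k_i, k_j, z_s)
    Pi   : (Nz, Nz) transition matrix pi(z_s, z_s')
    """
    Nk, Nz = V.shape
    i_idx = np.arange(Nk)[:, None]                 # row index i
    s_idx = np.arange(Nz)[None, :]                 # column index s
    R_pol = R[i_idx, g_k, s_idx]                   # R(k_i, g_k(k_i, z_s), z_s)
    for _ in range(H):
        EV = V @ Pi.T                              # conditional expectation on the grid
        V = R_pol + beta * EV[g_k, s_idx]          # g_k stays fixed in every sweep
    return V
```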


Policy function iteration

1. At iteration t, g_k^t is given. You need an initial guess of the policy: get smart!

2.a Solve the system of Nk × Nz equations, one for each element of V^t:

V^t(k_i, z_s) = R(k_i, g_k^t(k_i, z_s), z_s) + β ∑_{s′=1}^{Nz} π(z_s, z_{s′}) V^t(g_k^t(k_i, z_s), z_{s′})

2.b Or iterate h = 1, ... until convergence of V^t (slower than 2.a but more stable):

V_{h+1}^t(k_i, z_s) = R(k_i, g_k^t(k_i, z_s), z_s) + β ∑_{s′=1}^{Nz} π(z_s, z_{s′}) V_h^t(g_k^t(k_i, z_s), z_{s′})

This is basically an HIS algorithm with H = ∞


Policy function iteration

3. Update the policy function:

g_k^{t+1}(k_i, z_s) = argmax_j { R(k_i, k_j, z_s) + β ∑_{s′=1}^{Nz} π(z_s, z_{s′}) V^t(k_j, z_{s′}) }

4. Go back to step 2 with the new policy
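A sketch of the exact policy-evaluation step 2.a, which flattens the (Nk, Nz) problem into one linear system; the array conventions follow the VFI sketch above:

```python
import numpy as np

def evaluate_policy_exact(g_k, R, Pi, beta):
    """Step 2.a: solve the Nk*Nz linear system V = R_g + beta * T V exactly.

    g_k : (Nk, Nz) index-valued capital policy g_k^t
    R   : (Nk, Nk, Nz) return array
    Pi  : (Nz, Nz) transition matrix
    Returns V as an (Nk, Nz) array.
    """
    Nk, Nz = g_k.shape
    n = Nk * Nz
    i_idx = np.arange(Nk)[:, None]
    s_idx = np.arange(Nz)[None, :]
    r = R[i_idx, g_k, s_idx].reshape(n)            # R(k_i, g_k(k_i, z_s), z_s)

    # T[(i,s), (j,s')] = pi(z_s, z_s') if j = g_k(i, s), else 0
    T = np.zeros((n, n))
    for i in range(Nk):
        for s in range(Nz):
            j = g_k[i, s]
            T[i * Nz + s, j * Nz: j * Nz + Nz] = Pi[s, :]

    v = np.linalg.solve(np.eye(n) - beta * T, r)   # (I - beta*T) V = R_g
    return v.reshape(Nk, Nz)
```

For large Nk × Nz the dense (Nk·Nz) × (Nk·Nz) solve becomes memory-intensive, in which case a sparse representation of T, or the iterative option 2.b, is the practical route.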


Lecture V

Linear-Quadratic Approximations

Gianluca Violante

New York University

Quantitative Macroeconomics


LQ approximation methods

• The LQ method locally approximates the period utility function around the steady state with a quadratic function

• If u is (approximated as) quadratic, the value function is quadratic too, and the decision rules are linear in the state variables

• LQ approximation and the method of linearizing the FOCs around the steady state are equivalent. In both methods the optimal decision rules are linear in the state variables

• Since both methods rely on a local approximation, both are valid only locally around the steady state of the model

• The method is easy to implement where the welfare theorems hold, so it can solve the Social Planner's problem. For solving the equilibrium of a wider class of economies with distortions, linearization of the FOCs is simpler.


Stochastic growth model with leisure

V(z, k) = max_{c, k′, h}  u(c, 1 − h) + β ∑_{z′∈Z} π(z, z′) V(z′, k′)

where:

u(c, 1 − h) = [c^θ (1 − h)^{1−θ}]^{1−γ} / (1 − γ)

y = e^z k^α h^{1−α}

y = c + k′ − (1 − δ) k

z follows an AR(1) with mean zero


Steps for LQ Approximation

1. Solve for the steady state

2. Identify the endogenous and exogenous states, and the choice variables

3. Redefine the utility as a function of the (endogenous and exogenous) state variables and the choice variables

4. Approximate the utility function around the steady state using a 2nd-order Taylor approximation

5. Use value function iteration to find the optimal value function


Step 1: Solve for the SS (k̄, c̄, h̄)

1. Euler equation:

1 = β (1 + α k̄^{α−1} h̄^{1−α} − δ)

2. Intratemporal FOC:

θ (1 − h̄) · (1 − α) k̄^α h̄^{−α} = (1 − θ) c̄

3. From the resource constraint:

c̄ + δ k̄ = k̄^α h̄^{1−α}

Substituting c̄ out using the last two equations (2 + 3):

θ (1 − h̄) · (1 − α) k̄^α h̄^{−α} = (1 − θ) (k̄^α h̄^{1−α} − δ k̄)

2 equations in 2 unknowns (k̄, h̄). Use a root-finding method to find the SS
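A sketch of the root-finding step using scipy.optimize.fsolve; the calibration (α, β, δ, θ) and the initial guess are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import fsolve

# Illustrative calibration (assumed, not from the slides)
alpha, beta, delta, theta = 0.36, 0.96, 0.08, 0.4

def ss_system(x):
    """Euler equation and intratemporal FOC (with c substituted out) in (k, h)."""
    k, h = x
    euler = 1.0 - beta * (1.0 + alpha * k**(alpha - 1) * h**(1 - alpha) - delta)
    intra = (theta * (1.0 - h) * (1.0 - alpha) * k**alpha * h**(-alpha)
             - (1.0 - theta) * (k**alpha * h**(1 - alpha) - delta * k))
    return [euler, intra]

k_bar, h_bar = fsolve(ss_system, x0=[2.0, 0.3])
c_bar = k_bar**alpha * h_bar**(1 - alpha) - delta * k_bar
print(k_bar, h_bar, c_bar)
```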


Steps 2/3: Identify variables and redefine u

• Exogenous state variables: z

• Endogenous state variables: k

• (2× 1) vector of control/decision variables: d = (k′, h)

• Eliminate c from the utility function, writing u(z, k, d), by substituting in the aggregate resource constraint

• Compute the number ū ≡ u(z̄, k̄, d̄), the period utility evaluated at the SS

• Define the (4 × 1) vector:

w = (z, k, d)^T = (z, k, k′, h)^T


Step 4: Approximate u with a quadratic function

• Using a second-order approximation of u around the SS:

u(z, k, d) ≃ ū + (w − w̄)^T J + (1/2) (w − w̄)^T H (w − w̄)

where J and H are the Jacobian and the Hessian evaluated at (z̄, k̄, d̄). Expanding:

u(z, k, d) ≃ ū − w̄^T J + (1/2) w̄^T H w̄ + w^T (J − H w̄) + (1/2) w^T H w

= [1  w^T] Q [1; w]

where

Q = [ ū − w̄^T J + (1/2) w̄^T H w̄     (1/2) (J − H w̄)^T ]
    [ (1/2) (J − H w̄)                 (1/2) H           ]

A quadratic form, where the matrix of coefficients Q is (5 × 5).
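A sketch of the construction of Q with a numerical Jacobian and Hessian. The functional forms are those of the model above; the calibration and the point w̄ below are placeholders (in practice use the SS from Step 1):

```python
import numpy as np

# Assumed calibration for illustration (gamma != 1)
alpha, delta, theta, gamma = 0.36, 0.08, 0.4, 2.0

def u_of_w(w):
    """Period utility as a function of w = (z, k, k', h), with c substituted out."""
    z, k, kp, h = w
    c = np.exp(z) * k**alpha * h**(1 - alpha) + (1 - delta) * k - kp
    return (c**theta * (1 - h)**(1 - theta))**(1 - gamma) / (1 - gamma)

def numerical_gradient(f, x, eps=1e-5):
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def numerical_hessian(f, x, eps=1e-5):
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros_like(x); e[i] = eps
        H[:, i] = (numerical_gradient(f, x + e) - numerical_gradient(f, x - e)) / (2 * eps)
    return 0.5 * (H + H.T)                        # enforce symmetry

w_bar = np.array([0.0, 2.0, 2.0, 0.36])           # (z, k, k', h): replace with the SS
u_bar = u_of_w(w_bar)
J = numerical_gradient(u_of_w, w_bar)
H = numerical_hessian(u_of_w, w_bar)

# Q = [[u_bar - w'J + 0.5 w'Hw ,  0.5*(J - Hw)'],
#      [0.5*(J - Hw)           ,  0.5*H        ]]
Q = np.zeros((5, 5))
Q[0, 0] = u_bar - w_bar @ J + 0.5 * w_bar @ H @ w_bar
Q[0, 1:] = 0.5 * (J - H @ w_bar)
Q[1:, 0] = 0.5 * (J - H @ w_bar)
Q[1:, 1:] = 0.5 * H
```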


Step 5: VFI

• We can write the VFI at iteration t as:

V_{t+1}(z, k) = max_d  [1  w^T] Q [1; w] + β ∑_{z′} π(z, z′) V_t(z′, k′)

• We know the value function has the same quadratic form as u. Define the vector of states:

s = (1, z, k)^T

• We postulate a quadratic form:

V_t(z, k) = s^T P_t s

where P is negative semi-definite and symmetric. We need to find P!


Step 5: VFI

• The approximated Bellman equation looks like

s^T P_{t+1} s = max_d  [1  w^T] Q [1; w] + β E[(s′)^T P_t s′]

where the law of motion for the state can be written as a function of the vector w as:

s′ = [1; z′; k′] = [ 1  0  0  0  0 ]  [1; z; k; k′; h]  +  [0; ε′; 0]
                   [ 0  ρ  0  0  0 ]
                   [ 0  0  0  1  0 ]

s′ = B [1; w] + [0; ε′; 0]


Step 5: VFI

• This is useful because we can rewrite:

s^T P_{t+1} s = max_d  [1  w^T] Q [1; w] + β [1  w^T] B^T P_t B [1; w] + β E[ [0  ε′  0] P_t [0; ε′; 0] ]

= max_d  [1  w^T] Q [1; w] + β [1  w^T] B^T P_t B [1; w] + β σ_ε² P_{t,22}

• You can see the certainty equivalence property here: the FOC wrt d (a component of w) will not depend on σ_ε


Step 5: VFI

• Collecting terms:

s^T P_{t+1} s = max_d  [1  w^T] (Q + M_t) [1; w]

where

M_t = β B^T P_t B + [ β σ_ε² P_{t,22}   0   ...   0 ]
                    [ 0                 ...        ⋮ ]
                    [ 0                 ...        0 ]


Step 5: VFI

• Let's rewrite the Bellman equation as follows:

s^T P_{t+1} s = max_d  [s^T  d^T] ( [ Q_ss   Q_sd^T ]  +  [ M_{t,ss}   M_{t,sd}^T ] ) [s; d]
                                    [ Q_sd   Q_dd   ]     [ M_{t,sd}   M_{t,dd}   ]

• Recall that s is a (3 × 1) vector (a constant plus the states) and d a (2 × 1) vector of decisions

• The Q and M_t matrices are (5 × 5). Q_sd and M_{t,sd} are (2 × 3), and Q_dd and M_{t,dd} are (2 × 2)

• Thus, multiplying through:

s^T P_{t+1} s = max_d  s^T (Q_ss + M_{t,ss}) s + 2 d^T (Q_sd + M_{t,sd}) s + d^T (Q_dd + M_{t,dd}) d


Step 5: VFI

• It is a concave program, so the FOC is sufficient

• The FOC wrt d yields:

0 = 2 (Q_sd + M_{t,sd}) s + 2 (Q_dd + M_{t,dd}) d

d = −(Q_dd + M_{t,dd})^{−1} (Q_sd + M_{t,sd}) s

d = Ω_t^T s

where Ω_t is (3 × 2), given by:

Ω_t = −(Q_sd^T + M_{t,sd}^T) (Q_dd + M_{t,dd})^{−1}


Step 5: VFI

• If we use the FOC to substitute out d in the Bellman equation:

s^T P_{t+1} s = s^T (Q_ss + M_{t,ss}) s + 2 s^T Ω_t (Q_sd + M_{t,sd}) s + s^T Ω_t (Q_dd + M_{t,dd}) Ω_t^T s

• Using the expression for Ω_t:

s^T P_{t+1} s = s^T [ Q_ss + M_{t,ss} − (Q_sd^T + M_{t,sd}^T) (Q_dd + M_{t,dd})^{−1} (Q_sd + M_{t,sd}) ] s

• This suggests a recursion for V:

1. Given P_t (needed to construct M_t), compute:

P_{t+1} = Q_ss + M_{t,ss} − (Q_sd^T + M_{t,sd}^T) (Q_dd + M_{t,dd})^{−1} (Q_sd + M_{t,sd})

2. Use P_{t+1} to construct M_{t+1} and go back to step 1 until convergence


Summary of LQ-VFI procedure

1. Guess P_0. Since the value function is concave, we guess a negative semidefinite matrix, for example P_0 = −I.

2. Given P_t, update the value function using the recursion above and obtain P_{t+1}. The Q matrix is defined by approximating the return function, and the M_t matrix by the formula obtained above.

3. Compare P_t and P_{t+1}. If the distance (measured in the sup norm) is smaller than the predetermined tolerance level, stop. Otherwise go back to the updating step (step 2) with P_{t+1}.

4. With the optimal P*, we can compute the decision rules. Check that the solution is correct by evaluating the decision rules at the SS.

5. Use the decision rules to simulate the economy.
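A sketch of steps 1–3 (the recursion on P, plus the implied decision-rule matrix Ω), given a Q matrix built as in Step 4; the ordering [1; w] = (1, z, k, k′, h), with s = (1, z, k) and d = (k′, h), is assumed throughout:

```python
import numpy as np

def solve_lq(Q, rho, beta, sigma_eps, tol=1e-10, maxit=10_000):
    """Iterate P_{t+1} = Qss+Mss - (Qsd+Msd)'(Qdd+Mdd)^{-1}(Qsd+Msd) to a fixed point.

    Q : (5x5) quadratic-form matrix over (1, z, k, k', h), with z' = rho*z + eps'.
    Returns the fixed point P and Omega such that d = Omega' s.
    """
    # Law of motion: s' = B [1; w] + (0, eps', 0)
    B = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
                  [0.0, rho, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0, 0.0]])
    P = -np.eye(3)                                   # negative semidefinite guess P_0
    for _ in range(maxit):
        M = beta * (B.T @ P @ B)                     # (5x5)
        M[0, 0] += beta * sigma_eps**2 * P[1, 1]     # constant term from E[eps'^2]
        A = Q + M
        Ass, Asd, Add = A[:3, :3], A[3:, :3], A[3:, 3:]
        P_new = Ass - Asd.T @ np.linalg.solve(Add, Asd)
        if np.max(np.abs(P_new - P)) < tol:          # sup-norm convergence check
            P = P_new
            break
        P = P_new
    Omega = -np.linalg.solve(Add, Asd).T             # decision rule d = Omega' s
    return P, Omega
```

As a check on step 4 of the summary, evaluating the decision rule at the SS, Omega.T @ np.array([1.0, 0.0, k_bar]) should return values close to (k̄, h̄).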
