Stochastic Optimal Control with FinanceApplications
Tomas Bjork,Department of Finance,
Stockholm School of Economics,
KTH, February, 2010
Tomas Bjork, 2010 1
Contents
• Dynamic programming.
• Investment theory.
• The martingale approach.
• Filtering theory.
• Optimal investment with partial information.
Tomas Bjork, 2010 2
1. Dynamic Programming
• The basic idea.
• Deriving the HJB equation.
• The verification theorem.
• The linear quadratic regulator.
Tomas Bjork, 2010 3
Problem Formulation
maxu
E
[∫ T
0
F (t, Xt, ut)dt + Φ(XT )
]subject to
dXt = µ (t, Xt, ut) dt + σ (t, Xt, ut) dWt
X0 = x0,
ut ∈ U(t, Xt), ∀t.
We will only consider feedback control laws, i.e.controls of the form
ut = u(t, Xt)
Terminology:
X = state variable
u = control variable
U = control constraint
Note: No state space constraints.
Tomas Bjork, 2010 4
Main idea
• Embedd the problem above in a family of problemsindexed by starting point in time and space.
• Tie all these problems together by a PDE–theHamilton Jacobi Bellman equation.
• The control problem is reduced to the problem ofsolving the deterministic HJB equation.
Tomas Bjork, 2010 5
Some notation
• For any fixed vector u ∈ Rk, the functions µu, σu
and Cu are defined by
µu(t, x) = µ(t, x, u),
σu(t, x) = σ(t, x, u),
Cu(t, x) = σ(t, x, u)σ(t, x, u)′.
• For any control law u, the functions µu, σu, Cu(t, x)and F u(t, x) are defined by
µu(t, x) = µ(t, x,u(t, x)),
σu(t, x) = σ(t, x,u(t, x)),
Cu(t, x) = σ(t, x,u(t, x))σ(t, x,u(t, x))′,
F u(t, x) = F (t, x,u(t, x)).
Tomas Bjork, 2010 6
More notation
• For any fixed vector u ∈ Rk, the partial differentialoperator Au is defined by
Au =n∑
i=1
µui (t, x)
∂
∂xi+
12
n∑i,j=1
Cuij(t, x)
∂2
∂xi∂xj.
• For any control law u, the partial differentialoperator Au is defined by
Au =n∑
i=1
µui (t, x)
∂
∂xi+
12
n∑i,j=1
Cuij(t, x)
∂2
∂xi∂xj.
• For any control law u, the process Xu is the solutionof the SDE
dXut = µ (t, Xu
t ,ut) dt + σ (t, Xut ,ut) dWt,
whereut = u(t, Xu
t )
Tomas Bjork, 2010 7
Embedding the problem
For every fixed (t, x) the control problem P(t, x) isdefined as the problem to maximize
Et,x
[∫ T
t
F (s,Xus , us)ds + Φ (Xu
T )
],
given the dynamics
dXus = µ (s,Xu
s ,us) ds + σ (s,Xus ,us) dWs,
Xt = x,
and the constraints
u(s, y) ∈ U, ∀(s, y) ∈ [t, T ]×Rn.
The original problem was P(0, x0).
Tomas Bjork, 2010 8
The optimal value function
• The value function
J : R+ ×Rn × U → R
is defined by
J (t, x,u) = E
[∫ T
t
F (s,Xus ,us)ds + Φ (Xu
T )
]
given the dynamics above.
• The optimal value function
V : R+ ×Rn → R
is defined by
V (t, x) = supu∈U
J (t, x,u).
• We want to derive a PDE for V .
Tomas Bjork, 2010 9
Assumptions
We assume:
• There exists an optimal control law u.
• The optimal value function V is regular in the sensethat V ∈ C1,2.
• A number of limiting procedures in the followingarguments can be justified.
Tomas Bjork, 2010 10
The Bellman Optimality Principle
Dynamic programming relies heavily on the followingbasic result.
Proposition: If u is optimal on the time interval [t, T ]then it is also optimal on every subinterval [s, T ] witht ≤ s ≤ T .
Proof: Iterated expectations.
Tomas Bjork, 2010 11
Basic strategy
To derive the PDE do as follows:
• Fix (t, x) ∈ (0, T )×Rn.
• Choose a real number h (interpreted as a “small”time increment).
• Choose an arbitrary control law u.
Now define the control law u? by
u?(s, y) =
u(s, y), (s, y) ∈ [t, t + h]×Rn
u(s, y), (s, y) ∈ (t + h, T ]×Rn.
In other words, if we use u? then we use the arbitrarycontrol u during the time interval [t, t + h], and thenwe switch to the optimal control law during the rest ofthe time period.
Tomas Bjork, 2010 12
Basic idea
The whole idea of DynP boils down to the followingprocedure.
• Given the point (t, x) above, we consider thefollowing two strategies over the time interval [t, T ]:
I: Use the optimal law u.II: Use the control law u? defined above.
• Compute the expected utilities obtained by therespective strategies.
• Using the obvious fact that Strategy I is least asgood as Strategy II, and letting h tend to zero, weobtain our fundamental PDE.
Tomas Bjork, 2010 13
Strategy values
Expected utility for strategy I:
J (t, x, u) = V (t, x)
Expected utility for strategy II:
• The expected utility for [t, t + h) is given by
Et,x
[∫ t+h
t
F (s,Xus ,us) ds
].
• Conditional expected utility over [t + h, T ], given(t, x):
Et,x
[V (t + h, Xu
t+h)].
• Total expected utility for Strategy II is
Et,x
[∫ t+h
t
F (s,Xus ,us) ds + V (t + h, Xu
t+h)
].
Tomas Bjork, 2010 14
Comparing strategies
We have trivially
V (t, x) ≥ Et,x
[∫ t+h
t
F (s,Xus ,us) ds + V (t + h, Xu
t+h)
].
RemarkWe have equality above if and only if the control lawu is an optimal law u.
Now use Ito to obtain
V (t + h, Xut+h) = V (t, x)
+∫ t+h
t
∂V
∂t(s,Xu
s ) +AuV (s,Xus )
ds
+∫ t+h
t
∇xV (s,Xus )σudWs,
and plug into the formula above.
Tomas Bjork, 2010 15
We obtain
Et,x
[∫ t+h
t
[F (s,Xu
s ,us) +∂V
∂t(s,Xu
s ) +AuV (s,Xus )]
ds
]≤ 0.
Going to the limit:Divide by h, move h within the expectation and let h tend to zero.We get
F (t, x, u) +∂V
∂t(t, x) +AuV (t, x) ≤ 0,
Tomas Bjork, 2010 16
Recall
F (t, x, u) +∂V
∂t(t, x) +AuV (t, x) ≤ 0,
This holds for all u = u(t, x), with equality if and onlyif u = u.
We thus obtain the HJB equation
∂V
∂t(t, x) + sup
u∈UF (t, x, u) +AuV (t, x) = 0.
Tomas Bjork, 2010 17
The HJB equation
Theorem:Under suitable regularity assumptions the follwing hold:
I: V satisfies the Hamilton–Jacobi–Bellman equation
∂V
∂t(t, x) + sup
u∈UF (t, x, u) +AuV (t, x) = 0,
V (T, x) = Φ(x),
II: For each (t, x) ∈ [0, T ] × Rn the supremum in theHJB equation above is attained by u = u(t, x).
Tomas Bjork, 2010 18
Logic and problem
Note: We have shown that if V is the optimal valuefunction, and if V is regular enough, then V satisfiesthe HJB equation. The HJB eqn is thus derived asa necessary condition, and requires strong ad hocregularity assumptions.
Problem: Suppose we have solved the HJB equation.Have we then found the optimal value function andthe optimal control law?
Answer: Yes! This follows from the Verificationtehorem.
Tomas Bjork, 2010 19
The Verification Theorem
Suppose that we have two functions H(t, x) and g(t, x), such
that
• H is sufficiently integrable, and solves the HJB equation8><>:∂H
∂t(t, x) + sup
u∈UF (t, x, u) +Au
H(t, x) = 0,
H(T, x) = Φ(x),
• For each fixed (t, x), the supremum in the expression
supu∈U
F (t, x, u) +AuH(t, x)
is attained by the choice u = g(t, x).
Then the following hold.
1. The optimal value function V to the control problem is given
by
V (t, x) = H(t, x).
2. There exists an optimal control law u, and in fact
u(t, x) = g(t, x)
.
Tomas Bjork, 2010 20
Handling the HJB equation
1. Consider the HJB equation for V .
2. Fix (t, x) ∈ [0, T ] × Rn and solve, the static optimization
problem
maxu∈U
[F (t, x, u) +AuV (t, x)] .
Here u is the only variable, whereas t and x are fixed
parameters. The functions F , µ, σ and V are considered as
given.
3. The optimal u, will depend on t and x, and on the function
V and its partial derivatives. We thus write u as
u = u (t, x; V ) . (1)
4. The function u (t, x; V ) is our candidate for the optimal
control law, but since we do not know V this description is
incomplete. Therefore we substitute the expression for u into
the PDE , giving us the PDE
∂V
∂t(t, x) + F
u(t, x) +Au
(t, x) V (t, x) = 0,
V (T, x) = Φ(x).
5. Now we solve the PDE above! Then we put the solution V
into expression (1). Using the verification theorem we can
identify V as the optimal value function, and u as the optimal
control law.
Tomas Bjork, 2010 21
Making an Ansatz
• The hard work of dynamic programming consists insolving the highly nonlinear HJB equation
• There are no general analytic methods availablefor this, so the number of known optimal controlproblems with an analytic solution is very smallindeed.
• In an actual case one usually tries to guess asolution, i.e. we typically make a parameterizedAnsatz for V then use the PDE in order to identifythe parameters.
• Hint: V often inherits some structural propertiesfrom the boundary function Φ as well as from theinstantaneous utility function F .
• Most of the known solved control problems have,to some extent, been “rigged” in order to beanalytically solvable.
Tomas Bjork, 2010 22
The Linear Quadratic Regulator
minu∈Rk
E
[∫ T
0
X ′tQXt + u′tRut dt + X ′
THXT
],
with dynamics
dXt = AXt + But dt + CdWt.
We want to control a vehicle in such a way that it staysclose to the origin (the terms x′Qx and x′Hx) whileat the same time keeping the “energy” u′Ru small.
Here Xt ∈ Rn and ut ∈ Rk, and we impose no controlconstraints on u.
The matrices Q, R, H, A, B and C are assumed to beknown. We may WLOG assume that Q, R and H aresymmetric, and we assume that R is positive definite(and thus invertible).
Tomas Bjork, 2010 23
Handling the Problem
The HJB equation becomes∂V
∂t(t, x) + infu∈Rk x′Qx + u′Ru + [∇xV ](t, x) [Ax + Bu]
+ 12
∑i,j
∂2V∂xi∂xj
(t, x) [CC ′]i,j = 0,
V (T, x) = x′Hx.
For each fixed choice of (t, x) we now have to solve the static unconstrainedoptimization problem to minimize
u′Ru + [∇xV ](t, x) [Ax + Bu] .
Tomas Bjork, 2010 24
The problem was:
minu
u′Ru + [∇xV ](t, x) [Ax + Bu] .
Since R > 0 we set the gradient to zero and obtain
2u′R = −(∇xV )B,
which gives us the optimal u as
u = −12R−1B′(∇xV )′.
Note: This is our candidate of optimal control law,but it depends on the unkown function V .
We now make an educated guess about the shape ofV .
Tomas Bjork, 2010 25
From the boundary function x′Hx and the term x′Qxin the cost function we make the Ansatz
V (t, x) = x′P (t)x + q(t),
where P (t) is a symmetric matrix function, and q(t) isa scalar function.
With this trial solution we have,
∂V
∂t(t, x) = x′P x + q,
∇xV (t, x) = 2x′P,
∇xxV (t, x) = 2P
u = −R−1B′Px.
Inserting these expressions into the HJB equation weget
x′
P + Q− PBR−1B′P + A′P + PA
x
+q + tr[C ′PC] = 0.
Tomas Bjork, 2010 26
We thus get the following matrix ODE for PP = PBR−1B′P −A′P − PA−Q,
P (T ) = H.
and we can integrate directly for q:q = −tr[C ′PC],
q(T ) = 0.
The matrix equation is a Riccati equation. Theequation for q can then be integrated directly.
Final Result for LQ:
V (t, x) = x′P (t)x +∫ T
t
tr[C ′P (s)C]ds,
u(t, x) = −R−1B′P (t)x.
Tomas Bjork, 2010 27
2. Portfolio Theory
• Problem formulation.
• An extension of HJB.
• The simplest consutmption-investment problem.
• The Merton fund separation results.
Tomas Bjork, 2010 28
Recap of Basic Facts
We consider a market with n assets.
Sit = price of asset No i,
hit = units of asset No i in portfolio
wit = portfolio weight on asset No i
Xt = portfolio value
ct = consumption rate
We have the relations
Xt =n∑
i=1
hitS
it, wi
t =hi
tSit
Xt,
n∑i=1
uit = 1.
Basic equation:Dynamics of self financing portfolio in terms of relativeweights
dXt = Xt
n∑i=1
wit
dSit
Sit
− ctdt
Tomas Bjork, 2010 29
Simplest modelAssume a scalar risky asset and a constant short rate.
dSt = αStdt + σStdWt
dBt = rBtdt
We want to maximize expected utility over time
maxw0,w1,c
E
[∫ T
0
F (t, ct)dt + Φ(XT )
]Dynamics
dXt = Xt
[u0
tr + w1t α]dt− ctdt + w1
t σXtdWt,
Constraints
ct ≥ 0, ∀t ≥ 0,
w0t + w1
t = 1, ∀t ≥ 0.
Nonsense!
Tomas Bjork, 2010 30
What are the problems?
• We can obtain umlimited uttility by simplyconsuming arbitrary large amounts.
• The wealth will go negative, but there is nothing inthe problem formulations which prohibits this.
• We would like to impose a constratin of type Xt ≥ 0but this is a state constraint and DynP does notallow this.
Good News:DynP can be generalized to handle (some) problemsof this kind.
Tomas Bjork, 2010 31
Generalized problem
Let D be a nice open subset of [0, T ]×Rn and considerthe following problem.
maxu∈U
E
[∫ τ
0
F (s,Xus ,us)ds + Φ (τ,Xu
τ )]
.
Dynamics:
dXt = µ (t, Xt, ut) dt + σ (t, Xt, ut) dWt,
X0 = x0,
The stopping time τ is defined by
τ = inf t ≥ 0 |(t, Xt) ∈ ∂D ∧ T.
Tomas Bjork, 2010 32
Generalized HJB
Theorem: Given enough regularity the follwing hold.
1. The optimal value function satisfies∂V
∂t(t, x) + sup
u∈UF (t, x, u) +AuV (t, x) = 0, ∀(t, x) ∈ D
V (t, x) = Φ(t, x), ∀(t, x) ∈ ∂D.
2. We have an obvious verification theorem.
Tomas Bjork, 2010 33
Reformulated problem
maxc≥0, w∈R
E
[∫ τ
0
F (t, ct)dt + Φ(XT )]
whereτ = inf t ≥ 0 |Xt = 0 ∧ T.
with notation:
w1 = w,
w0 = 1− w
Thus no constraint on w.
Dynamics
dXt = wt [α− r]Xtdt + (rXt − ct) dt + wσXtdWt,
Tomas Bjork, 2010 34
HJB Equation
∂V
∂t+ sup
c≥0,w∈R
(F (t, c) + wx(α− r)
∂V
∂x+ (rx− c)
∂V
∂x+
1
2x
2w
2σ
2∂2V
∂x2
)= 0,
V (T, x) = 0,
V (t, 0) = 0.
We now specialize (why?) to
F (t, c) = e−δt
cγ,
so we have to maximize
e−δt
cγ+ wx(α− r)
∂V
∂x+ (rx− c)
∂V
∂x+
1
2x
2w
2σ
2∂2V
∂x2,
Tomas Bjork, 2010 35
Analysis of the HJB Equation
In the embedde static problem we maximize, over cand w,
e−δtcγ +wx(α− r)∂V
∂x+(rx− c)
∂V
∂x+
12x2w2σ2∂
2V
∂x2,
First order conditions:
γcγ−1 = eδtVx,
w =−Vx
x · Vxx· α− r
σ2,
Ansatz:V (t, x) = e−δth(t)xγ,
Because of the boundary conditions, we must demandthat
h(T ) = 0. (2)
Tomas Bjork, 2010 36
Given a V of this form we have (using · to denote thetime derivative)
∂V
∂t= e−δthxγ − δe−δthxγ,
∂V
∂x= γe−δthxγ−1,
∂2V
∂x2= γ(γ − 1)e−δthxγ−2.
giving us
w(t, x) =α− r
σ2(1− γ),
c(t, x) = xh(t)−1/(1−γ).
Plug all this into HJB!
Tomas Bjork, 2010 37
After rearrangements we obtain
xγ
h(t) + Ah(t) + Bh(t)−γ/(1−γ)
= 0,
where the constants A and B are given by
A =γ(α− r)2
σ2(1− γ)+ rγ − 1
2γ(α− r)2
σ2(1− γ)− δ
B = 1− γ.
If this equation is to hold for all x and all t, then wesee that h must solve the ODE
h(t) + Ah(t) + Bh(t)−γ/(1−γ) = 0,
h(T ) = 0.
An equation of this kind is known as a Bernoulliequation, and it can be solved explicitly.
We are done.
Tomas Bjork, 2010 38
Merton’s Mutal Fund Theorems
1. The case with no risk free asset
We consider n risky assets with dynamics
dSi = Siαidt + SiσidW, i = 1, . . . , n
where W is Wiener in Rk. On vector form:
dS = D(S)αdt + D(S)σdW.
where
α =
α1...
αn
σ =
σ1...
σn
D(S) is the diagonal matrix
D(S) = diag[S1, . . . , Sn].
Wealth dynamics
dX = Xw′αdt− cdt + Xw′σdW.
Tomas Bjork, 2010 39
Formal problem
maxc,w
E
[∫ τ
0
F (t, ct)dt
]given the dynamics
dX = Xw′αdt− cdt + Xw′σdW.
and constraints
e′w = 1, c ≥ 0.
Assumptions:
• The vector α and the matrix σ are constant anddeterministic.
• The volatility matrix σ has full rank so σσ′ is positivedefinite and invertible.
Note: S does not turn up in the X-dynamics so V isof the form
V (t, x, s) = V (t, x)
Tomas Bjork, 2010 40
The HJB equation is∂V
∂t(t, x) + sup
e′w=1, c≥0
F (t, c) +Ac,wV (t, x) = 0,
V (T, x) = 0,
V (t, 0) = 0.
where
Ac,wV = xw′α∂V
∂x− c
∂V
∂x+
12x2w′Σw
∂2V
∂x2,
and where the matrix Σ is given by
Σ = σσ′.
Tomas Bjork, 2010 41
The HJB equation is8>>>><>>>>:Vt(t, x) + sup
w′e=1, c≥0
F (t, c) + (xw
′α− c)Vx(t, x) +
1
2x
2w′ΣwVxx(t, x)
ff= 0,
V (T, x) = 0,
V (t, 0) = 0.
where Σ = σσ′.
If we relax the constraint w′e = 1, the Lagrange function for the staticoptimization problem is given by
L = F (t, c) + (xw′α− c)Vx(t, x) +12x2w′ΣwVxx(t, x) + λ (1− w′e) .
Tomas Bjork, 2010 42
L = F (t, c) + (xw′α− c)Vx(t, x)
+12x2w′ΣwVxx(t, x) + λ (1− w′e) .
The first order condition for c is
∂F
∂c(t, c) = Vx(t, x).
The first order condition for w is
xα′Vx + x2Vxxw′Σ = λe′,
so we can solve for w in order to obtain
w = Σ−1
[λ
x2Vxxe− xVx
x2Vxxα
].
Using the relation e′w = 1 this gives λ as
λ =x2Vxx + xVxe′Σ−1α
e′Σ−1e,
Tomas Bjork, 2010 43
Inserting λ gives us, after some manipulation,
w =1
e′Σ−1eΣ−1e +
Vx
xVxxΣ−1
[e′Σ−1α
e′Σ−1ee− α
].
We can write this as
w(t) = g + Y (t)h,
where the fixed vectors g and h are given by
g =1
e′Σ−1eΣ−1e,
h = Σ−1
[e′Σ−1α
e′Σ−1ee− α
],
whereas Y is given by
Y (t) =Vx(t, X(t))
X(t)Vxx(t, X(t)).
Tomas Bjork, 2010 44
We hadw(t) = g + Y (t)h,
Thus we see that the optimal portfolio is movingstochastically along the one-dimensional “optimalportfolio line”
g + sh,
in the (n − 1)-dimensional “portfolio hyperplane” ∆,where
∆ = w ∈ Rn |e′w = 1 .
If we fix two points on the optimal portfolio line, saywa = g + ah and wb = g + bh, then any point w onthe line can be written as an affine combination of thebasis points wa and wb. An easy calculation showsthat if ws = g + sh then we can write
ws = µwa + (1− µ)wb,
where
µ =s− b
a− b.
Tomas Bjork, 2010 45
Mutual Fund Theorem
There exists a family of mutual funds, given byws = g + sh, such that
1. For each fixed s the portfolio ws stays fixed overtime.
2. For fixed a, b with a 6= b the optimal portfolio w(t)is, obtained by allocating all resources between thefixed funds wa and wb, i.e.
w(t) = µa(t)wa + µb(t)wb,
3. The relative proportions (µa, µb) of wealth allocatedto wa and wbare given by
µa(t) =Y (t)− b
a− b,
µb(t) =a− Y (t)
a− b.
Tomas Bjork, 2010 46
The case with a risk free asset
Again we consider the standard model
dS = D(S)αdt + D(S)σdW (t),
We also assume the risk free asset B with dynamics
dB = rBdt.
We denote B = S0 and consider portfolio weights(w0, w1, . . . , wn)′ where
∑n0 wi = 1. We then
eliminate w0 by the relation
w0 = 1−n∑1
wi,
and use the letter w to denote the portfolio weightvector for the risky assets only. Thus we use thenotation
w = (w1, . . . , wn)′,
Note: w ∈ Rn without constraints.
Tomas Bjork, 2010 47
HJB
We obtain
dX = X · w′(α− re)dt + (rX − c)dt + X · w′σdW,
where e = (1, 1, . . . , 1)′.
The HJB equation now becomesVt(t, x) + sup
c≥0,w∈RnF (t, c) +Ac,wV (t, x) = 0,
V (T, x) = 0,
V (t, 0) = 0,
where
AcV = xw′(α− re)Vx(t, x) + (rx− c)Vx(t, x)
+12x2w′ΣwVxx(t, x).
Tomas Bjork, 2010 48
First order conditions
We maximize
F (t, c) + xw′(α− re)Vx + (rx− c)Vx +12x2w′ΣwVxx
with c ≥ 0 and w ∈ Rn.
The first order conditions are
∂F
∂c(t, c) = Vx(t, x),
w = − Vx
xVxxΣ−1(α− re),
with geometrically obvious economic interpretation.
Tomas Bjork, 2010 49
Mutual Fund Separation Theorem
1. The optimal portfolio consists of an allocationbetween two fixed mutual funds w0 and wf .
2. The fund w0 consists only of the risk free asset.
3. The fund wf consists only of the risky assets, andis given by
wf = Σ−1(α− re).
4. At each t the optimal relative allocation of wealthbetween the funds is given by
µf(t) = − Vx(t, X(t))X(t)Vxx(t, X(t))
,
µ0(t) = 1− µf(t).
Tomas Bjork, 2010 50
3. The Martingale Approach
• Decoupling the wealth profile from the portfoliochoice.
• Lagrange relaxation.
• Solving the general wealth problem.
• Example: Log utility.
• Example: The numeraire portfolio.
• Computing the optimal portfolio.
• The Merton fund separation theorems from amartingale perspective..
Tomas Bjork, 2010 51
Problem Formulation
Standard model with internal filtration
dSt = D(St)αtdt + D(St)σtdWt,
dBt = rBtdt.
Assumptions:
• Drift and diffusion terms are allowed to be arbitraryadapted processes.
• The market is complete.
• We have a given initial wealth x0
Problem:maxh∈H
EP [Φ(XT )]
whereH = self financing portfolios
given the initial wealth X0 = x0.
Tomas Bjork, 2010 52
Some observations
• In a complete market, there is a unique martingalemeasure Q.
• Every claim Z satisfying the budget constraint
e−rTEQ [Z] = x0,
is attainable by an h ∈ H and vice versa.
• We can thus write our problem as
maxZ
EP [Φ(Z)]
subject to the constraint
e−rTEQ [Z] = x0.
• We can forget the wealth dynamics!
Tomas Bjork, 2010 53
Basic Ideas
Our problem was
maxZ
EP [Φ(Z)]
subject toe−rTEQ [Z] = x0.
Idea I:
We can decouple the optimal portfolio problem:
• Finding the optimal wealth profile Z.
• Given Z, find the replicating portfolio.
Idea II:
• Rewrite the constraint under the measure P .
• Use Lagrangian techniques to relax the constraint.
Tomas Bjork, 2010 54
Lagrange formulation
Problem:max
ZEP [Φ(Z)]
subject toe−rTEP [LTZ] = x0.
Here L is the likelihood process, i.e.
LT =dQ
dP, on FT
The Lagrangian of this is
L = EP [Φ(Z)] + λx0 − e−rTEP [LTZ]
i.e.
L = EP[Φ(Z)− λe−rTLTZ
]+ λx0
Tomas Bjork, 2010 55
The optimal wealth profile
Given enough convexity and regularity we now expect,given the dual variable λ, to find the optimal Z bymaximizing
L = EP[Φ(Z)− λe−rTLTZ
]+ λx0
over unconstrained Z, i.e. to maximize∫Ω
Φ(Z(ω))− λe−rTLT (ω)Z(ω)
dP (ω)
This is a trivial problem!
We can simply maximize Z(ω) for each ω separately.
maxz
Φ(z)− λe−rTLTz
Tomas Bjork, 2010 56
The optimal wealth profile
Our problem:
maxz
Φ(z)− λe−rTLTz
First order condition
Φ′(z) = λe−rTLT
The optimal Z is thus given by
Z = G(λe−rTLT
)where
G(y) = [Φ′]−1 (y).
The dual varaiable λ is determined by the constraint
e−rTEP[LT Z
]= x0.
Tomas Bjork, 2010 57
Example – log utility
Assume thatΦ(x) = ln(x)
Then
g(y) =1y
Thus
Z = G(λe−rTLT
)=
1λerTL−1
T
Finally λ is determined by
e−rTEP[LT Z
]= x0.
i.e.
e−rTEP
[LT
1λerTL−1
T
]= x0.
so λ = x−10 and
Z = x0erTL−1
T
Tomas Bjork, 2010 58
The Numeraire Portfolio
Standard approach:
• Choose a fixed numeraire (portfolio) N .
• Find the corresponding martingale measure, i.e. find QN s.t.
B
N, and
S
N
are QN -martingales.
Alternative approach:
• Choose a fixed measure Q.
• Find numeraire N such that Q = QN .
Special case:
• Set Q = P
• Find numeraire N such that QN = P i.e. such that
B
N, and
S
N
are QN -martingales under the objective measure P .
• This N is the numeraire portfolio.
Tomas Bjork, 2010 59
Log utility and the numeraire portfolio
Definition:The growth optimal portfolio (GOP) is the portfoliowhich is optimal for log utility (for arbitrary terminaldate T .
Theorem:Assume that X is GOP. Then X is the numeraireportfolio.
Proof:We have to show that the process
Yt =St
Xt
is a P martingale. From above we know that
XT = x0erTL−1
T .
We also have (why?)
Xt = e−r(T−t)EQ [XT | Ft] = e−r(T−t)EP
[XTLT
Lt
∣∣∣∣Ft
]Tomas Bjork, 2010 60
Thus
Xt = e−r(T−t)EP
[XTLT
Lt
∣∣∣∣Ft
]= e−r(T−t)EQ
[x0e
rTLT
LTLt
∣∣∣∣Ft
]= x0e
rtL−1t .
as expected.
Thus
St
Xt= x−1
0 e−rtStLt
which is a P martingale, since x−10 e−rtSt is a Q
martingale.
Tomas Bjork, 2010 61
Back to general case: Computing LT
We recallZ = G
(λe−rTLT
).
The likelihood process L is computed by usingGirsanov. We recall
dSt = D(St)αtdt + D(St)σtdWt,
We know from Girsanov that
dLt = Ltϕ?tdWt
sodWt = ϕtdt + dWQ
t
where WQ is Q-Wiener.
Thus
dSt = D(St) αt + σtϕt dt + D(St)σtdWQt ,
Tomas Bjork, 2010 62
Computing LT , continued
Recall
dSt = D(St) αt + σtϕt dt + D(St)σtdWQt ,
The kernel ϕ is determined by the martingale measurecondition
αt + σtϕt = rwhere
r =
r...r
Market completeness implies that σt is invertible so
ϕt = σ−1t r− αt
and
LT = exp
(∫ T
0
ϕtdWt −12
∫ T
0
‖ϕt‖2dt
)
Tomas Bjork, 2010 63
Finding the optimal portfolio
• We can easily compute the optimal wealth profile.
• How do we compute the optimal portfolio?
Recall:
dSt = D(St)αtdt + D(St)σtdWt,
wealth dynamics
dXt = hBt dBt + hS
t dSt
ordXt = Xtu
Bt rdt + Xtu
St D(St)−1dSt
where
hS = (h1, . . . , hn), uS = (u1, . . . , un)
Assume for simplicity that r = 0 or consider normalizedprices.
Tomas Bjork, 2010 64
Recall wealth dynamics
dXt = hSt dSt
alternatively
dXt = hSt D(St)σtdWQ
t
alternativelydXt = Xtu
St σtdWQ
t
Obvious facts:
• X is a Q martingale.
• XT = Z
Thus the optimal wealth process is determined by
Xt = EQ[Z∣∣∣Ft
]
Tomas Bjork, 2010 65
RecallXt = EQ
[Z∣∣∣Ft
]Martingale representation theorem gives us
dXt = ξtdWQt ,
but also
dXt = XtuSt σtdWQ
t
dXt = hSt D(St)σtdWQ
t
Thus uS and hSt are determined by
uSt =
1Xt
ξtσ−1t
hSt = ξtσ
−1t D(St)−1.
anduB
t = 1− uSt e, hB
t = Xt − hSt St
Tomas Bjork, 2010 66
How do we find ξ?
Recall
Xt = EQ[Z∣∣∣Ft
]dXt = ξtdWQ
t ,
We need to compute ξ.
In a Markovian framework this follows direcetly fromthe Ito formula.
RecallZ = H(LT ) = G (λLT )
whereG = [Φ′]−1
and
dLt = Ltϕ?tdWt,
dW = ϕdt + dWQt
sodLt = Lt‖ϕt‖2dt + Ltϕ
?tdWQ
t
Tomas Bjork, 2010 67
Finding ξ
If the model is Markovian we have
αt = α(St), σt = σ(St), ϕt = σ(St)−1 α(St)− r
so
Xt = EQ [H(LT )| Ft]
dSt = D(St)σ(St)dWQt ,
dLt = Lt‖ϕ(St)‖2dt + Ltϕ(St)?dWQt
Thus we have
Xt = F (t, St, Lt)
where, by the Kolmogorov backward equation
Ft + L‖ϕ‖2FL +12L2‖ϕ‖2FLL +
12tr σFssσ
? = 0,
F (T, s, L) = H(L)
Tomas Bjork, 2010 68
Finding ξ, contd.
We had
Xt = F (t, St, Lt)
and Ito gives us
dXt = FSD(St)σ(St) + FLLtϕ?(St) dWt
Thus
ξt = FSD(St)σ(St) + FLLtϕ?(St).
and
uSt =
1Xt
ξtσ(St)−1
hSt = ξtσ(St)−1D(St)−1.
Tomas Bjork, 2010 69
Mutual Funds – Martingale Version
We now assume constant parameters
α(s) = α, σ(s) = σ, ϕ(s) = ϕ
We recall
Xt = EQ [H(LT )| Ft]
dLt = Lt‖ϕ‖2dt + Ltϕ?dWQ
t
Now L is Markov so we have (without any S)
Xt = F (t, Lt)
Thus
ξt = FLLtϕ?, uS
t =FLLt
Xtϕ?σ−1
and we have fund separation with the fixed risky fundgiven by
w = ϕ?σ−1 = r? − α? σσ?−1.
Tomas Bjork, 2010 70
3. Filtering theory
• Motivational problem.
• The Innovations process.
• The non-linear FKK filtering equations.
• The Wonham filter.
• The Kalman filter.
Tomas Bjork, 2010 71
An investment problem with stochasticrate of return
Model:
dSt = Stα(Yt)dt + StσdWt
W is scalar and Y is some factor process. We assumethat (S, Y ) is Markov and adapted to the filtration F.
Wealth dynamics
dXt = Xt [r + ut (α− r)] dt + utXtσdWt
Objective:max
uEP [Φ(XT )]
Information structure:
• Complete information: We observe S and Y , sou ∈ F
• Incomplete information: We only observe S, sou ∈ FS. We need filtering theory.
Tomas Bjork, 2010 72
Filtering Theory – Setup
Given some filtration F :
dYt = atdt + dMt
dZT = btdt + dWt
Here all processes are F adapted and
Y = signal process,
Z = observation process,
M = martingale w.r.t. F
W = Wiener w.r.t. F
We assume (for the moment) that M and W areindependent.
Problem:Compute (recursively) the filter estimate
Yt = Πt [Y ] = E[Yt| FZ
t
]Tomas Bjork, 2010 73
The innovations process
Recall:dZT = btdt + dWt
Our best guess of bt is bt, so the genuinely newinformation should be
dZt − btdt
The innovations process ν is defined by
νt = dZt − btdt
Theorem: The process ν is FZ-Wiener.
Proof: By Levy it is enough to show that
• ν is an FZ martingale.
• ν2t − t is an FZ martingale.
Tomas Bjork, 2010 74
I. ν is an FZ martingale:
From definition we have
dνt =(bt − bt
)dt + dWt (3)
so
EZs [νt − νs] =
∫ t
s
EZs
[bu − bu
]du + EZ
s [Wt −Ws]
=∫ t
s
EZs
[EZ
u
[bu − bu
]]du + EZ
s [Es [Wt −Ws]] = 0
I. ν2t − t is an FZ martingale:
From Ito we have
dν2t = 2νtdνt + (dνt)
2
Here dν is a martingale increment and from (3) itfollows that (dνt)
2 = dt.
Tomas Bjork, 2010 75
Remark 1:The innovations process gives us a Gram-Schmidtorthogonalization of the increasing family of Hilbertspaces
L2(FZt ); t ≥ 0.
Remark 2:The use of Ito above requires general semimartingaleintegration theory, since we do not know a priori thatν is Wiener.
Tomas Bjork, 2010 76
Filter dynamics
From the Y dynamics we guess that
dYt = atdt + martingale
Definition: dmt = dYt − atdt.
Proposition: m is an FZt martingale.
Proof:
EZs [mt −ms] = EZ
s
[Yt − Ys
]− EZ
s
[∫ t
s
audu
]= EZ
s [Yt − Ys]− EZs
[∫ t
s
audu
]= EZ
s [Mt −Ms]− EZs
[∫ t
s
(au − au) du
]= EZ
s [Es [Mt −Ms]]− EZs
[∫ t
s
EZu [au − au] du
]= 0
Tomas Bjork, 2010 77
Filter dynamics
We now have the filter dynamics
dYt = atdt + dmt
where m is an FZt martingale.
If the innovations hypothesis
FZt = Fν
t
is true, then the martingale representation theoremwould give us an FZ
t adapted process h such that
dmt = htdνt (4)
The innovations hypothesis is not generally correct butFKK have proved that in fact (4) is always true.
Tomas Bjork, 2010 78
Filter dynamics
We thus have the filter dynamics
dYt = atdt + htdνt
and it remains to determine the gain process h.
Proposition: The process h is given by
ht = Ytbt − Ytbt
We give a slighty heuristic proof.
Tomas Bjork, 2010 79
Proof scetch
From Ito we have
d (YtZt) = Ytbtdt + YtdWt + Ztatdt + ZtdMt
usingdYt = atdt + htdνt
anddZt = btdt + dνt
we have
d(YtZt
)= Ytbtdt + Ytdνt + Ztatdt + Zthtdνt + htdt
Formally we also should have
E[d (YtZt)− d
(YtXt
)∣∣∣FZt
]= 0
which gives us(Ytbt − Ytbt − ht
)dt = 0.
Tomas Bjork, 2010 80
The filter equations
For the model
dYt = atdt + dMt
dZT = btdt + dWt
where M and W are independent, we have the FKKnon-linear filter equations
dYt = atdt +
Ytbt − Ytbt
dνt
dνt = dZt − btdt
Remark: It is easy to see that
ht = E[(
Yt − Yt
)(bt − bt
)∣∣∣FZt
]
Tomas Bjork, 2010 81
The general filter equations
For the model
dYt = atdt + dMt
dZT = btdt + σtdWt
where
• The process σ is FZt adapted and positive.
• There is no assumption of independence betweenM and W .
we have the filter
dYt = atdt +[Dt +
1σt
Ytbt − Ytbt
]dνt
dνt =1σt
dZt − btdt
dDt =
d〈M,W 〉tdt
Tomas Bjork, 2010 82
Comment on 〈M,W 〉
This requires semimartingale theory but there are twosimple cases
• If M is Wiener then
d〈M,W 〉t = dMtdWt
with usual multiplication rules.
• If M is a pure jump process then
d〈M,W 〉t = 0.
Tomas Bjork, 2010 83
Filtering a Markov process
Assume that Y is Markov with generator G. Wewant to compute Πt [f(Yt)], for some nice function f .Dynkin’s formula gives us
df(Yt) = (Gf) (Yt)dt + dMt
Assume that the observations are
dZt = b(Yt)dt + dWt
where W is independent of Y .
The filter equations are now
dΠt [f ] = Πt [Gf ] dt + Πt [fb]−Πt [f ] Πt [b] dνt
dνt = dZt −Πt [b] dt
Remark: To obtain dΠt [f ] we need Πt [fb] and Πt [b].This leads generically to an infinite dimensional systemof filter equations.
Tomas Bjork, 2010 84
On the filter dimension
dΠt [f ] = Πt [Gf ] dt + Πt [fb]−Πt [f ] Πt [b] dνt
• To obtain dΠt [f ] we need Πt [fb] and Πt [b].
• Thus we apply the FKK equations to Gf and b.
• This leads to new filter estimates to determine andgenerically to an infinite dimensional system offilter equations.
• The filter equations are really equations for theentire conditional distribution of Y .
• You can only expect the filter to be finitewhen the conditional distribution of Y is finitelyparameterized.
• There are only very few examples of finitedimensional filters.
• The most well known finite filters are the Wonhamand the Kalman filters.
Tomas Bjork, 2010 85
The Wonham filter
Assume that Y is a continuous time Markov chain onthe state space 1, . . . , n with (constant) generatormatrix H. Define the indicator processes by
δi(t) = I Yt = i , i = 1, . . . , n.
Dynkin’s formula gives us
dδit =
∑j
H(j, i)δjdt + dM it , i = 1, . . . , n.
Observations are
dZt = b(Yt)dt + dWt.
Filter equations:
dΠt [δi] =∑
j
H(j, i)Πt [δj] dt+Πt [δib]−Πt [δi] Πt [b] dνt
dνt = dZt −Πt [b] dt
Tomas Bjork, 2010 86
We note that
b(Yt) =∑
i
b(i)δi(t)
so
Πt [δib] = b(i)Πt [δi] ,
Πt [b] =∑
j
b(j)Πt [δj]
We finally have the Wonham filter
dδi =∑
j
H(j, i)δjdt +
b(i)δi − δi
∑j
b(j)δj
dνt,
dνt = dZt −∑
j
b(j)δjdt
Tomas Bjork, 2010 87
The Kalman filter
dYt = aYtdt + cdVt,
dZt = Ytdt + dWt
W and V are independent Wiener
FKK gives us
dΠt [Y ] = aΠt [Y ] dt +
Πt
[Y 2]− (Πt [Y ])2
dνt
dνt = dZt −Πt [Y ] dt
We need Πt
[Y 2], so use Ito to get write
dY 2t =
2aY 2
t + c2
dt + 2cYtdVt
From FKK:
dΠt
[Y 2]
=2aΠt
[Y 2]+ c2
dt
+Πt
[Y 3]−Πt
[Y 2]Πt [Y ]
dνt
Now we need Πt
[Y 3]! Etc!
Tomas Bjork, 2010 88
Define the conditional error variance by
Ht = Πt
[(Πt [Yt]−Πt [Y ])2
]= Πt
[Y 2]− (Πt [Y ])2
Ito gives us
d (Πt [Y ])2 =[2a (Πt [Y ])2 + H2
]dt + 2Πt [Y ]Hdνt
and Ito again
dHt =2aHt + c2 −H2
t
dt
+
Πt
[Y 3]− 3Πt
[Y 2]Πt [Y ] + 2 (Πt [Y ])3
dνt
In this particular case we know (why?) that thedistribution of Y conditional on Z is Gaussian!
Thus we have
Πt
[Y 3]
= 3Πt
[Y 2]Πt [Y ]− 2 (Πt [Y ])3
so H is deterministic (as expected).
Tomas Bjork, 2010 89
The Kalman filter
Model:
dYt = aYtdt + cdVt,
dZt = Ytdt + dWt
Filter:
dΠt [Y ] = aΠt [Y ] dt + Htdνt
Ht = 2aHt + c2 −H2t
dνt = dZt −Πt [Y ] dt
Ht = Πt
[(Πt [Yt]−Πt [Y ])2
]
Remark: Because of the Gaussian structure, theconditional distribution evolves on a two dimensionalsubmanifold. Hence a two dimensional filter.
Tomas Bjork, 2010 90
Optimal investment with stochastic rateof return
• A market model with a stochastic rate of return.
• Optimal portolios under complete information.
• Optimal portfolios under partial information.
Tomas Bjork, 2010 91
An investment problem with stochasticrate of return
Model:
dSt = Stα(Yt)dt + StσdWt
W is scalar and Y is some factor process. We assumethat (S, Y ) is Markov and adapted to the filtration F.
Wealth dynamics
dXt = Xt r + ut [α(Yt)− r] dt + utXtσdWt
Objective:max
uEP [Xγ
T ]
We assume that Y is a Markov process. with generatorA. We will treat several cases.
Tomas Bjork, 2010 92
A. Full information, Y and W
independent.
The HJB equation for F (t, x, y) becomes
Ft+supu
u [α− r]xFx + rxFx +
12u2x2σ2Fxx
+AF = 0
where G operates on the y-variable. Obvious boundarycondition
F (t, x, y) = xγ
First order condition gives us:
u =r − α
xσ2· Fx
Fxx
Plug into HJB:
Ft −(α− r)2
2σ2
F 2x
Fxx+ rxFx +AF = 0
Tomas Bjork, 2010 93
Ansatz:F (T, x, y) = xγG(t, y)
Ft = xγGt, Fx = γxγ−1G, Fxx = γ(γ − 1)xγ−2G
Plug into HJB:
xγGt + xγ (α− r)2
2σ2βG + rγxγG + xγAG = 0
where β = γ/(γ − 1).
Gt(t, y) + H(y)G(t, y) +AG(t, y) = 0,
G(T, y) = 1.
here
H(y) = rγ − [α(y)− r]2
2σ2β
Kolmogorov gives us
H(y) = Et,y
[e
R Tt H(Ys)ds
]
Tomas Bjork, 2010 94
B. Full information, Y and W dependent.
Now we allow for dependence but restrict Y todynamics of the form.
dSt = Stα(Yt)dt + StσdWt
dXt = Xt r + ut [α(Yt)− r] dt + utXtσdWt
dYt = a(Yt)dt + b(Yt)dWt
with the same Wiener process W driving both S andY . The imperfectly correlated case is a bit more messybut can also be handled.
HJB Equation for F (t, x, y):
Ft + aFy +1
2b2Fyy + rxFx
+ supu
u [α− r] xFx +
1
2u
2x
2σ
2Fxx + uxbσFxy
ff= 0.
Tomas Bjork, 2010 95
Ansatz.
HJB:
Ft + aFy +12b2Fyy + rxFx
+supu
u [α− r]xFx +
12u2x2σ2Fxx + uxbσFxy
= 0.
After a lot of thinking we make the Ansatz
F (t, x, y) = xγh1−γ(t, y)
and then it is not hard to see that h satisfies a standardparabolic PDE. (Plug Ansatz into HJB).
See Zariphopoulou for details and more complicatedcases.
Tomas Bjork, 2010 96
C. Partial information.
Model:
dSt = Stα(Yt)dt + StσdWt
Assumption: Y cannot be observed directly.
Reruirement: The control u must be FSt adapted.
We thus have a partially observed system.
Idea: Project the S dynamics onto the smaller FSt
filtration and add filter equations in order to reducethe problem to the completely observable case.
Set Z = lnS and note (why?) that FZt = FS
t . Wehave
dZt =
α(Yt)−12σ2
dt + σdWt
Tomas Bjork, 2010 97
Projecting onto the S-filtration
dZt =
α(Yt)−12σ2
dt + σdWt
From filtering theory we know that
dZt =
Πt [α]− 12σ2
dt + σdνt
where ν is FSt -Wiener and
Πt [α] = E[α(Yt)| FS
t
]We thus have the following S dynamics on the Sfiltration
dSt = StΠt [α] dt + Stσdνt
and wealth dynamics
dXt = Xt r + ut (Πt [α]− r) dt + utXtσdνt
Tomas Bjork, 2010 98
Reformulated problem
We now have the problem
maxu
E [XγT ]
for Z-adapted controls given wealth dynamics
dXt = Xt r + ut (Πt [α]− r) dt + utXtσdνt
If we now can model Y such that the (linear!)observation dynamics for Z will produce a finite filtervector π, then we are back in the completelyobservable case with Y replaced by π.and observationequation
We neeed a finite dimensional filter!
Two choices for Y
• Linear Y dynamics. This will give us the Kalmanfilter. See Brendle
• Y as a Markov chain. This will give us the Wonhamfilter. See Bauerle and Rieder.
Tomas Bjork, 2010 99
Kalman case (Brendle)
Assume that
dSt = YtStdt + StσdWt
with Y dynamics
dYt = aYtdt + cdVt
where W and V are independent. Observations:
dZt =
Yt −12σ2
dt + σdWt
We have a standard Kalman filter.
Wealth dynamics
dXt = Xt
r + ut
(Yt − r
)dt + utXtσdνt
Tomas Bjork, 2010 100
Kalman case, solution.
maxu
E [XγT ]
dXt = Xt
r + ut
(Yt − r
)dt + utXtσdνt
dYt =
Yt −12σ2
dt + Htσdνt
where H is deterministic and given by a Riccattiequation.
We are back in standard completly observable casewith state variables X and Y .
Thus the optimal value function is of the form
F (t, x, y) = xγh1−γ(t, y)
where h solves a parabolic PDE and can be computedexplicitly.
Tomas Bjork, 2010 101
Top Related