Probabilistic Analysis of MFGs III. Master Equations, Games with … · 2016-10-28 ·...


PROBABILISTIC ANALYSIS OF MFGS III. MASTER EQUATIONS, GAMES WITH COMMON NOISE, AND WITH MAJOR AND MINOR PLAYERS

René Carmona

Department of Operations Research & Financial Engineering, PACM

Princeton University

Minerva Lectures, Columbia U October 27, 2016

RECALL THE JOINT CHAIN RULE

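- If $u$ is smooth
- If $d\xi_t = \eta_t\,dt + \gamma_t\,dW_t$
- If $dX_t = b_t\,dt + \sigma_t\,dW_t$ and $\mu_t = \mathbb{P}_{X_t}$

then

$$
\begin{aligned}
u(t,\xi_t,\mu_t) = {} & u(0,\xi_0,\mu_0) + \int_0^t \partial_x u(s,\xi_s,\mu_s)\cdot(\gamma_s\,dW_s) \\
& + \int_0^t \Big(\partial_t u(s,\xi_s,\mu_s) + \partial_x u(s,\xi_s,\mu_s)\cdot\eta_s + \tfrac12\,\mathrm{trace}\big[\partial^2_{xx}u(s,\xi_s,\mu_s)\,\gamma_s\gamma_s^\dagger\big]\Big)\,ds \\
& + \int_0^t \tilde{\mathbb{E}}\big[\partial_\mu u(s,\xi_s,\mu_s)(\tilde X_s)\cdot\tilde b_s\big]\,ds \\
& + \frac12\int_0^t \tilde{\mathbb{E}}\Big[\mathrm{trace}\big(\partial_v[\partial_\mu u(s,\xi_s,\mu_s)](\tilde X_s)\,\tilde\sigma_s\tilde\sigma_s^\dagger\big)\Big]\,ds
\end{aligned}
$$

where the process $(\tilde X_t,\tilde b_t,\tilde\sigma_t)_{0\le t\le T}$ is an independent copy of the process $(X_t,b_t,\sigma_t)_{0\le t\le T}$, on a different probability space $(\tilde\Omega,\tilde{\mathcal F},\tilde{\mathbb P})$.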
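A quick sanity check of the chain rule, worked out on a test function that is an assumption of this note and not part of the lecture: take $d = 1$ and $u(t,x,\mu) = (x - m(\mu))^2$ with $m(\mu) = \int y\,d\mu(y)$. The measure derivative of $m$ is constant equal to 1, so every term can be computed by hand and compared with the classical Itô formula.

```latex
% Assumed test function (d = 1): u(t,x,\mu) = (x - m(\mu))^2, m(\mu) = \int y \, d\mu(y)
\begin{aligned}
\partial_t u &= 0, \qquad
\partial_x u(t,x,\mu) = 2\,(x - m(\mu)), \qquad
\partial^2_{xx} u = 2, \\
\partial_\mu u(t,x,\mu)(v) &= -2\,(x - m(\mu)), \qquad
\partial_v \partial_\mu u(t,x,\mu)(v) = 0 .
\end{aligned}

% Plugging into the chain rule, with m_t = m(\mu_t) = \mathbb{E}[X_t]:
d\,u(t,\xi_t,\mu_t)
  = 2(\xi_t - m_t)\,\gamma_t\,dW_t
  + \big( 2(\xi_t - m_t)\,\eta_t + \gamma_t^2 \big)\,dt
  - 2(\xi_t - m_t)\,\tilde{\mathbb{E}}[\tilde b_t]\,dt .

% Direct check: dm_t = \mathbb{E}[b_t]\,dt, so classical It\^o applied to (\xi_t - m_t)^2 gives
d(\xi_t - m_t)^2
  = 2(\xi_t - m_t)\,\big(\eta_t\,dt + \gamma_t\,dW_t - \mathbb{E}[b_t]\,dt\big) + \gamma_t^2\,dt ,
% which agrees term by term since \tilde{\mathbb{E}}[\tilde b_t] = \mathbb{E}[b_t].
```

The second-order measure term vanishes here because $\partial_\mu u$ does not depend on $v$; a test function that is nonlinear in $v$ would be needed to exercise it.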

DERIVING THE MASTER EQUATION

- If we assume a form of Dynamic Programming Principle
- If $(t,x,\mu)\mapsto U(t,x,\mu) = V^\mu(t,x)$ is the master field
- We expect

$$
\Big(U(t,X_t,\mu_t) + \int_0^t f\big(s,X_s,\mu_s,\hat\alpha(s,X_s,\mu_s,Y_s)\big)\,ds\Big)_{0\le t\le T}
$$

  to be a martingale whenever $(X_t,Y_t,Z_t)_{0\le t\le T}$ is the solution of the FBSDE characterizing the optimal path under $(\mu_t)_{0\le t\le T}$.
- So compute its Itô differential and set the drift to 0

AN EXAMPLE OF DERIVATION

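$$dX_t = b(t,X_t,\mu_t,\alpha_t)\,dt + dW_t$$

$$H(t,x,\mu,y,\alpha) = b(t,x,\mu,\alpha)\cdot y + f(t,x,\mu,\alpha)$$

$$\hat\alpha(t,x,\mu,y) = \arg\inf_\alpha H(t,x,\mu,y,\alpha)$$

Itô's formula with $\mu_t = \mathbb{P}_{X_t}$

(set $\hat\alpha_t = \hat\alpha(t,X_t,\mu_t,\partial_x U(t,X_t,\mu_t))$ and $b_t = b(t,X_t,\mu_t,\hat\alpha_t)$)

$$
\begin{aligned}
d\Big[U(t,X_t,\mu_t) + \int_0^t f(s,X_s,\mu_s,\hat\alpha_s)\,ds\Big]
= {} & \Big(\partial_t U(t,X_t,\mu_t) + b_t\cdot\partial_x U(t,X_t,\mu_t) + \tfrac12\,\mathrm{trace}\big[\partial^2_{xx}U(t,X_t,\mu_t)\big] + f(t,X_t,\mu_t,\hat\alpha_t)\Big)\,dt \\
& + \tilde{\mathbb{E}}\Big[\tilde b_t\cdot\partial_\mu U(t,X_t,\mu_t)(\tilde X_t) + \tfrac12\,\mathrm{trace}\big(\partial_v\partial_\mu U(t,X_t,\mu_t)(\tilde X_t)\big)\Big]\,dt \\
& + \partial_x U(t,X_t,\mu_t)\,dW_t
\end{aligned}
$$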

THE ACTUAL MASTER EQUATION

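$$
\begin{aligned}
& \partial_t U(t,x,\mu) + b\big(t,x,\mu,\hat\alpha(t,x,\mu,\partial_x U(t,x,\mu))\big)\cdot\partial_x U(t,x,\mu) \\
& \quad + \tfrac12\,\mathrm{trace}\big[\partial^2_{xx}U(t,x,\mu)\big] + f\big(t,x,\mu,\hat\alpha(t,x,\mu,\partial_x U(t,x,\mu))\big) \\
& \quad + \int_{\mathbb{R}^d}\Big[b\big(t,x',\mu,\hat\alpha(t,x',\mu,\partial_x U(t,x',\mu))\big)\cdot\partial_\mu U(t,x,\mu)(x') + \tfrac12\,\mathrm{trace}\big(\partial_v\partial_\mu U(t,x,\mu)(x')\big)\Big]\,d\mu(x') = 0,
\end{aligned}
$$

for $(t,x,\mu)\in[0,T]\times\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)$, with the terminal condition $U(T,x,\mu) = g(x,\mu)$.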

MEAN FIELD GAMES WITH A COMMON NOISE

Starting with a finite player game, i.e.

Simultaneous minimization of

$$
J^i(\alpha) = \mathbb{E}\Big[\int_0^T f(t,X^i_t,\mu^N_t,\alpha^i_t)\,dt + g(X^i_T,\mu^N_T)\Big], \qquad i = 1,\dots,N
$$

under constraints (dynamics of players' private states)

$$
dX^i_t = b(t,X^i_t,\mu^N_t,\alpha^i_t)\,dt + \sigma(t,X^i_t,\mu^N_t,\alpha^i_t)\,dW^i_t + \sigma^0(t,X^i_t,\mu^N_t,\alpha^i_t)\,dW^0_t
$$

for independent Wiener processes $W^k$, $k = 0,1,\dots,N$.
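The N-player dynamics above can be sketched with an Euler-Maruyama scheme. The sketch below is only illustrative: the drift $b = \kappa(\bar x - x)$ (mean reversion toward the empirical mean, with no control), the constant volatilities `sigma` and `sigma0`, and the function name `simulate_players` are all assumptions made here, not choices from the lecture. What it does show is how the common increment $dW^0_t$ is drawn once per step and shared by every player.

```python
import math
import random


def simulate_players(n_players=50, n_steps=100, T=1.0,
                     kappa=1.0, sigma=0.3, sigma0=0.2, x0=0.0, seed=0):
    """Euler-Maruyama paths for a toy N-player system with common noise.

    Assumed (hypothetical) coefficients: b = kappa * (mean - x),
    constant sigma (idiosyncratic) and sigma0 (common).
    Returns a list of n_steps + 1 snapshots, each a list of n_players states.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    x = [x0] * n_players
    path = [list(x)]
    for _ in range(n_steps):
        mean = sum(x) / n_players               # first moment of the empirical measure
        dw0 = sqrt_dt * rng.gauss(0.0, 1.0)     # common increment, shared by all players
        x = [xi
             + kappa * (mean - xi) * dt
             + sigma * sqrt_dt * rng.gauss(0.0, 1.0)   # idiosyncratic increment
             + sigma0 * dw0
             for xi in x]
        path.append(list(x))
    return path


if __name__ == "__main__":
    final = simulate_players()[-1]
    print("empirical mean at T:", sum(final) / len(final))
```

Setting `sigma=0.0` leaves only the common noise, in which case players started from the same point stay identical; that is a convenient check that the shared increment is wired correctly.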

LARGE GAME ASYMPTOTICS (CONT.)

Conditional Law of Large Numbers

- If we consider exchangeable equilibria $(\alpha^1_t,\dots,\alpha^N_t)$, then
- By de Finetti LLN

$$\lim_{N\to\infty}\mu^N_t = \mathbb{P}_{X^1_t\mid\mathcal{F}^0_t}$$

- Dynamics of player 1 (or any other player) becomes

$$
dX^1_t = b(t,X^1_t,\mu_t,\alpha^1_t)\,dt + \sigma(t,X^1_t,\mu_t,\alpha^1_t)\,dW_t + \sigma^0(t,X^1_t,\mu_t,\alpha^1_t)\,dW^0_t
$$

  with $\mu_t = \mathbb{P}_{X^1_t\mid\mathcal{F}^0_t}$.
- Cost to player 1 (or any other player) becomes

$$
\mathbb{E}\Big[\int_0^T f(t,X^1_t,\mu_t,\alpha^1_t)\,dt + g(X^1_T,\mu_T)\Big]
$$
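The conditional law of large numbers can be seen numerically in the simplest possible case. The sketch below assumes (this is a toy model, not one from the lecture) that $X^i_T = \sigma W^i_T + \sigma^0 W^0_T$ with all players started at 0, so that conditionally on the common noise the law of $X^i_T$ is Gaussian with mean $\sigma^0 W^0_T$. The function name and parameter values are hypothetical.

```python
import math
import random


def empirical_vs_conditional_mean(n_players, T=1.0, sigma=1.0, sigma0=0.5, seed=1):
    """Compare the empirical mean of N players with the conditional mean given W^0.

    Toy model (assumption): X^i_T = sigma * W^i_T + sigma0 * W^0_T, X^i_0 = 0.
    Returns (empirical mean over players, sigma0 * W^0_T).
    """
    rng = random.Random(seed)
    w0_T = math.sqrt(T) * rng.gauss(0.0, 1.0)            # one draw of the common noise
    states = [sigma * math.sqrt(T) * rng.gauss(0.0, 1.0) + sigma0 * w0_T
              for _ in range(n_players)]
    empirical_mean = sum(states) / n_players
    conditional_mean = sigma0 * w0_T                      # mean of P(X_T | W^0)
    return empirical_mean, conditional_mean


if __name__ == "__main__":
    for n in (10, 100, 10000):
        emp, cond = empirical_vs_conditional_mean(n)
        print(n, abs(emp - cond))
```

The empirical mean does not converge to the unconditional mean (which is 0 here) but to a random limit driven by $W^0$, which is the point of the slide.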

MFG WITH COMMON NOISE PARADIGM

1. For each fixed measure-valued $(\mathcal{F}^0_t)$-adapted process $\mu = (\mu_t)$ in $\mathcal{P}(\mathbb{R})$, solve the standard stochastic control problem

$$
\hat\alpha = \arg\inf_\alpha \mathbb{E}\Big[\int_0^T f(t,X_t,\mu_t,\alpha_t)\,dt + g(X_T,\mu_T)\Big]
$$

   subject to

$$
dX_t = b(t,X_t,\mu_t,\alpha_t)\,dt + \sigma(t,X_t,\mu_t,\alpha_t)\,dW_t + \sigma^0(t,X_t,\mu_t,\alpha_t)\,dW^0_t;
$$

2. Fixed Point Problem: determine $\mu = (\mu_t)$ so that

$$\forall t\in[0,T], \qquad \mathbb{P}_{X_t\mid\mathcal{F}^0_t} = \mu_t \quad \text{a.s.}$$

Once this is done one expects that, if $\hat\alpha_t = \phi^*(t,X_t)$, then for the N player game the strategies

$$\alpha^{j*}_t = \phi^*(t,X^j_t), \qquad j = 1,\dots,N$$

form an approximate Nash equilibrium for the game with N players.

PONTRYAGIN STOCHASTIC MAXIMUM PRINCIPLE

Freeze $\mu = (\mu_t)_{0\le t\le T}$, write (reduced) Hamiltonian

$$H(t,x,\mu,y,\alpha) = b(t,x,\mu,\alpha)\cdot y + f(t,x,\mu,\alpha)$$

Standard definition

Given an admissible control $\alpha = (\alpha_t)_{0\le t\le T}$ and the corresponding controlled state process $X^\alpha = (X^\alpha_t)_{0\le t\le T}$, any couple $(Y_t,Z_t)_{0\le t\le T}$ satisfying:

$$
\begin{cases}
dY_t = -\partial_x H(t,X^\alpha_t,\mu_t,Y_t,\alpha_t)\,dt + Z_t\,dW_t + Z^0_t\,dW^0_t \\
Y_T = \partial_x g(X^\alpha_T,\mu_T)
\end{cases}
$$

is called a set of adjoint processes

STOCHASTIC CONTROL STEP SOLUTION

Determine

$$\hat\alpha(t,x,\mu,y) = \arg\inf_\alpha H(t,x,\mu,y,\alpha)$$

Inject in FORWARD and BACKWARD dynamics and SOLVE

$$
\begin{cases}
dX_t = b\big(t,X_t,\mu_t,\hat\alpha(t,X_t,\mu_t,Y_t)\big)\,dt + \sigma(t,X_t)\,dW_t + \sigma^0(t,X_t)\,dW^0_t, \\
dY_t = -\partial_x H^{\mu_t}\big(t,X_t,Y_t,\hat\alpha(t,X_t,\mu_t,Y_t)\big)\,dt + Z_t\,dW_t + Z^0_t\,dW^0_t
\end{cases}
$$

with $X_0 = x_0$ and $Y_T = \partial_x g(X_T,\mu_T)$

Standard FBSDE (for each fixed $t\mapsto\mu_t$)
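To make the minimization step concrete, here is a small worked case under assumptions that are not from the lecture: the drift is the control itself, $b(t,x,\mu,\alpha) = \alpha$, the running cost is $f(t,x,\mu,\alpha) = \tfrac12|\alpha|^2 + c(t,x,\mu)$ for a hypothetical smooth function $c$, and the volatilities $\sigma$, $\sigma^0$ are constant.

```latex
% Assumptions: b = \alpha, f = \tfrac12 |\alpha|^2 + c(t,x,\mu), constant \sigma and \sigma^0
H(t,x,\mu,y,\alpha) = \alpha \cdot y + \tfrac12 |\alpha|^2 + c(t,x,\mu)

% First-order condition in \alpha (H is strictly convex in \alpha):
y + \hat\alpha = 0
\quad\Longrightarrow\quad
\hat\alpha(t,x,\mu,y) = -y

% Since b does not depend on x, \partial_x H = \partial_x c, and the FBSDE reads
\begin{cases}
dX_t = -Y_t \, dt + \sigma \, dW_t + \sigma^0 \, dW^0_t, & X_0 = x_0, \\
dY_t = -\partial_x c(t, X_t, \mu_t) \, dt + Z_t \, dW_t + Z^0_t \, dW^0_t, & Y_T = \partial_x g(X_T, \mu_T).
\end{cases}
```

In this special case the coupling between the forward and backward equations goes only through the drift $-Y_t$ and through $\partial_x c$, which is what makes linear-quadratic choices of $c$ and $g$ tractable.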

FIXED POINT STEP

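Solve the fixed point problem

$$
(\mu_t)_{0\le t\le T} \longrightarrow (X_t)_{0\le t\le T} \longrightarrow (\mathbb{P}_{X_t\mid\mathcal{F}^0_t})_{0\le t\le T}
$$

Note: if we enforce $\mu_t = \mathbb{P}_{X_t\mid\mathcal{F}^0_t}$ for all $0\le t\le T$ in FBSDE we have

$$
\begin{cases}
dX_t = b\big(t,X_t,\mathbb{P}_{X_t\mid\mathcal{F}^0_t},\hat\alpha^{\mathbb{P}_{X_t\mid\mathcal{F}^0_t}}(t,X_t,Y_t)\big)\,dt + \sigma(t,X_t)\,dW_t + \sigma^0(t,X_t)\,dW^0_t, \\
dY_t = -\partial_x H^{\mathbb{P}_{X_t\mid\mathcal{F}^0_t}}\big(t,X_t,Y_t,\hat\alpha^{\mathbb{P}_{X_t\mid\mathcal{F}^0_t}}(t,X_t,Y_t)\big)\,dt + Z_t\,dW_t + Z^0_t\,dW^0_t,
\end{cases}
$$

with $X_0 = x_0$ and $Y_T = \partial_x g(X_T,\mathbb{P}_{X_T\mid\mathcal{F}^0_T})$

FBSDE of Conditional McKean-Vlasov type!

Very difficult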

SEVERAL APPROACHES

- Relaxed Controls (R.C. - Delarue - Lacker)
- FBSDEs of Conditional McKean-Vlasov Type (R.C. - Delarue)
  - Serious measurability difficulties
  - e.g. may need extra martingale terms (and jumps) in BSDE if filtrations not Brownian
- Develop Theory
  - SDEs of Conditional McKean-Vlasov Type (baby steps in R.C. - Zhu)
  - Conditional Propagation of Chaos (baby steps in R.C. - Zhu)
- Introduce notions of strong and weak solutions to MFG problem
  - Extend Yamada-Watanabe theory to MFG set-up
  - Existence for a finite common noise (Schauder Theorem)
  - Weak Solutions by limiting arguments
  - Uniqueness via Monotonicity or Strong Convexity
  - Strong Solutions via extension of Yamada-Watanabe

MFGs with Major and Minor Players

R.C. - G. Zhu, R.C. - P. Wang

MFG WITH MAJOR AND MINOR PLAYERS. I

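State equations

$$
\begin{cases}
dX^0_t = b_0(t,X^0_t,\mu_t,\alpha^0_t)\,dt + \sigma_0(t,X^0_t,\mu_t,\alpha^0_t)\,dW^0_t \\
dX_t = b(t,X_t,\mu_t,X^0_t,\alpha_t,\alpha^0_t)\,dt + \sigma(t,X_t,\mu_t,X^0_t,\alpha_t,\alpha^0_t)\,dW_t,
\end{cases}
$$

Costs

$$
\begin{cases}
J^0(\alpha^0,\alpha) = \mathbb{E}\Big[\int_0^T f_0(t,X^0_t,\mu_t,\alpha^0_t)\,dt + g_0(X^0_T,\mu_T)\Big] \\
J(\alpha^0,\alpha) = \mathbb{E}\Big[\int_0^T f(t,X_t,\mu_t,X^0_t,\alpha_t,\alpha^0_t)\,dt + g(X_T,\mu_T)\Big],
\end{cases}
$$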

OPEN LOOP VERSION OF THE MFG PROBLEM

The controls used by the major player and the representative minor player are of the form:

$$
\alpha^0_t = \phi^0(t,W^0_{[0,T]}), \quad\text{and}\quad \alpha_t = \phi(t,W^0_{[0,T]},W_{[0,T]}), \tag{1}
$$

for deterministic progressively measurable functions

$$\phi^0 : [0,T]\times C([0,T];\mathbb{R}^{d_0}) \to A_0$$

and

$$\phi : [0,T]\times C([0,T];\mathbb{R}^d)\times C([0,T];\mathbb{R}^d) \to A$$

THE MAJOR PLAYER BEST RESPONSE

Assume the representative minor player uses the open loop control given by $\phi : (t,w^0,w)\mapsto\phi(t,w^0,w)$.

Major player minimizes

$$
J^{\phi,0}(\alpha^0) = \mathbb{E}\Big[\int_0^T f_0(t,X^0_t,\mu_t,\alpha^0_t)\,dt + g_0(X^0_T,\mu_T)\Big]
$$

under the dynamical constraints:

$$
\begin{cases}
dX^0_t = b_0(t,X^0_t,\mu_t,\alpha^0_t)\,dt + \sigma_0(t,X^0_t,\mu_t,\alpha^0_t)\,dW^0_t \\
dX_t = b\big(t,X_t,\mu_t,X^0_t,\phi(t,W^0_{[0,T]},W_{[0,T]}),\alpha^0_t\big)\,dt \\
\qquad\quad + \sigma\big(t,X_t,\mu_t,X^0_t,\phi(t,W^0_{[0,T]},W_{[0,T]}),\alpha^0_t\big)\,dW_t,
\end{cases}
$$

where $\mu_t = \mathcal{L}(X_t\mid W^0_{[0,t]})$ is the conditional distribution of $X_t$ given $W^0_{[0,t]}$.

Major player problem as the search for:

$$
\phi^{0,*}(\phi) = \arg\inf_{\alpha^0_t = \phi^0(t,W^0_{[0,T]})} J^{\phi,0}(\alpha^0) \tag{2}
$$

Optimal control of the conditional McKean-Vlasov type!

THE REP. MINOR PLAYER BEST RESPONSE

System against which best response is sought comprises

- a major player
- a field of minor players different from the representative minor player
- Major player uses strategy $\alpha^0_t = \phi^0(t,W^0_{[0,T]})$
- Representative of the field of minor players uses strategy $\alpha_t = \phi(t,W^0_{[0,T]},W_{[0,T]})$.

State dynamics

$$
\begin{cases}
dX^0_t = b_0\big(t,X^0_t,\mu_t,\phi^0(t,W^0_{[0,T]})\big)\,dt + \sigma_0\big(t,X^0_t,\mu_t,\phi^0(t,W^0_{[0,T]})\big)\,dW^0_t \\
dX_t = b\big(t,X_t,\mu_t,X^0_t,\phi(t,W^0_{[0,T]},W_{[0,T]}),\phi^0(t,W^0_{[0,T]})\big)\,dt \\
\qquad\quad + \sigma\big(t,X_t,\mu_t,X^0_t,\phi(t,W^0_{[0,T]},W_{[0,T]}),\phi^0(t,W^0_{[0,T]})\big)\,dW_t,
\end{cases}
$$

where $\mu_t = \mathcal{L}(X_t\mid W^0_{[0,t]})$ is the conditional distribution of $X_t$ given $W^0_{[0,t]}$.

Given $\phi^0$ and $\phi$, SDE of (conditional) McKean-Vlasov type

THE REP. MINOR PLAYER BEST RESPONSE (CONT.)

Representative minor player chooses a strategy $\bar\alpha_t = \bar\phi(t,W^0_{[0,T]},\bar W_{[0,T]})$ to minimize

$$
J^{\phi^0,\phi}(\bar\alpha) = \mathbb{E}\Big[\int_0^T f\big(t,\bar X_t,\mu_t,X^0_t,\bar\alpha_t,\phi^0(t,W^0_{[0,T]})\big)\,dt + g(\bar X_T,\mu_T)\Big],
$$

where the dynamics of the virtual state $\bar X_t$ are given by:

$$
\begin{aligned}
d\bar X_t = {} & b\big(t,\bar X_t,\mu_t,X^0_t,\bar\phi(t,W^0_{[0,T]},\bar W_{[0,T]}),\phi^0(t,W^0_{[0,T]})\big)\,dt \\
& + \sigma\big(t,\bar X_t,\mu_t,X^0_t,\bar\phi(t,W^0_{[0,T]},\bar W_{[0,T]}),\phi^0(t,W^0_{[0,T]})\big)\,d\bar W_t,
\end{aligned}
$$

for a Wiener process $\bar W = (\bar W_t)_{0\le t\le T}$ independent of the other Wiener processes.

- Optimization problem NOT of McKean-Vlasov type.
- Classical optimal control problem with random coefficients

$$
\phi^*(\phi^0,\phi) = \arg\inf_{\bar\alpha_t = \bar\phi(t,W^0_{[0,T]},\bar W_{[0,T]})} J^{\phi^0,\phi}(\bar\alpha)
$$

NASH EQUILIBRIUM

Search for Best Response Map Fixed Point

$$
(\phi^0,\phi) = \big(\phi^{0,*}(\phi),\ \phi^*(\phi^0,\phi)\big).
$$

CLOSED LOOP VERSIONS OF THE MFG PROBLEM

- Closed Loop Version

  Controls of the major player and the representative minor player are of the form:

$$
\alpha^0_t = \phi^0(t,X^0_{[0,T]},\mu_t), \quad\text{and}\quad \alpha_t = \phi(t,X_{[0,T]},\mu_t,X^0_{[0,T]}),
$$

  for deterministic progressively measurable functions $\phi^0 : [0,T]\times C([0,T];\mathbb{R}^{d_0}) \to A_0$ and $\phi : [0,T]\times C([0,T];\mathbb{R}^d)\times C([0,T];\mathbb{R}^d) \to A$.

- Markovian Version

  Controls of the major player and the representative minor player are of the form:

$$
\alpha^0_t = \phi^0(t,X^0_t,\mu_t), \quad\text{and}\quad \alpha_t = \phi(t,X_t,\mu_t,X^0_t),
$$

  for deterministic feedback functions $\phi^0 : [0,T]\times\mathbb{R}^{d_0}\times\mathcal{P}_2(\mathbb{R}^d) \to A_0$ and $\phi : [0,T]\times\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)\times\mathbb{R}^{d_0} \to A$.

LINEAR QUADRATIC MODELS

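State dynamics

$$
\begin{cases}
dX^0_t = (L_0X^0_t + B_0\alpha^0_t + F_0\bar X_t)\,dt + D_0\,dW^0_t \\
dX_t = (LX_t + B\alpha_t + F\bar X_t + GX^0_t)\,dt + D\,dW_t
\end{cases}
$$

where $\bar X_t = \mathbb{E}[X_t\mid\mathcal{F}^0_t]$, and $(\mathcal{F}^0_t)_{t\ge 0}$ is the filtration generated by $W^0$

Costs

$$
\begin{aligned}
J^0(\alpha^0,\alpha) &= \mathbb{E}\Big[\int_0^T \big[(X^0_t - H_0\bar X_t - \eta_0)^\dagger Q_0 (X^0_t - H_0\bar X_t - \eta_0) + \alpha^{0\dagger}_t R_0\alpha^0_t\big]\,dt\Big] \\
J(\alpha^0,\alpha) &= \mathbb{E}\Big[\int_0^T \big[(X_t - HX^0_t - H_1\bar X_t - \eta)^\dagger Q (X_t - HX^0_t - H_1\bar X_t - \eta) + \alpha^\dagger_t R\alpha_t\big]\,dt\Big]
\end{aligned}
$$

in which $Q$, $Q_0$, $R$, $R_0$ are symmetric matrices, and $R$, $R_0$ are assumed to be positive definite.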

EQUILIBRIA

- Open Loop Version
  - Optimization problems + fixed point $\Longrightarrow$ large FBSDE
  - affine FBSDE solved by a large matrix Riccati equation
- Closed Loop Version
  - Fixed point step more difficult
  - Search limited to controls of the form

$$
\begin{aligned}
\alpha^0_t &= \phi^0(t,X^0_t,\bar X_t) = \phi^0_0(t) + \phi^0_1(t)X^0_t + \phi^0_2(t)\bar X_t \\
\alpha_t &= \phi(t,X_t,X^0_t,\bar X_t) = \phi_0(t) + \phi_1(t)X_t + \phi_2(t)X^0_t + \phi_3(t)\bar X_t
\end{aligned}
$$

  - Optimization problems + fixed point $\Longrightarrow$ large FBSDE
  - affine FBSDE solved by a large matrix Riccati equation

Solutions are not the same!
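The role of the Riccati equation can be illustrated on a much smaller problem than the one on the slide. The sketch below is a scalar, single-agent stand-in and rests on assumptions made here: dynamics $dX_t = (aX_t + b\alpha_t)\,dt + \sigma\,dW_t$, cost $\mathbb{E}[\int_0^T (qX_t^2 + r\alpha_t^2)\,dt + gX_T^2]$, and a quadratic ansatz $p(t)x^2 + \text{const}$ for the value function, which gives $p'(t) = -2a\,p + (b^2/r)\,p^2 - q$, $p(T) = g$ and the feedback $\hat\alpha = -(b/r)\,p(t)\,x$. It is not the large matrix system of the major-minor game, and the function names are hypothetical.

```python
import math


def solve_scalar_riccati(a, b, q, r, g, T=1.0, n_steps=10000):
    """Integrate p' = -2*a*p + (b**2/r)*p**2 - q backward from p(T) = g.

    Explicit Euler on a uniform grid; returns [p(t_0), ..., p(t_n)].
    """
    dt = T / n_steps
    p = [0.0] * (n_steps + 1)
    p[n_steps] = g
    for k in range(n_steps, 0, -1):
        minus_p_prime = 2.0 * a * p[k] - (b * b / r) * p[k] ** 2 + q
        p[k - 1] = p[k] + dt * minus_p_prime      # step backward in time
    return p


def feedback_gain(p_t, b, r):
    """Gain k(t) in the linear feedback alpha = k(t) * x."""
    return -b * p_t / r


if __name__ == "__main__":
    p = solve_scalar_riccati(a=0.0, b=1.0, q=1.0, r=1.0, g=0.0)
    # For these parameters the exact solution is p(t) = tanh(T - t).
    print(p[0], math.tanh(1.0), feedback_gain(p[0], 1.0, 1.0))
```

In the matrix case the same backward integration applies with $p$ replaced by a matrix-valued function, though solvability on the whole interval is then no longer automatic.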

APPLICATION TO FLOCKING

Inspired by the Nourian-Caines-Malhamé generalization of the basic Cucker-Smale model.

- $\beta = 0$, so state reduces to velocity as cost depends only on velocity
- $V^{0,N}_t$ velocity of the (major player) leader at time $t$
- $V^{i,N}_t$ the velocity of the $i$-th follower, $i = 1,\dots,N$, at time $t$
- Linear dynamics

$$
\begin{cases}
dV^{0,N}_t = \alpha^0_t\,dt + \Sigma_0\,dW^0_t \\
dV^{i,N}_t = \alpha^i_t\,dt + \Sigma\,dW^i_t
\end{cases}
$$

- Minimization of quadratic costs

$$
J^0 = \mathbb{E}\Big[\int_0^T \big(\lambda_0\|V^{0,N}_t - \nu_t\|^2 + \lambda_1\|V^{0,N}_t - \bar V^N_t\|^2 + (1-\lambda_0-\lambda_1)\|\alpha^0_t\|^2\big)\,dt\Big]
$$

  - $\bar V^N_t := \frac1N\sum_{i=1}^N V^{i,N}_t$ the average velocity of the followers,
  - deterministic function $[0,T]\ni t\to\nu_t\in\mathbb{R}^d$ (leader's free will)
  - $\lambda_0$ and $\lambda_1$ are positive real numbers satisfying $\lambda_0+\lambda_1\le 1$

$$
J^i = \mathbb{E}\Big[\int_0^T \big(l_0\|V^{i,N}_t - V^{0,N}_t\|^2 + l_1\|V^{i,N}_t - \bar V^N_t\|^2 + (1-l_0-l_1)\|\alpha^i_t\|^2\big)\,dt\Big]
$$

  with $l_0\ge 0$ and $l_1\ge 0$, $l_0+l_1\le 1$.
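The linear velocity dynamics can be simulated directly once controls are chosen. The sketch below does not compute the equilibrium controls of the game. It plugs in hand-picked linear feedbacks (an assumption made here purely for illustration) that pull the leader toward $\nu_t$ and the followers' average, and pull each follower toward the leader and the average. The gains `c0`, `c1`, `k0`, `k1`, the isotropic noise level, and the function names are hypothetical; the reference velocity is the circular $\nu(t)$ used on the sample-trajectory slides.

```python
import math
import random


def nu(t):
    """Leader's reference velocity nu(t) = (-2*pi*sin(2*pi*t), 2*pi*cos(2*pi*t))."""
    return (-2.0 * math.pi * math.sin(2.0 * math.pi * t),
            2.0 * math.pi * math.cos(2.0 * math.pi * t))


def simulate_flock(n_followers=5, n_steps=200, T=1.0, noise=0.1,
                   c0=4.0, c1=1.0, k0=4.0, k1=1.0, seed=0):
    """Euler-Maruyama for leader/follower velocities in R^2.

    Heuristic (non-equilibrium) feedback controls, assumed for illustration:
      leader:   alpha0 = -c0*(V0 - nu_t) - c1*(V0 - Vbar)
      follower: alpha_i = -k0*(Vi - V0)  - k1*(Vi - Vbar)
    Returns (leader velocity, list of follower velocities) at time T.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    leader = [0.0, 0.0]
    followers = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
                 for _ in range(n_followers)]
    for step in range(n_steps):
        target = nu(step * dt)
        vbar = [sum(v[d] for v in followers) / n_followers for d in range(2)]
        new_leader = [leader[d]
                      + (-c0 * (leader[d] - target[d]) - c1 * (leader[d] - vbar[d])) * dt
                      + noise * sqrt_dt * rng.gauss(0.0, 1.0)
                      for d in range(2)]
        new_followers = [[v[d]
                          + (-k0 * (v[d] - leader[d]) - k1 * (v[d] - vbar[d])) * dt
                          + noise * sqrt_dt * rng.gauss(0.0, 1.0)
                          for d in range(2)]
                         for v in followers]
        leader, followers = new_leader, new_followers
    return leader, followers


if __name__ == "__main__":
    leader, followers = simulate_flock()
    print("leader:", leader)
    print("followers:", followers)
```

With `noise=0.0` the difference between any two followers contracts by the factor $1 - (k_0 + k_1)\,dt$ at every step, so the followers align regardless of what the leader does; replacing the heuristic gains by the equilibrium feedbacks would require solving the game first.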

SAMPLE TRAJECTORIES IN EQUILIBRIUM

ν(t) := [−2π sin(2πt),2π cos(2πt)]

[Figure: two panels of sample paths in the (x, y) plane. Panel 1: k0 = 0.80, k1 = 0.19, l0 = 0.19, l1 = 0.80. Panel 2: k0 = 0.80, k1 = 0.19, l0 = 0.80, l1 = 0.19.]

FIGURE: Optimal velocity and trajectory of follower and leaders

SAMPLE TRAJECTORIES IN EQUILIBRIUM

ν(t) := [−2π sin(2πt),2π cos(2πt)]

[Figure: two panels of sample paths in the (x, y) plane. Panel 1: k0 = 0.19, k1 = 0.80, l0 = 0.19, l1 = 0.80. Panel 2: k0 = 0.19, k1 = 0.80, l0 = 0.80, l1 = 0.19.]

FIGURE: Optimal velocity and trajectory of follower and leaders

CONDITIONAL PROPAGATION OF CHAOS

[Figure: five panels, for N = 5, N = 10, N = 20, N = 50 and N = 100, each with a scale running from −1 to 1.]

FIGURE: Conditional correlation of 5 followers' velocities
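The decay of conditional correlations with N can be reproduced in a one-step toy model. Everything in the sketch below is an assumption made for illustration and is unrelated to the actual flocking computation behind the figure: conditionally on the common noise (which only adds a constant shift and therefore drops out of the correlation), player $i$ has state $X^i = \varepsilon_i + a\,\bar\varepsilon$, with i.i.d. standard Gaussian $\varepsilon_i$ and $\bar\varepsilon$ their average. A direct computation gives a correlation of $(2a + a^2)/(N + 2a + a^2)$ between two players, which vanishes as N grows.

```python
import math
import random


def conditional_correlation(n_players, a=1.0, n_samples=4000, seed=0):
    """Monte Carlo correlation of players 1 and 2 given the common noise.

    Toy model (assumption): X^i = eps_i + a * mean(eps), eps_i i.i.d. N(0, 1).
    Requires n_players >= 2.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(n_players)]
        mean_eps = sum(eps) / n_players       # mean-field interaction term
        xs.append(eps[0] + a * mean_eps)
        ys.append(eps[1] + a * mean_eps)
    mx = sum(xs) / n_samples
    my = sum(ys) / n_samples
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n_samples
    var_x = sum((x - mx) ** 2 for x in xs) / n_samples
    var_y = sum((y - my) ** 2 for y in ys) / n_samples
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    for n in (5, 10, 20, 50, 100):
        print(n, round(conditional_correlation(n), 3), round(3.0 / (n + 3.0), 3))
```

For $a = 1$ the theoretical value is $3/(N+3)$, printed next to the estimate, so the two columns should agree up to Monte Carlo error.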