Game Theory: Sequential Bargaining and Repeated Games
Univ. Prof. Dr. M.C.W. Janssen, University of Vienna
Winter semester 2010-11, Week 46 (November 14-15)


Sequential Bargaining

• The ultimatum game is a sequential bargaining game with one round. Its SPE is already known: the proposer keeps the whole pie and the responder accepts.
• Consider then a sequential bargaining game with two rounds and alternating offers, in which players discount future pay-offs with factor δ. The SPE pay-offs are (1-δ, δ).
- Player 2 can propose to keep everything in the last round, and this will be accepted. Thus, by refusing in the first round he can guarantee himself δ.
- Player 1 must therefore offer him at least δ in the first round if 2 is to accept, so player 1 can get at most 1-δ.
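The backward-induction argument extends to any finite number of rounds. The following is a minimal sketch, assuming a pie of size 1, a common discount factor and a responder who accepts when indifferent; the function name is hypothetical and not part of the slides.

```python
def first_proposer_share(rounds: int, delta: float) -> float:
    """SPE share of a pie of size 1 kept by whoever proposes in round 1."""
    share = 1.0  # final round is an ultimatum game: the proposer keeps everything
    for _ in range(rounds - 1):
        # One round earlier, the responder accepts any offer worth at least
        # delta times what she would keep as proposer in the next round.
        share = 1.0 - delta * share
    return share


if __name__ == "__main__":
    delta = 0.9
    for rounds in (1, 2, 3, 50):
        print(rounds, round(first_proposer_share(rounds, delta), 4))
```

With two rounds the function returns 1-δ, which matches the SPE pay-off of player 1 stated above.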

Alternating offers (Rubinstein, Stahl)

• Same stage game, but now repeated infinitely often (there is no last round). What are the equilibrium pay-offs?
• Define v (v*) as the lowest (highest) pay-off you can get if you make an offer.
• Because of the infinite horizon and equal discount factors, the period 1 analysis is the same as the period 2 analysis.
• v ≥ 1 - δv*: the lowest pay-off player 1 can guarantee himself is what remains after the highest discounted pay-off player 2 can get in the next round.
• v* ≤ 1 - δv: the highest pay-off player 1 can get is what remains after the lowest discounted pay-off player 2 can guarantee himself in the next round.
• Substituting one inequality into the other gives v ≥ 1 - δ(1 - δv), i.e. v ≥ 1/(1+δ), and likewise v* ≤ 1/(1+δ). Since v ≤ v*, equalities have to hold.
• Player 1 is better off as he makes the first proposal (he gets 1/(1+δ), player 2 gets δ/(1+δ)), but the advantage disappears when δ gets close to 1. Intuitive.
• The first offer is such that it is immediately accepted! Why bother about the rest of the game?
• The subgame perfect equilibrium strategies are unique.
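The proposer's share can also be seen as the fixed point of the map v → 1 - δv. The sketch below is an illustration under that reading, not the proof used on the slide; the function name and tolerance are assumptions.

```python
def rubinstein_proposer_share(delta: float, tol: float = 1e-12) -> float:
    """Iterate v -> 1 - delta * v until it settles (requires 0 <= delta < 1)."""
    v = 1.0
    while True:
        v_next = 1.0 - delta * v
        if abs(v_next - v) < tol:
            return v_next
        v = v_next


if __name__ == "__main__":
    for delta in (0.5, 0.9, 0.99):
        # Compare the iterated value with the closed form 1 / (1 + delta).
        print(delta, rubinstein_proposer_share(delta), 1.0 / (1.0 + delta))
```

As δ approaches 1 the printed shares approach ½, which is the sense in which the first-mover advantage disappears.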

What if the δ's differ across players?

• The period 1 analysis is similar to the period 3 analysis, but no longer to the period 2 analysis.
• Define vi (vi*) as the lowest (highest) pay-off player i can get if she makes an offer.
• v1 ≥ 1 - δ2 v2*; by symmetry, v2 ≥ 1 - δ1 v1*.
• v1* ≤ 1 - δ2 v2; by symmetry, v2* ≤ 1 - δ1 v1.
• v1 ≥ (1-δ2)/(1-δ1δ2) and v1* ≤ (1-δ2)/(1-δ1δ2). Hence, equalities have to hold.
• Additional advantage for the player with the highest δ (the more patient player).
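A small numerical check of these formulas, as a sketch: the function name is hypothetical, and v2 is obtained from v1 by swapping the roles of the two discount factors.

```python
def proposer_shares(delta1: float, delta2: float):
    """Return (v1, v2): what each player keeps when it is her turn to propose."""
    v1 = (1.0 - delta2) / (1.0 - delta1 * delta2)
    v2 = (1.0 - delta1) / (1.0 - delta1 * delta2)
    return v1, v2


if __name__ == "__main__":
    v1, v2 = proposer_shares(0.9, 0.8)
    # Consistency with v1 = 1 - delta2 * v2 and v2 = 1 - delta1 * v1.
    print(v1, 1.0 - 0.8 * v2)
    print(v2, 1.0 - 0.9 * v1)
    # Raising player 1's patience raises her share as proposer.
    print(proposer_shares(0.8, 0.8)[0], proposer_shares(0.9, 0.8)[0])
```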

Notation in repeated games

• Define the history of play as follows. Let $a^0 = (a^0_1, a^0_2, \dots, a^0_n)$ be the action profile that is played in stage 0, i.e., the actions played by all players.
• History at the beginning of stage 1: $h^1 = a^0$.
• History at the beginning of stage t+1: $h^{t+1} = (a^0, \dots, a^t)$.
• The set $H^t$ is the set of all possible histories $h^t$; $A_i(h^t)$ is the set of actions that player i can choose after history $h^t$, and $A_i(H^t)$ is the union of these sets over all possible histories.
• A strategy $\sigma_i$ of player i is a sequence of mappings $\{\sigma_i^k\}$, where each $\sigma_i^k$ maps $H^k$ to mixed actions.
• Note that you cannot condition on the random events themselves (the mixing probabilities others used); a history records only the realised actions.
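The notation maps directly onto a data structure: a history is a tuple of action profiles and a strategy is a function from histories to actions. The sketch below restricts attention to pure strategies and two players for brevity; the action labels "C" and "D" and both example strategies are hypothetical.

```python
from typing import Callable, Tuple

ActionProfile = Tuple[str, str]        # one action per player
History = Tuple[ActionProfile, ...]    # (a^0, ..., a^{t-1}); the empty tuple is the initial history
Strategy = Callable[[History], str]    # pure strategy: history -> action


def trigger(history: History) -> str:
    """Play 'C' until any profile other than ('C', 'C') has occurred, then 'D'."""
    return "C" if all(profile == ("C", "C") for profile in history) else "D"


def always_d(history: History) -> str:
    """Play 'D' after every history."""
    return "D"


def play(s1: Strategy, s2: Strategy, stages: int) -> History:
    """Build the history generated when the two strategies meet."""
    history: History = ()
    for _ in range(stages):
        history += ((s1(history), s2(history)),)
    return history


if __name__ == "__main__":
    print(play(trigger, always_d, 3))
```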

Subgame perfection and the one-stage deviation principle in finitely repeated games

• One-stage deviation principle: no player can gain by deviating in a single period and then returning to the (equilibrium) strategy.
• That is, there is no player i and strategy $s'_i$ that is equal to $s^*_i$ apart from the action in one period given one history h, such that $u_i(s'_i, s^*_{-i}) > u_i(s^*_i, s^*_{-i})$ given that history h.
• Prop. In finite horizon games, a strategy combination s* is a SPE if, and only if, it satisfies the one-stage deviation principle.
- Only if: clear, otherwise there is an immediate violation of the SPE definition.
- If: suppose to the contrary that s* satisfies the principle but is not a SPE. Then there is a stage t and a history $h^t$ such that at least one player has a strategy $s'_i$ with $s'_i(h^t) \neq s^*_i(h^t)$ that is a better response. (Continued on the next slide.)

Proof of the one-stage deviation principle

• Let t' be the last period in which $s'_i(h^{t'}) \neq s^*_i(h^{t'})$; it exists because the horizon is finite.
• Because of the one-stage deviation principle, t' > t.
• Period t' is defined such that for all t'' > t', $s'_i(h^{t''}) = s^*_i(h^{t''})$.
• Define another strategy $s_i$ that coincides with $s'_i$ up to t' and coincides with $s^*_i$ at t' and afterwards.
• Because of the one-stage deviation principle, and since $s'_i(h^{t''}) = s^*_i(h^{t''})$ for all t'' > t', $s_i$ is as good a response as $s'_i$ given history $h^t$.
• If t' = t+1, then $s_i$ differs from $s^*_i$ in only one period, and therefore the one-stage deviation principle implies that $s_i$ cannot be strictly better.
• If t' > t+1, a similar argument applies (details page 109).

Additional equilibria in repeated games

• The main interest in repeated games is what type of equilibrium outcomes can be supported that cannot be supported in a static game.
• Repetition of a static equilibrium is always an equilibrium of the repeated game; not so interesting.
• Thus, what else? Consider an example.

A Static Game

Strategy   L         C         R
U          12, 12    1, 16     -1, 15
M          16, 1     7, 7      0, 5
D          15, -1    5, 0      2, 2

Multiple Equilibria

Strategy   L         C         R
U          12, 12    1, 16     -1, 15
M          16, 1     7, 7      0, 5
D          15, -1    5, 0      2, 2

Nash equilibria: (M,C) with pay-offs (7,7) and (D,R) with pay-offs (2,2).
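The pure-strategy Nash equilibria of the 3x3 game can be checked mechanically by testing every cell for profitable unilateral deviations. This is a minimal sketch; the dictionary layout and the function name are assumptions made for the illustration.

```python
ROWS, COLS = ["U", "M", "D"], ["L", "C", "R"]
PAYOFFS = {  # (row player's pay-off, column player's pay-off)
    ("U", "L"): (12, 12), ("U", "C"): (1, 16), ("U", "R"): (-1, 15),
    ("M", "L"): (16, 1),  ("M", "C"): (7, 7),  ("M", "R"): (0, 5),
    ("D", "L"): (15, -1), ("D", "C"): (5, 0),  ("D", "R"): (2, 2),
}


def pure_nash(rows, cols, payoffs):
    """Return all cells in which neither player gains by deviating alone."""
    equilibria = []
    for r in rows:
        for c in cols:
            u1, u2 = payoffs[(r, c)]
            row_ok = all(payoffs[(r2, c)][0] <= u1 for r2 in rows)
            col_ok = all(payoffs[(r, c2)][1] <= u2 for c2 in cols)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


if __name__ == "__main__":
    print(pure_nash(ROWS, COLS, PAYOFFS))  # [('M', 'C'), ('D', 'R')]
```

The cooperative cell (U,L) fails the test because either player gains by moving to a pay-off of 16.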

Can non-Nash outcomes of the static game be supported in equilibrium if the game is repeated 2 times?

Strategy   L         C         R
U          12, 12    1, 16     -1, 15
M          16, 1     7, 7      0, 5
D          15, -1    5, 0      2, 2

Last period analysis

• In the last period they cannot choose (U,L)

• As both firms have an incentive to “cheat”: 16 is a higher pay-off than 12

• Punishment is not possible (as it is the last period)

First-period analysis

• But: in the first period they can choose (U,L)

• Strategy:
- Choose “U (L)” in period 1
- Choose “M (C)” in period 2 when the other chooses “L (U)” in period 1
- Choose “D (R)” in period 2 when the other chooses something else in period 1

• Punishment is part of the strategy

• Is this an equilibrium? Is it a SPE?
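Both questions can be explored numerically. The sketch below rests on two assumptions that the slide does not state: pay-offs are not discounted, and the strategy is read literally, so each player conditions only on the other player's period 1 action. All names are hypothetical.

```python
ROWS, COLS = ["U", "M", "D"], ["L", "C", "R"]
G = {  # (row player's pay-off, column player's pay-off)
    ("U", "L"): (12, 12), ("U", "C"): (1, 16), ("U", "R"): (-1, 15),
    ("M", "L"): (16, 1),  ("M", "C"): (7, 7),  ("M", "R"): (0, 5),
    ("D", "L"): (15, -1), ("D", "C"): (5, 0),  ("D", "R"): (2, 2),
}


def row_plan(col_first):
    """Row player's period 2 action given the column player's period 1 action."""
    return "M" if col_first == "L" else "D"


def col_plan(row_first):
    """Column player's period 2 action given the row player's period 1 action."""
    return "C" if row_first == "U" else "R"


def is_stage_nash(r, c):
    return (all(G[(r2, c)][0] <= G[(r, c)][0] for r2 in ROWS)
            and all(G[(r, c2)][1] <= G[(r, c)][1] for c2 in COLS))


def best_total_for_row():
    """Best two-period total the row player can reach against the column player's strategy."""
    return max(G[(r1, "L")][0] + max(G[(r2, col_plan(r1))][0] for r2 in ROWS)
               for r1 in ROWS)


on_path = G[("U", "L")][0] + G[("M", "C")][0]   # 12 + 7
is_nash = best_total_for_row() <= on_path       # the game is symmetric, so one player suffices
# Period 1 histories after which the prescribed period 2 play is not a stage-game equilibrium.
bad_histories = [(r1, c1) for r1 in ROWS for c1 in COLS
                 if not is_stage_nash(row_plan(c1), col_plan(r1))]

if __name__ == "__main__":
    print(is_nash, bad_histories)
```

Under this reading the profile passes the Nash test (the best deviation yields 16 + 2 = 18 against 19 on the path), but after a unilateral deviation the prescribed period 2 play is not a static equilibrium. A variant in which both players choose (D,R) whenever period 1 play differed from (U,L) would avoid that problem.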

Pay-offs in infinitely repeated games

• Overall pay-offs $u_i$; stage game pay-offs $g_i$; continuation pay-off from period t onwards. Discounted sum of stage game pay-offs:
  $\sum_{t} \delta^t g_i(\sigma^t(h^t))$
• Want to have an expression where one can easily compare stage game pay-offs and repeated game pay-offs, i.e., normalisation:
  $(1-\delta) \sum_{t} \delta^t g_i(\sigma^t(h^t))$
• Time averaging is sometimes used for the case of complete patience:
  $\liminf_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} g_i(\sigma^t(h^t))$
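The effect of the normalisation is easiest to see on a concrete stream of stage pay-offs. The following sketch truncates the infinite sum at a finite list; the function names and the example streams are assumptions.

```python
def normalised_payoff(stage_payoffs, delta):
    """(1 - delta) * sum over t of delta**t * g_t, for a finite list of stage pay-offs."""
    return (1.0 - delta) * sum(delta ** t * g for t, g in enumerate(stage_payoffs))


def time_average(stage_payoffs):
    """Plain average of a finite list of stage pay-offs."""
    return sum(stage_payoffs) / len(stage_payoffs)


if __name__ == "__main__":
    constant = [7] * 2000
    # A constant stream of 7 has a normalised pay-off of (almost exactly) 7,
    # so it is directly comparable with the stage game pay-off.
    print(normalised_payoff(constant, 0.9), time_average(constant))
```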

Folk Theorem I

• If players are sufficiently patient, then any feasible, individually rational pay-offs can be enforced by an equilibrium.
• Individually rational pay-offs: pay-offs above the minimax pay-off
  $\underline{v}_i = \min_{a_{-i}} \max_{a_i} g_i(a_i, a_{-i})$
• $m^i_j$ is the action player j chooses to minimax player i.
• Feasible pay-offs: the convex hull V of the static game pay-offs, i.e., $V = \text{convex hull}\,\{v \mid \text{there is an } a \in A \text{ such that } g(a) = v\}$.
• Both terms need some explanation.

Minimax pay-offs

        L         R
U       -2, 2     1, -2
M       1, -2     -2, 2
D       0, 1      0, 1

• What are the Nash equilibria of this game? Denote by q the probability that player 2 chooses L.
• In a mixed strategy equilibrium ⅓ ≤ q ≤ ⅔ (and player 1 plays D); pay-offs are 0 and 1.
• Minimax for player 1: u(U) = -3q+1, u(M) = 3q-2, u(D) = 0. The minimax pay-off is 0.
• Minimax for player 2 is also 0, by player 1 choosing (½, ½, 0).
• Thus, minimax pay-offs can be lower than Nash equilibrium pay-offs.
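Both minimax values can be confirmed by a brute-force search over mixed actions of the punishing player. This is a rough sketch using exact fractions on a grid; the grid sizes and function names are arbitrary choices for the illustration.

```python
from fractions import Fraction


def player1_best_reply_payoff(q):
    """Player 1's best pay-off when player 2 plays L with probability q."""
    return max(1 - 3 * q, 3 * q - 2, 0)


def player2_best_reply_payoff(p_u, p_m, p_d):
    """Player 2's best pay-off when player 1 mixes (p_u, p_m, p_d) over U, M, D."""
    payoff_l = 2 * p_u - 2 * p_m + p_d
    payoff_r = -2 * p_u + 2 * p_m + p_d
    return max(payoff_l, payoff_r)


# Player 2 searches over q to hold player 1 down.
minimax_1 = min(player1_best_reply_payoff(Fraction(k, 60)) for k in range(61))

# Player 1 searches over the simplex to hold player 2 down.
steps = 20
minimax_2 = min(
    player2_best_reply_payoff(Fraction(i, steps), Fraction(j, steps),
                              Fraction(steps - i - j, steps))
    for i in range(steps + 1) for j in range(steps + 1 - i)
)

if __name__ == "__main__":
    print(minimax_1, minimax_2)
```

Player 2's minimax pay-off of 0 lies below the pay-off of 1 that player 2 receives in the equilibria in which player 1 plays D.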

Feasible pay-offs

        F        B
B       0, 0     2, 1
F       1, 2     0, 0

• Equilibrium pay-offs are (2,1), (1,2) and (⅔, ⅔).
• The convex hull of the equilibrium pay-offs is the triangle connecting the three points (it also contains e.g. (1½, 1½)).
• V connects (2,1), (1,2) and (0,0).
• But (1½, 1½) cannot be obtained by independent mixing, only as a correlated equilibrium.
• Correlated mixing can happen in a repeated setting by alternating between playing the two equilibria (and time averaging pay-offs or δ close to 1).
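How close alternation gets to (1½, 1½) under discounting can be computed directly. The sketch below truncates the infinite horizon at a large number of periods; the function name and the truncation length are assumptions.

```python
def normalised_alternating(first, second, delta, periods=20000):
    """Normalised discounted pay-off of the stream first, second, first, second, ..."""
    total, weight = 0.0, 1.0
    for t in range(periods):
        total += weight * (first if t % 2 == 0 else second)
        weight *= delta
    return (1.0 - delta) * total


if __name__ == "__main__":
    for delta in (0.5, 0.9, 0.999):
        # Player 1 receives 2, 1, 2, 1, ... and player 2 receives 1, 2, 1, 2, ...
        print(delta,
              normalised_alternating(2, 1, delta),
              normalised_alternating(1, 2, delta))
```

In closed form the two values are (2+δ)/(1+δ) and (1+2δ)/(1+δ), and both tend to 1½ as δ goes to 1.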

Folk Theorem II

• Prop. For every feasible pay-off vector v with $v_i > \underline{v}_i$ for all players i, there exists a $\underline{\delta} < 1$ such that for all $\delta \in (\underline{\delta}, 1)$ there exists a Nash equilibrium of the infinitely repeated game with pay-offs v.
• Pay-offs in the repeated game can be not only larger, but also smaller than the static Nash equilibrium pay-offs!
• Basic idea: if players are sufficiently patient, then any finite gain from a one-period deviation is nothing compared to a small, but permanent loss in future pay-offs (punishment by minimaxing a player).

“Proof” of the “Nash Folk Theorem”

• Consider a feasible pay-off v and an action profile a with g(a) = v.
• If there is no action profile a that yields v, you may choose a sequence of actions such that v is (close to) the average (discounted) pay-off (or use a public randomisation).
• Consider the strategy: start by playing $a_i$; play $a_i$ as long as the others do; if one player j deviates, minimax him forever, i.e., choose $m^j_i$.
• A deviation by player i in period t yields a normalised pay-off of at most
  $(1-\delta^t)\,v_i + \delta^t (1-\delta) \max_a g_i(a) + \delta^{t+1}\,\underline{v}_i$
  which is smaller than $v_i$ if δ is larger than $\delta_i$, where $\delta_i$ solves
  $(1-\delta_i) \max_a g_i(a) + \delta_i\,\underline{v}_i = v_i$
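The critical discount factor follows from solving the last equation for δ. A short sketch with illustrative numbers (12 on the equilibrium path, at most 16 from a one-period deviation, 2 when minimaxed); the numbers and the function name are assumptions chosen for the example.

```python
from fractions import Fraction


def critical_delta(max_deviation_payoff, target_payoff, minimax_payoff):
    """Solve (1 - d) * max_deviation_payoff + d * minimax_payoff = target_payoff for d."""
    return Fraction(max_deviation_payoff - target_payoff,
                    max_deviation_payoff - minimax_payoff)


if __name__ == "__main__":
    d = critical_delta(16, 12, 2)
    print(d)                       # 2/7
    print((1 - d) * 16 + d * 2)    # 12, the target pay-off
```

For any δ above this threshold, the one-period gain of 4 is outweighed by the permanent loss of 10 per period.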

Is the threat of minimaxing credible?

• If we restrict the analysis to static “Nash threats”, then Friedman shows that only pay-offs larger than the static Nash equilibrium pay-offs can be supported.
• Others show that in games where the minimax pay-offs are lower than the static equilibrium pay-offs, even worse outcomes can be compatible with a SPE of the infinitely repeated game.

Basic idea of SPE with minimax pay-offs: time averaging

• After a deviation, minimax the deviator for N periods, where N is chosen for all players i such that
  $\max_a g_i(a) + N\,\underline{v}_i < \min_a g_i(a) + N\,v_i$
• After N periods, return to the “cooperative” mode.
• (Finite) N ensures that no player has an incentive to deviate.
• The cost of punishment is extremely small, as with time averaging pay-offs in a finite number of periods “do not make a difference”.
• Average pay-off to player j when i is punished is $v_j$.
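The smallest punishment length satisfying the inequality can be computed for given numbers. This is a sketch with integer pay-offs and illustrative values; the function name and the example numbers are assumptions.

```python
import math
from fractions import Fraction


def punishment_length(max_g, min_g, target, minimax):
    """Smallest integer N with max_g + N * minimax < min_g + N * target."""
    if target <= minimax:
        raise ValueError("target pay-off must exceed the minimax pay-off")
    bound = Fraction(max_g - min_g, target - minimax)  # the inequality requires N > bound
    return math.floor(bound) + 1


if __name__ == "__main__":
    n = punishment_length(max_g=16, min_g=-1, target=12, minimax=2)
    print(n, 16 + n * 2 < -1 + n * 12)
```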

Basic idea of SPE with minimax pay-offs: discounted pay-offs

• The previous strategies (for time averaging pay-offs) do not work, as it may be that minimaxing another player gives a player a lower pay-off than his own minimax pay-off.
• Reward punishers, instead of punishing them if they don’t punish.
• Choose a vector in the interior of V such that for each i you can still give a higher pay-off. V needs to be of “full dimension”.
• Play in three phases:
- Initial cooperative phase
- Punishment phase where players minimax the deviator j for N(j) periods (as before); switch to the punishment phase for player i if i deviates in one of the N(j) periods
- Reward phase after the punishment phase is fully completed

Renegotiation proofness in repeated games

• Is SPE the best notion of a credible threat?
• Suppose you cooperate for some time in the PD and then someone defects, by chance. Should you switch immediately to always defect? Or should players “renegotiate”?
• It is in both players' interest to revert to the cooperative outcome.
• In any subgame the equilibrium played must not be Pareto-dominated.
• Pareto-optimality as an assumption, and the critique that is possible (risk dominance and Pareto-dominance).
• Deviations are accidents and unlikely to be repeated? “Bygones are bygones”.

Pareto perfection only applies in two-player games

Player 1 chooses the row, player 2 the column, player 3 the matrix (A or B).

Matrix A
        L             R
U       0, 0, 10      -5, -5, 0
D       -5, -5, 0     1, 1, -5

Matrix B
        L             R
U       -2, -2, 0     -5, -5, 0
D       -5, -5, 0     -1, -1, 5

• Two Nash equilibria in pure strategies: (U,L,A) and (D,R,B).
• (U,L,A) is Pareto-efficient. Natural candidate?
• Suppose players 1 and 2 expect the matrix chooser to choose A. Then they can renegotiate and gain by playing (D,R).

Definition of Pareto perfect equilibrium

• Fix a stage game g and play it for T periods. Let P(T) be the set of pay-offs of pure strategy SPE of G(T).
• Set Q(1) = P(1). For any t > 1, let Q(t) be the set of pay-offs of pure strategy SPE of G(t) that can be enforced with continuation pay-offs in R(t-1).
• R(t) is the set of strongly efficient points of Q(t), i.e., the set of points such that there is no other pay-off point in Q(t) where no player is worse off and some player is better off.
• A SPE of G(T) is Pareto perfect if, for every possible history and in every time period t, the continuation pay-offs are in R(T-t).

Pareto perfection restricts threats

• Some efficient equilibria cannot be supported anymore under Pareto perfection.
• It restricts the set of threats, and thereby makes it more difficult to keep players on the equilibrium path.
• Example

Example: Pareto perfection

        c1       c2       c3       c4
R1      0, 0     1, 4     0, 0     6, 0
R2      4, 1     0, 0     0, 0     0, 0
R3      0, 0     0, 0     3, 3     0, 0
R4      0, 6     0, 0     0, 0     5, 5

• Three pure-strategy equilibria in G(1), with pay-offs (4,1), (1,4) and (3,3).
• In G(2) without discounting a pay-off of 8 is possible. It is the unique element in R(2).
• Without the restriction to Pareto perfection, a pay-off of 13 is possible in G(3).
• With Pareto perfection, no threat is possible in the first period of G(3); one has to play a stage game equilibrium.
• Equilibrium play alternates between odd and even periods under Pareto perfection.
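The first two claims of the example can be verified with a few lines of code. The sketch below finds the pure-strategy equilibria of the stage game and then checks one way of supporting the pay-off of 8 in G(2): play (R4,c4) followed by (R3,c3), and after a period 1 deviation play the stage equilibrium that is worst for the deviator. That punishment rule is an assumption made for the illustration, as are all names.

```python
TABLE = [  # TABLE[r][c] = (row player's pay-off, column player's pay-off)
    [(0, 0), (1, 4), (0, 0), (6, 0)],
    [(4, 1), (0, 0), (0, 0), (0, 0)],
    [(0, 0), (0, 0), (3, 3), (0, 0)],
    [(0, 6), (0, 0), (0, 0), (5, 5)],
]


def pure_nash(table):
    """Return the (row, column) indices of all pure-strategy Nash equilibria."""
    n_rows, n_cols = len(table), len(table[0])
    equilibria = []
    for r in range(n_rows):
        for c in range(n_cols):
            u1, u2 = table[r][c]
            if (all(table[r2][c][0] <= u1 for r2 in range(n_rows))
                    and all(table[r][c2][1] <= u2 for c2 in range(n_cols))):
                equilibria.append((r, c))
    return equilibria


equilibria = pure_nash(TABLE)
eq_payoffs = [TABLE[r][c] for r, c in equilibria]

# Two-period plan without discounting: (R4, c4) first, then (R3, c3).
on_path = TABLE[3][3][0] + TABLE[2][2][0]                 # 5 + 3
best_row_deviation = max(TABLE[r][3][0] for r in range(4))  # best reply to c4
worst_eq_for_row = min(u1 for u1, _ in eq_payoffs)        # punishment pay-off
supported = best_row_deviation + worst_eq_for_row < on_path  # column player is symmetric

if __name__ == "__main__":
    print(equilibria, eq_payoffs, on_path, supported)
```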