1
Game Theory
Sequential Bargaining and Repeated Games
Univ. Prof. dr. M.C.W. Janssen, University of Vienna
Winter semester 2010-11, Week 46 (November 14-15)
2
Sequential Bargaining
The ultimatum game is a sequential bargaining game with one round. Its SPE is known.
Consider then a sequential bargaining game with two rounds and alternating offers, in which players discount future pay-offs with δ. SPE pay-offs are (1-δ, δ).
Player 2 can propose to keep everything in the last round, and this will be accepted. Thus, by refusing in the first round he can guarantee himself δ.
Player 1 must give him at least δ in the first round if player 2 is to accept; player 1 can therefore get at most 1-δ.
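The backward-induction argument extends to any finite number of rounds. A minimal sketch, assuming a pie of size 1, a common discount factor, and a responder who accepts when indifferent; the function name is illustrative and not part of the lecture:

```python
def first_proposer_share(rounds: int, delta: float) -> float:
    """SPE share of the round-1 proposer in a finite alternating-offers game.

    Assumptions (illustrative): pie of size 1, common discount factor delta,
    and a responder who accepts when indifferent.
    """
    if rounds < 1:
        raise ValueError("need at least one round")
    # In the last round the proposer keeps everything (ultimatum game).
    share = 1.0
    # Step back one round at a time: the proposer must leave the responder
    # the discounted value of what the responder would get by proposing next.
    for _ in range(rounds - 1):
        share = 1.0 - delta * share
    return share


if __name__ == "__main__":
    for rounds in (1, 2, 3, 10):
        print(rounds, round(first_proposer_share(rounds, 0.9), 4))
```

With two rounds the function returns 1-δ, which matches the pay-offs (1-δ, δ) on this slide.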
3
Alternating offers (Rubinstein, Stahl)
Same stage game, but repeated infinitely often. What are the equilibrium pay-offs?
Define v (v*) as the lowest (highest) pay-off you can get if you make an offer.
Because of the infinite horizon and equal discount factors, the period 1 analysis is the same as the period 2 analysis.
v ≥ 1 - δv*: the lowest pay-off player 1 can guarantee himself is what remains after player 2 receives the highest discounted pay-off he can guarantee himself in the next round.
v* ≤ 1 - δv: the highest pay-off player 1 can guarantee himself is what remains after player 2 receives the lowest discounted pay-off he can guarantee himself in the next round.
Combining the two: v ≥ 1/(1+δ) and v* ≤ 1/(1+δ). Since v ≤ v*, equalities have to hold.
Player 1 is better off as he makes the first proposal, but the advantage disappears when δ gets close to 1. Intuitive.
The first offer is such that it is immediately accepted! Why bother about the rest of the game?
Unique subgame perfect equilibrium strategies.
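The step from the two inequalities to the two bounds can be spelled out as follows. This is a worked derivation that uses only the definitions on this slide, together with v ≤ v* (the lowest pay-off cannot exceed the highest):

```latex
\begin{align*}
v &\ge 1 - \delta v^* \ge 1 - \delta(1 - \delta v) = 1 - \delta + \delta^2 v
  &&\Rightarrow\quad v \ge \frac{1-\delta}{1-\delta^2} = \frac{1}{1+\delta},\\
v^* &\le 1 - \delta v \le 1 - \delta(1 - \delta v^*) = 1 - \delta + \delta^2 v^*
  &&\Rightarrow\quad v^* \le \frac{1}{1+\delta},\\
\frac{1}{1+\delta} &\le v \le v^* \le \frac{1}{1+\delta}
  &&\Rightarrow\quad v = v^* = \frac{1}{1+\delta}.
\end{align*}
```

The responder then receives δ/(1+δ), so the proposer's advantage is (1-δ)/(1+δ), which goes to zero as δ approaches 1.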
4
What if δ’s differ across players
The period 1 analysis is similar to the period 3 analysis, but no longer to the period 2 analysis.
Define v_i (v_i*) as the lowest (highest) pay-off player i can get if she makes an offer.
v_1 ≥ 1 - δ_2 v_2*; by symmetry, the same thing holds for player 2.
v_1* ≤ 1 - δ_2 v_2; by symmetry, the same thing holds for player 2.
v_1 ≥ (1 - δ_2)/(1 - δ_1 δ_2) and v_1* ≤ (1 - δ_2)/(1 - δ_1 δ_2).
Hence, equalities have to hold; additional advantage for the player with the highest δ.
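The closed form can be checked against the recursion it comes from. A small sketch using exact fractions; the discount factors 9/10 and 4/5 are illustrative values chosen here, not taken from the lecture:

```python
from fractions import Fraction


def proposer_share(delta_own, delta_other):
    """Share of the player who proposes, given both discount factors.

    Implements (1 - delta_other) / (1 - delta_own * delta_other), the value
    at which the lower and upper bounds coincide.
    """
    return (1 - delta_other) / (1 - delta_own * delta_other)


if __name__ == "__main__":
    d1, d2 = Fraction(9, 10), Fraction(4, 5)  # illustrative values
    v1 = proposer_share(d1, d2)  # player 1 proposes
    v2 = proposer_share(d2, d1)  # player 2 proposes
    # Each proposer leaves the responder the discounted value of proposing
    # in the next period.
    assert v1 == 1 - d2 * v2
    assert v2 == 1 - d1 * v1
    print(v1, v2)
```

With equal discount factors the expression reduces to 1/(1+δ), in line with the symmetric case.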
5
Notation in repeated games
Define the history of play as follows. Let a^0 = (a^0_1, a^0_2, …, a^0_n) be the action profile that is played in stage 0, i.e., the actions played by all players.
History at the beginning of period 1: h^1 = a^0.
History at the beginning of stage t+1: h^{t+1} = (a^0, …, a^t).
The set H^t is the set of all possible histories h^t, A_i(h^t) is the set of actions that player i can choose after history h^t, and A_i(H^t) is the union of these sets over all possible histories.
A strategy σ_i of player i is a sequence of mappings {σ_i^k}, where each σ_i^k maps H^k to mixed actions.
Note that you cannot condition on the random events.
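The notation translates directly into data structures. A minimal sketch, restricted to pure strategies for simplicity (a mixed strategy would return a probability distribution instead of a single action); the action labels "C" and "D" and the two example strategies are hypothetical:

```python
from typing import Callable, List, Tuple

ActionProfile = Tuple[str, ...]          # one action per player in a stage
History = Tuple[ActionProfile, ...]      # h^{t+1} = (a^0, ..., a^t)
PureStrategy = Callable[[History], str]  # maps a history to an action


def play(strategies: List[PureStrategy], stages: int) -> History:
    """Generate the history produced by pure strategies over `stages` stages."""
    history: History = ()
    for _ in range(stages):
        profile = tuple(s(history) for s in strategies)
        history = history + (profile,)
    return history


def always(action: str) -> PureStrategy:
    """Strategy that ignores the history (hypothetical example)."""
    return lambda history: action


def copy_opponent(me: int, first: str) -> PureStrategy:
    """Two-player strategy: start with `first`, then repeat the opponent's last action."""
    def strategy(history: History) -> str:
        if not history:
            return first
        return history[-1][1 - me]
    return strategy


if __name__ == "__main__":
    print(play([copy_opponent(0, "C"), always("D")], stages=3))
```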
6
Subgame perfection and the one-stage deviation principle in finitely repeated games
One-stage deviation principle: no player can gain by deviating in a single period and then returning to the (equilibrium) strategy.
There is no player i and strategy s'_i that is equal to s*_i apart from the action in one period given one history h, such that u_i(s'_i, s*_{-i}) > u_i(s*_i, s*_{-i}) given that history h.
Prop. In finite horizon games, a strategy combination s* is an SPE if, and only if, it satisfies the “one stage deviation principle”.
Only if: clear, otherwise there is an immediate violation of the SPE definition.
If: suppose to the contrary that s* satisfies the principle but is not an SPE. Then there is a stage t and a history h^t s.t. at least one player has a strategy s'_i with s'_i(h^t) ≠ s*_i(h^t), and s'_i is a better response. Continuation next slide.
7
Proof of the one-stage deviation principle
Let t' be the last period in which s'_i(h^{t'}) ≠ s*_i(h^{t'}).
Because of the one-stage-deviation principle, t' > t.
Period t' is defined such that for all t'' > t', s'_i(h^{t''}) = s*_i(h^{t''}).
Define then another strategy s_i that coincides with s'_i up to t' and coincides with s*_i at t' and afterwards.
Because of the one-stage-deviation principle and since s'_i(h^{t''}) = s*_i(h^{t''}) for all t'' > t', s_i is as good a response as s'_i given history h^t.
If t' = t+1, then s_i differs from s*_i in only one period, and therefore the one-stage-deviation principle implies that s_i cannot be strictly better.
If t' > t+1, a similar argument applies (details page 109).
8
Additional equilibria in repeated games
The main interest in repeated games is what type of equilibrium outcomes can be supported that cannot be supported in a static game.
Repetition of a static equilibrium is always an equilibrium in a repeated game; not so interesting.
Thus, what else? Consider an example.
9
A Static Game
Strategy    L         C         R
U           12, 12    1, 16     -1, 15
M           16, 1     7, 7      0, 5
D           15, -1    5, 0      2, 2
10
Multiple Equilibria
Strategy    L         C         R
U           12, 12    1, 16     -1, 15
M           16, 1     7, 7      0, 5
D           15, -1    5, 0      2, 2
Nash equilibria: (M,C) with pay-offs (7,7) and (D,R) with pay-offs (2,2)
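The pure-strategy equilibria of the static game can be found by checking every cell for profitable unilateral deviations. A short sketch; the bottom row is labelled D here, and the function name is illustrative:

```python
ROWS = ["U", "M", "D"]
COLS = ["L", "C", "R"]
# PAYOFFS[row][col] = (pay-off to row player, pay-off to column player)
PAYOFFS = {
    "U": {"L": (12, 12), "C": (1, 16), "R": (-1, 15)},
    "M": {"L": (16, 1), "C": (7, 7), "R": (0, 5)},
    "D": {"L": (15, -1), "C": (5, 0), "R": (2, 2)},
}


def pure_nash_equilibria():
    """Return all (row, col) cells where neither player gains by deviating."""
    equilibria = []
    for r in ROWS:
        for c in COLS:
            u1, u2 = PAYOFFS[r][c]
            row_ok = all(PAYOFFS[r2][c][0] <= u1 for r2 in ROWS)
            col_ok = all(PAYOFFS[r][c2][1] <= u2 for c2 in COLS)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria())
```

The cell (U,L) with pay-offs (12,12) is not among them: each player would rather switch to M or C and earn 16.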
11
Can non-Nash outcomes of the static game be supported in equilibrium if the game is repeated 2 times?
Strategy    L         C         R
U           12, 12    1, 16     -1, 15
M           16, 1     7, 7      0, 5
D           15, -1    5, 0      2, 2
12
Last period analysis
• In the last period they cannot choose (U,L)
• Both players have an incentive to “cheat”, as 16 is a higher pay-off than 12
• Punishment is not possible (as it is the last period)
13
First-period analysis
• But: in the first period they can choose (U,L)
• Strategy:
- Choose “U (L)” in period 1
- Choose “M (C)” in period 2 when the other chooses “L (U)” in period 1
- Choose “D (R)” in period 2 when the other chooses something else in period 1
• Punishment is part of the strategy
• Is this an equilibrium? Is it an SPE?
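The period-1 incentive can be checked numerically. A sketch for the row player (the game is symmetric, so the column player's check is the mirror image); the discount factor `delta` on period 2 is an assumption added here, with `delta = 1` meaning no discounting:

```python
# Row player's pay-offs in the static game (bottom row labelled D).
ROW_PAYOFF = {
    ("U", "L"): 12, ("U", "C"): 1, ("U", "R"): -1,
    ("M", "L"): 16, ("M", "C"): 7, ("M", "R"): 0,
    ("D", "L"): 15, ("D", "C"): 5, ("D", "R"): 2,
}


def cooperation_is_best_reply(delta: float = 1.0) -> bool:
    """Check the row player's period-1 incentive under the proposed strategy."""
    # On the path: (U,L) in period 1, then the static equilibrium (M,C).
    on_path = ROW_PAYOFF[("U", "L")] + delta * ROW_PAYOFF[("M", "C")]
    # Best deviation in period 1 against L, followed by the punishment (D,R).
    best_deviation = max(ROW_PAYOFF[(row, "L")] for row in ("M", "D"))
    off_path = best_deviation + delta * ROW_PAYOFF[("D", "R")]
    return on_path >= off_path


if __name__ == "__main__":
    print(cooperation_is_best_reply(1.0), cooperation_is_best_reply(0.5))
```

Without discounting, following the strategy gives 12 + 7 = 19 against 16 + 2 = 18 from the best deviation. Since period-2 play is a static Nash equilibrium after every period-1 history, no deviation in period 2 is profitable either.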
14
Pay-offs in infinitely repeated games
Overall pay-offs u_i; stage game pay-offs g_i; continuation pay-off from period t onwards:
$\sum_{\tau=t}^{\infty} \delta^{\tau-t} g_i(\sigma^{\tau}(h^{\tau}))$
Want to have an expression where one can easily compare stage game pay-offs and repeated game pay-offs, i.e., normalisation:
$(1-\delta) \sum_{\tau=t}^{\infty} \delta^{\tau-t} g_i(\sigma^{\tau}(h^{\tau}))$
Time averaging is sometimes used for the case of complete patience:
$\liminf_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} g_i(\sigma^{t}(h^{t}))$
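The point of the normalisation is that a constant stream of stage pay-offs g has normalised value g. A small sketch in which a long finite list stands in for the infinite sum (the truncation is an approximation made here); function names are illustrative:

```python
def normalised_discounted(stage_payoffs, delta: float) -> float:
    """(1 - delta) * sum_t delta^t * g_t over a finite list of stage pay-offs."""
    total = 0.0
    weight = 1.0
    for g in stage_payoffs:
        total += weight * g
        weight *= delta
    return (1.0 - delta) * total


def time_average(stage_payoffs) -> float:
    """(1/T) * sum of the T stage pay-offs in the list."""
    return sum(stage_payoffs) / len(stage_payoffs)


if __name__ == "__main__":
    constant = [7] * 5000
    print(normalised_discounted(constant, 0.9), time_average(constant))
    # One period of 16 followed by 2 forever, on the same scale as stage pay-offs.
    print(normalised_discounted([16] + [2] * 5000, 0.9))
```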
15
Folk Theorem I
If players are sufficiently patient, then any feasible, individually rational pay-offs can be enforced by an equilibrium.
Individually rational pay-offs: pay-offs above the minimax pay-off
$\underline{v}_i = \min_{a_{-i}} \max_{a_i} g_i(a_i, a_{-i})$
where the other players may use mixed actions; m^j_i is the action player j chooses to minimax player i.
Feasible pay-offs are given by the convex hull V of the static game pay-offs, i.e., V = convex hull {v | there is an a ∈ A such that g(a) = v}.
Both terms need some explanation.
16
Minimax pay-offs
What are the Nash equilibria of this game?
Denote by q the probability that player 2 chooses L.
In a mixed-strategy eq. ⅓ ≤ q ≤ ⅔ and player 1 plays D; pay-offs are 0 and 1.
Minimax for player 1: u(U) = -3q+1, u(M) = 3q-2, u(D) = 0. Minimax is 0.
Minimax for player 2 is also 0, by player 1 choosing (½,½,0).
Thus, minimax pay-offs can be lower than Nash eq. pay-offs.
L R
U -2,2 1,-2
M 1,-2 -2,2
D 0,1 0,1
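Both minimax values can be checked by brute force. A sketch that searches a grid of mixed actions with exact fractions; a grid search is only an approximation in general, but here the minimising mixtures lie on the grid, and the grid size of 12 is an arbitrary choice made for this example:

```python
from fractions import Fraction

# (pay-off to player 1, pay-off to player 2); rows U, M, D and columns L, R.
GAME = {
    "U": {"L": (-2, 2), "R": (1, -2)},
    "M": {"L": (1, -2), "R": (-2, 2)},
    "D": {"L": (0, 1), "R": (0, 1)},
}
ROWS, COLS = ["U", "M", "D"], ["L", "R"]


def minimax_player1(steps: int = 12) -> Fraction:
    """Min over q (probability of L) of player 1's best-reply pay-off."""
    best = None
    for k in range(steps + 1):
        q = Fraction(k, steps)
        value = max(q * GAME[r]["L"][0] + (1 - q) * GAME[r]["R"][0] for r in ROWS)
        best = value if best is None else min(best, value)
    return best


def minimax_player2(steps: int = 12) -> Fraction:
    """Min over player 1's mixtures (pU, pM, pD) of player 2's best-reply pay-off."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            p = {"U": Fraction(i, steps), "M": Fraction(j, steps),
                 "D": Fraction(steps - i - j, steps)}
            value = max(sum(p[r] * GAME[r][c][1] for r in ROWS) for c in COLS)
            best = value if best is None else min(best, value)
    return best


if __name__ == "__main__":
    print(minimax_player1(), minimax_player2())
```

Player 2 earns 1 in the equilibria of this game, which is strictly above the minimax value of 0.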
17
Feasible pay-offs
Equilibrium pay-offs are (2,1), (1,2) and (⅔, ⅔).
The convex hull of the eq. pay-offs is the triangle connecting the three points (it also contains e.g. (1½,1½)).
V connects (2,1), (1,2) and (0,0).
But (1½,1½) cannot be obtained by independent mixing, only as a correlated eq.
Correlated mixing can happen in a repeated setting by alternating between playing two equilibria (and time averaging pay-offs or δ close to 1).
F B
B 0,0 2,1
F 1,2 0,0
[Figure: eq. pay-offs]
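Alternating between the two pure equilibria can be simulated to see how close the normalised discounted pay-offs come to (1½,1½). A sketch that truncates the infinite sum after a fixed number of periods (an approximation made here); the function name is illustrative:

```python
def alternating_payoffs(delta: float, periods: int = 20000):
    """Normalised discounted pay-offs from playing (B,B), (F,F), (B,B), ...

    (B,B) pays (2,1) and (F,F) pays (1,2) in the table above.
    """
    stage = [(2, 1), (1, 2)]
    total1 = total2 = 0.0
    weight = 1.0
    for t in range(periods):
        g1, g2 = stage[t % 2]
        total1 += weight * g1
        total2 += weight * g2
        weight *= delta
    return (1 - delta) * total1, (1 - delta) * total2


if __name__ == "__main__":
    for delta in (0.5, 0.9, 0.99):
        print(delta, alternating_payoffs(delta))
```

In closed form the pay-offs are (2+δ)/(1+δ) and (1+2δ)/(1+δ), so the player whose preferred equilibrium comes first keeps a small advantage that vanishes as δ approaches 1.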
18
Folk Theorem II
Prop. For every feasible pay-off vector v with $v_i > \underline{v}_i$ for all i, there exists a $\underline{\delta} < 1$ such that for all $\delta > \underline{\delta}$ there exists a Nash equilibrium of the infinitely repeated game with pay-offs v.
Pay-offs in the repeated game can be not only larger, but also smaller than static Nash eq. pay-offs!!
Basic idea: if players are sufficiently patient, then any finite gain from a one-period deviation is nothing compared to a small but permanent loss in future pay-offs (punishment by minimaxing a player).
19
“Proof” “Nash Folk Theorem”
Consider a feasible pay-off v and an action profile a with g(a) = v.
If there is no action profile a that yields v, you may choose a sequence of actions such that v is (close to) the average (discounted) pay-off (or use a public randomization).
Consider the strategy: start by playing a_i; play a_i as long as the others do; if one player j deviates, minimax him forever, i.e., choose m^j_i.
A deviation in period t yields a normalised pay-off of at most
$(1-\delta^t)\, v_i + \delta^t (1-\delta) \max_a g_i(a) + \delta^{t+1} \underline{v}_i$
which is smaller than v_i if δ is larger than $\underline{\delta}_i$, where $\underline{\delta}_i$ solves
$(1-\underline{\delta}_i) \max_a g_i(a) + \underline{\delta}_i \underline{v}_i = v_i$
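The critical discount factor follows from solving the last equation. A sketch with exact fractions; the numbers in the usage example come from the 3×3 static game earlier in the lecture (largest pay-off 16, target 12 from (U,L)), and the minimax value of 2 is computed here rather than stated in the slides: column R gives the row player the lowest pay-off in every row, and the best reply to R yields 2.

```python
from fractions import Fraction


def critical_delta(max_payoff, target, minimax):
    """Solve (1 - d) * max_payoff + d * minimax = target for d.

    Arguments are integers or Fractions with minimax < target <= max_payoff.
    """
    if not (minimax < target <= max_payoff):
        raise ValueError("need minimax < target <= max_payoff")
    return Fraction(max_payoff - target, max_payoff - minimax)


if __name__ == "__main__":
    d = critical_delta(16, 12, 2)
    # The defining equation holds exactly at the critical value.
    assert (1 - d) * 16 + d * 2 == 12
    print(d)
```

For any δ above this value, the one-period gain from deviating is outweighed by being held to the minimax pay-off afterwards.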
20
Is the threat of Minimaxing credible?
If we restrict analysis to static “Nash threats”, then Friedman shows that only pay-offs larger than the static Nash equilibrium pay-offs can be supported.
Others show that in games where the minimax pay-offs are lower than the static equilibrium pay-offs, even worse outcomes can be compatible with a SPE of the infinitely repeated game.
21
Basic idea of SPE with minimax pay-offs: time averaging
After a deviation, play the minimax strategies for N periods, where N is chosen for all players i s.t.
$\max_a g_i(a) + N \underline{v}_i < \min_a g_i(a) + N v_i$
After N periods return back to “cooperative” mood.
(Finite) N ensures that no player has an incentive to deviate.
The cost of punishment is extremely small, as with time averaging pay-offs in a finite number of periods “do not make a difference”.
Average pay-off to player j when i is punished is v_j.
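The smallest punishment length satisfying the inequality is easy to compute. A sketch; the usage numbers are again taken from the 3×3 static game earlier in the lecture (largest pay-off 16, smallest -1, target 12), with the minimax value 2 computed for this example rather than stated in the slides:

```python
def punishment_length(max_payoff, min_payoff, target, minimax):
    """Smallest integer N with max_payoff + N*minimax < min_payoff + N*target."""
    if target <= minimax:
        raise ValueError("target must exceed the minimax pay-off")
    n = 1
    while not (max_payoff + n * minimax < min_payoff + n * target):
        n += 1
    return n


if __name__ == "__main__":
    print(punishment_length(16, -1, 12, 2))
```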
22
Basic idea of SPE with minimax pay-offs: discounted pay-offs
Previous strategies (for time averaging pay-offs) do not work, as it may be that minimaxing another player gives a player a lower pay-off than his own minimax pay-off.
Reward punishers, instead of punishing them if they don’t punish.
Choose a vector in the interior of V such that for each i you can still give a higher pay-off. V needs to be of “full dimension”.
Play in three phases:
- Initial cooperative phase
- Punishment phase where players minimax the deviator j for N(j) periods (as before); switch to the punishment phase for player i if i deviates in one of the N(j) periods
- Reward phase after the punishment phase is fully completed
23
Renegotiation proofness in repeated games
Is SPE the best notion of a credible threat?
Suppose you cooperate for some time in the PD and then someone defects, by chance. Should you go back immediately to always defect? Or should players “renegotiate”?
It is in both players’ interest to revert back to the cooperative outcome.
In any subgame the equilibrium played must not be Pareto-dominated.
Pareto-optimality as an assumption and the critique that is possible (risk dominance and Pareto-dominance).
Deviations are accidents and unlikely to be repeated?
“Bygones are bygones”
24
Pareto perfection only applies in two-player games
Two Nash equilibria in pure strategies: (U,L,A) and (D,R,B).
(U,L,A) is Pareto-efficient. Natural candidate?
Suppose players 1 and 2 expect the matrix chooser to choose A. Then they can renegotiate and gain by playing (D,R).
Matrix A
     L           R
U    0,0,10      -5,-5,0
D    -5,-5,0     1,1,-5
Matrix B
     L           R
U    -2,-2,0     -5,-5,0
D    -5,-5,0     -1,-1,5
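The two pure-strategy equilibria, and the joint gain for players 1 and 2 at matrix A, can be verified by enumeration. A sketch in which player 3 is the matrix chooser; the function names are illustrative:

```python
from itertools import product

# PAYOFFS[(row, col, matrix)] = pay-offs to players 1, 2 and 3.
PAYOFFS = {
    ("U", "L", "A"): (0, 0, 10), ("U", "R", "A"): (-5, -5, 0),
    ("D", "L", "A"): (-5, -5, 0), ("D", "R", "A"): (1, 1, -5),
    ("U", "L", "B"): (-2, -2, 0), ("U", "R", "B"): (-5, -5, 0),
    ("D", "L", "B"): (-5, -5, 0), ("D", "R", "B"): (-1, -1, 5),
}
ACTIONS = (("U", "D"), ("L", "R"), ("A", "B"))


def is_nash(profile):
    """True if no single player gains by changing only his own action."""
    for player, own_actions in enumerate(ACTIONS):
        for alt in own_actions:
            deviation = list(profile)
            deviation[player] = alt
            if PAYOFFS[tuple(deviation)][player] > PAYOFFS[profile][player]:
                return False
    return True


def pure_nash_equilibria():
    return [p for p in product(*ACTIONS) if is_nash(p)]


if __name__ == "__main__":
    print(pure_nash_equilibria())
    # Joint deviation by players 1 and 2 while player 3 stays at A.
    print(PAYOFFS[("D", "R", "A")][:2], "versus", PAYOFFS[("U", "L", "A")][:2])
```

The unilateral check passes at (U,L,A), yet players 1 and 2 together move from (0,0) to (1,1) by switching to (D,R), which is the renegotiation described above.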
25
Definition of Pareto perfect equilibrium
Fix stage game g and play it for T periods.
Let P(T) be the set of pay-offs of pure-strategy SPE of G(T).
Set Q(1) = P(1).
R(t) is the set of strongly efficient points of Q(t), i.e., the set of points such that there is no other pay-off point in Q(t) where no player is worse off and some player is better off.
For any t > 1, let Q(t) be the set of pay-offs of pure-strategy SPE that can be enforced with continuation pay-offs in R(t-1).
An SPE is Pareto perfect if for every possible history and in every time period t, the continuation pay-offs are in R(T-t).
26
Pareto perfection restricts threats
Some efficient equilibria cannot be supported anymore under Pareto-perfection.
It restricts the set of threats, and thereby it is more difficult to keep players on the equilibrium path.
Example
27
Example Pareto-perfection
c1 c2 c3 c4
R1 0,0 1,4 0,0 6,0
R2 4,1 0,0 0,0 0,0
R3 0,0 0,0 3,3 0,0
R4 0,6 0,0 0,0 5,5
Three pure-strategy equilibria in G(1), with pay-offs (4,1), (1,4) and (3,3).
In G(2) without discounting, a pay-off of 8 is possible. Unique element in R(2).
Without the restriction to Pareto perfection, a pay-off of 13 is possible in G(3).
With Pareto perfection, no threat is possible in the first period of G(3); one has to play a stage game equilibrium.
Equilibrium play alternates between odd and even periods under Pareto perfection.
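The claims about G(1) and G(2) can be checked directly. A sketch without discounting, looking at the row player only (the game is symmetric); it assumes the path (R4,c4) followed by (R3,c3), with a deviator punished by his worst stage-game equilibrium, which is one way to reach the pay-off of 8:

```python
# PAYOFFS[row][col] = (pay-off to row player, pay-off to column player)
PAYOFFS = [
    [(0, 0), (1, 4), (0, 0), (6, 0)],  # R1
    [(4, 1), (0, 0), (0, 0), (0, 0)],  # R2
    [(0, 0), (0, 0), (3, 3), (0, 0)],  # R3
    [(0, 6), (0, 0), (0, 0), (5, 5)],  # R4
]
N = len(PAYOFFS)


def pure_nash_equilibria():
    """Cells (row, col), 0-indexed, where no player gains by deviating."""
    result = []
    for r in range(N):
        for c in range(N):
            u1, u2 = PAYOFFS[r][c]
            row_ok = all(PAYOFFS[r2][c][0] <= u1 for r2 in range(N))
            col_ok = all(PAYOFFS[r][c2][1] <= u2 for c2 in range(N))
            if row_ok and col_ok:
                result.append((r, c))
    return result


def two_period_check():
    """Row player's total in G(2): following the path versus deviating in period 1."""
    equilibria = pure_nash_equilibria()
    on_path = PAYOFFS[3][3][0] + PAYOFFS[2][2][0]
    best_deviation = max(PAYOFFS[r][3][0] for r in range(N) if r != 3)
    worst_equilibrium = min(PAYOFFS[r][c][0] for r, c in equilibria)
    return on_path, best_deviation + worst_equilibrium


if __name__ == "__main__":
    print(pure_nash_equilibria())
    print(two_period_check())
```

Following the path gives 5 + 3 = 8, while the best deviation gives 6 + 1 = 7, so the threat of switching to the deviator's worst equilibrium sustains (R4,c4) in the first period.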