Similarity and Substitutability Paulo Natenzon Washington University in Saint Louis 14th SAET Conference at Waseda University August 21, 2014


Page 1

Similarity and Substitutability

Paulo Natenzon
Washington University in Saint Louis

14th SAET Conference at Waseda University
August 21, 2014

Page 2

Motivation

• Goal: useful parametrizations of stochastic data

• Important to understand the behavioral content of each parameter

• Two Thurstonian models

• Standard Probit and Bayesian Probit: X ∼ N(µ, Σ)

• Correlation matters under low precision and close to indifference

• Substitutability in the Standard Probit

• Similarity in the Bayesian Probit

Page 3

Louis Leon Thurstone (1887–1955)

Page 4

‘Law’ of Comparative Judgement (Thurstone, 1927)

The ‘law’ is a model of binary comparisons:

Alternatives ordered in a psychological continuum

• gradations of gray, weight, excellence

The discriminal process for each alternative: $X_i = \mu_i + \varepsilon_i$

$\varepsilon_i$: discriminal deviation, $\varepsilon_i \sim N(0, \sigma_i^2)$

$\sigma_i$: discriminal dispersion

$\rho(1, \{1, 2\}) = P\{X_1 > X_2\}$
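A minimal sketch of the binary comparison above, in Python. It assumes the two discriminal deviations are independent unless a correlation `r` is supplied; the names `thurstone_binary`, `normal_cdf` and the parameter `r` are illustrative and do not come from the slides.

```python
import math

def normal_cdf(z):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def thurstone_binary(mu1, mu2, sigma1, sigma2, r=0.0):
    """P{X1 > X2} when X_i = mu_i + eps_i and eps_i ~ N(0, sigma_i^2).

    r is the correlation between eps_1 and eps_2 (assumed 0 by default).
    X1 - X2 is normal with mean mu1 - mu2 and variance
    sigma1^2 + sigma2^2 - 2 * r * sigma1 * sigma2.
    """
    var_diff = sigma1 ** 2 + sigma2 ** 2 - 2.0 * r * sigma1 * sigma2
    if var_diff <= 0.0:
        raise ValueError("the difference X1 - X2 must have positive variance")
    return normal_cdf((mu1 - mu2) / math.sqrt(var_diff))

if __name__ == "__main__":
    # Equal means: the choice is a coin flip regardless of dispersion.
    print(thurstone_binary(0.0, 0.0, 1.0, 2.0))
    # A higher position on the psychological continuum is chosen more often.
    print(thurstone_binary(1.0, 0.0, 1.0, 1.0))
```

With a positive gap in means, raising `r` shrinks the variance of the difference and pushes the choice probability further from one half.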

Page 5

Example: choice of lotteries

Experimental data from Soltani, De Martino and Camerer (2012)

Each subject made 160 pairwise choices

In each trial, eight seconds to evaluate the options:

[...] of this period, one of the three gambles was removed from the screen and subjects had only 2 sec to choose one of the two remaining gambles in a selection period (Figure 1A). Two of three initial gambles were the low-risk gamble (target T) and the subject-tailored high-risk gamble (competitor C). The third gamble was the decoy gamble (D) that was randomly chosen from a set of gambles with a wide range of attribute values (see Figure 1B and Methods for more details).

On two thirds of the trials (regular trials), the decoy gamble was removed after the evaluation period and the subject had to choose between T and C gambles. On the remaining one third of the trials (catch trials), either the T or C gamble disappeared. The catch trials were included to conceal the underlying structure of the task and were subsequently discarded from the analysis (since they do not provide choices between T and C). Therefore, we only analyze the regular trials to investigate how the preference between T and C gambles changed as a function of a decoy that was present at the evaluation period, but not available in the selection period.

Having a long evaluation period (8 sec) and a short selection period (2 sec) forces subjects to evaluate and "pre-choose" options by ranking them during the evaluation period; therefore, they would be prepared to make a rapid choice in the 2-sec selection period. This ensures that presentation of the decoy during the evaluation period can influence context-dependent processes of assigning values enough to have a behavioral impact during rapid selection. This "phantom decoy" design allowed us to study the effect of dominant decoys (decoys that are better than either T or C gambles) as well as dominated decoys (see below).

We found that subjects' preference between T and C was systematically influenced by the attributes of the decoys. The first indication of the decoy influence on the subsequent choice was that the majority of our subjects did not select T and C gambles equally (Figure S4 in Text S1), though they were constructed (from the estimation task data) to be equally preferable.

As in previous studies, we divided trials into 6 groups (D1 to D6) based on the position of the decoy (Figure 1B). Decoys in positions D1 and D4 are called the asymmetrically dominant decoys because they dominate either T or C (they are less risky and also have larger reward magnitudes), but do not dominate both. Decoys in positions D3 and D6 are asymmetrically dominated decoys since they are either worse than the target (D6) or the competitor (D3) on both dimensions (i.e. they are more risky and also have smaller reward magnitudes), but are only dominated by one of T and C [6,10]. Finally, decoys in positions D2 and D5 are similar to the target and the competitor and are better on one dimension but worse on another. They are called similar decoys [28,33].

Figure 1. Experimental design and behavioral results. (A) Timeline of the experiment during the decoy task. A trial started with a fixation point, followed by the presentation of three options (monetary gambles) on the screen for 8 sec (evaluation period). These gambles were the target (T) and the competitor (C) gambles, tailored to be equally preferable, and a third gamble, the decoy (D). At the end of evaluation period, one of the three gambles was removed from the screen and subjects had only 2 sec to choose one of the two remaining gambles by pressing a button (selection period). (B) Positions of decoys with respect to T and C. Decoys were presented in different locations of the attribute space: probability (dimension 1) and magnitude (dimension 2). For data analysis, decoys were grouped into 6 locations, depending on their position with respect to the closest gamble to them. Decoys at D1 and D4 regions are referred to as the asymmetrically dominant. Decoys at D3 and D6 regions are referred to as the asymmetrically dominated. Finally, decoys at D2 and D5 regions are referred to as the similar decoys. (C, D) Modulation of preference for the target, and the decoy efficacy as a function of different decoys. The average of modulation for each decoy is plotted in black (error bars are the s.e.m.) and the gray symbols show the value for individual subjects. The star on a given decoy location shows that the modulation for that decoy was significantly different from zero (Wilcoxon signed rank test, p < 0.05). Decoy effects were significant for all decoys except D2 decoys.

doi:10.1371/journal.pcbi.1002607.g001

Neural Model of Context-Dependent Choice

PLoS Computational Biology | www.ploscompbiol.org 3 July 2012 | Volume 8 | Issue 7 | e1002607

And two seconds to choose

Page 6

Standard Probit and Bayesian Probit

Same parameters for both models: $X \sim N\!\left(\mu, \tfrac{1}{t}\Sigma\right)$

• Random variable $X_i$ for each alternative $i$

• $X_1, \dots, X_n$ jointly normally distributed with

  • $\mu_i \in \mathbb{R}$ expectations

  • $\sigma_{ij} \in [0, 1]$ correlations

  • $1/t > 0$ equal variance

Standard probit: $\ddot{\rho}^{\mu\sigma}_t(j, B) = P\{X_j \ge X_k, \forall k \in B\}$

Bayesian probit: $\rho^{\mu\sigma}_t(j, B) = P\{m_j \ge m_k, \forall k \in B\}$

where $m = \left[I^{-1} + t\Sigma^{-1}\right]^{-1}\left[\Sigma^{-1}X\right]$ are mean posterior beliefs

(prior is iid standard normal)
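The two definitions can be compared by simulation. The sketch below is a rough Monte Carlo illustration in Python, not the estimation procedure of the talk. It draws $X \sim N(\mu, \tfrac{1}{t}\Sigma)$, picks the largest $X_j$ for the standard probit, and picks the largest $m_j$ for the Bayesian probit. It uses the identity $[I + t\Sigma^{-1}]^{-1}\Sigma^{-1} = (\Sigma + tI)^{-1}$, so $m$ is obtained by solving $(\Sigma + tI)\,m = X$. The function names, the number of draws and the seed are all assumptions made for the example.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L' = a, for a symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def simulate(mu, sigma, t, draws=20000, seed=0):
    """Monte Carlo choice frequencies: (standard probit, Bayesian probit)."""
    rng = random.Random(seed)
    n = len(mu)
    low = cholesky(sigma)
    # Sigma + t I, used for the posterior mean m = (Sigma + t I)^{-1} X.
    shifted = [[sigma[i][j] + (t if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    std_counts, bayes_counts = [0] * n, [0] * n
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # X ~ N(mu, Sigma / t)
        x = [mu[i] + sum(low[i][k] * z[k] for k in range(i + 1)) / math.sqrt(t)
             for i in range(n)]
        m = solve(shifted, x)
        std_counts[max(range(n), key=lambda i: x[i])] += 1
        bayes_counts[max(range(n), key=lambda i: m[i])] += 1
    return [c / draws for c in std_counts], [c / draws for c in bayes_counts]

if __name__ == "__main__":
    sigma = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.0], [0.5, 0.0, 1.0]]
    print(simulate([0.0, 0.0, 0.0], sigma, t=0.01))
```

With two alternatives the two frequency vectors should agree draw by draw; with three alternatives and small `t` they can differ, which is the point of the later slides.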

Page 7

Binary Choice

Proposition

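$$\rho^{\mu\sigma}_t(i, \{i, j\}) = \Phi\!\left(\frac{\sqrt{t}}{\sqrt{2}}\,\frac{\mu_i - \mu_j}{\sqrt{1 - \sigma_{ij}}}\right) = \ddot{\rho}^{\mu\sigma}_t(i, \{i, j\})$$

• $\Phi$ is the standard normal cdf

• Equivalence for binary choice data

• Same estimation procedures

• Distinct interpretation for $\sigma$ needs more alternatives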
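A short derivation sketch of the proposition, assuming the parametrization $X \sim N(\mu, \tfrac{1}{t}\Sigma)$ with unit diagonal in $\Sigma$. The Bayesian step uses the identity $[I + t\Sigma^{-1}]^{-1}\Sigma^{-1} = (\Sigma + tI)^{-1}$, which is not stated on the slides.

```latex
\begin{align*}
X_i - X_j &\sim N\!\left(\mu_i - \mu_j,\ \tfrac{1}{t}\,(1 + 1 - 2\sigma_{ij})\right)
  = N\!\left(\mu_i - \mu_j,\ \tfrac{2(1 - \sigma_{ij})}{t}\right), \\
\ddot{\rho}^{\mu\sigma}_t(i, \{i, j\}) &= P\{X_i - X_j \ge 0\}
  = \Phi\!\left(\frac{\mu_i - \mu_j}{\sqrt{2(1 - \sigma_{ij})/t}}\right)
  = \Phi\!\left(\frac{\sqrt{t}}{\sqrt{2}}\,\frac{\mu_i - \mu_j}{\sqrt{1 - \sigma_{ij}}}\right), \\
m &= (\Sigma + tI)^{-1} X
  \;\Longrightarrow\;
  m_i - m_j = \frac{X_i - X_j}{1 + t - \sigma_{ij}} \quad \text{(two alternatives)}.
\end{align*}
```

Since $1 + t - \sigma_{ij} > 0$, the posterior means are ranked the same way as the signals, so both models assign the same probability to choosing $i$ from $\{i, j\}$.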

Page 9

Effect of correlation

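$$\rho^{\mu\sigma}_t(i, \{i, j\}) = \Phi\!\left(\frac{\sqrt{t}}{\sqrt{2}}\,\frac{\mu_i - \mu_j}{\sqrt{1 - \sigma_{ij}}}\right) = \ddot{\rho}^{\mu\sigma}_t(i, \{i, j\})$$

[Figure: two panels plotting choice probability (vertical axis, 0 to 1) against $\sqrt{t}\,(\mu_1 - \mu_2)$ (horizontal axis, −5 to 5)]

• $\sigma$ matters when $\sqrt{t}(\mu_i - \mu_j)$ is small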
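A small numerical illustration of this point, as a Python sketch of the binary formula. The function names and the particular values of the correlation (0 and 0.9) are chosen only for the example.

```python
import math

def binary_probit(mu_i, mu_j, sigma_ij, t):
    """Phi( sqrt(t)/sqrt(2) * (mu_i - mu_j) / sqrt(1 - sigma_ij) )."""
    if not (sigma_ij < 1.0 and t > 0.0):
        raise ValueError("need sigma_ij < 1 and t > 0")
    z = math.sqrt(t / 2.0) * (mu_i - mu_j) / math.sqrt(1.0 - sigma_ij)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def correlation_spread(gap, t, low=0.0, high=0.9):
    """Change in choice probability when sigma_ij moves from low to high."""
    return binary_probit(gap, 0.0, high, t) - binary_probit(gap, 0.0, low, t)

if __name__ == "__main__":
    for gap in (0.0, 0.5, 5.0):
        for s in (0.0, 0.5, 0.9):
            p = binary_probit(gap, 0.0, s, 1.0)
            print(f"gap={gap:>4}  sigma={s:.1f}  p={p:.4f}")
```

At exact indifference the probability is one half for every correlation. For a small positive gap the correlation moves the probability substantially, and for a large gap the probability is already close to one and the correlation has almost no visible effect.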

Page 10

Substitutability in Standard Probit

Let µ1 = µ2 = µ3.

Proposition

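$\ddot{\rho}^{\mu\sigma}_t(i, \{i, j, k\}) \ge \ddot{\rho}^{\mu\sigma}_t(j, \{i, j, k\})$ if and only if $\sigma_{ik} \le \sigma_{jk}$.

Example

$B = \{1, 2, 3\}$

$\ddot{\rho}^{\mu\sigma}_t(1, B) = 0.4$
$\ddot{\rho}^{\mu\sigma}_t(2, B) = 0.3$
$\ddot{\rho}^{\mu\sigma}_t(3, B) = 0.3$

$\Longrightarrow \sigma_{12} = \sigma_{13} < \sigma_{23}$

Alternatives 2 and 3 have a higher degree of substitutability.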

Page 12

Proof

Let A = {1, 2, 3}.

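$$\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix},\ \frac{1}{t} \begin{pmatrix} 1 & \sigma_{12} & \sigma_{13} \\ \sigma_{12} & 1 & \sigma_{23} \\ \sigma_{13} & \sigma_{23} & 1 \end{pmatrix} \right)$$

$$\ddot{\rho}^{\mu\sigma}_t(1, A) = P\left(\{X_1 > X_2\} \cap \{X_1 > X_3\}\right) = \int_{-\infty}^{\infty} \int_{-\infty}^{x} \int_{-\infty}^{x} \varphi(x, y, z)\, dz\, dy\, dx$$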

closed form expression?

Page 13

Proof

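Let $L_1 = \begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}$

Let $B = (B_1, B_2)$ be standard normal

Let $MM' = L_1 \Sigma L_1' = t\,\mathrm{Var}(L_1 X)$

Then

$$\begin{aligned}
\ddot{\rho}^{\mu\sigma}_t(1, A) &= P\left(\{X_2 - X_1 < 0\} \cap \{X_3 - X_1 < 0\}\right) \\
&= P\{L_1 X < 0\} \\
&= P\{MB < 0\} \\
&= P\left\{ B_1 \le 0 \ \text{and}\ B_2 \le \frac{-B_1\,(1 + \sigma_{23} - \sigma_{12} - \sigma_{13})}{\sqrt{4(1 - \sigma_{12})(1 - \sigma_{13}) - (1 + \sigma_{23} - \sigma_{12} - \sigma_{13})^2}} \right\} \\
&= \frac{1}{4} + \frac{1}{2\pi} \arctan\!\left( \frac{1 + \sigma_{23} - \sigma_{12} - \sigma_{13}}{\sqrt{4(1 - \sigma_{12})(1 - \sigma_{13}) - (1 + \sigma_{23} - \sigma_{12} - \sigma_{13})^2}} \right)
\end{aligned}$$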
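The closed form is easy to evaluate numerically. The Python sketch below assumes equal means, as in the proposition, and uses `math.atan2` instead of a plain arctangent so that a zero denominator does not raise an error. The function name `standard_probit_three` is illustrative.

```python
import math

def standard_probit_three(s12, s13, s23):
    """Choice probabilities (p1, p2, p3) from {1, 2, 3} with equal means.

    Uses 1/4 + (1/(2*pi)) * arctan(k / sqrt(4*a*b - k^2)) from the proof,
    with the indices permuted for alternatives 2 and 3.
    """
    def prob(s_ij, s_ik, s_jk):
        # i is the chosen alternative; j and k are the other two.
        k = 1.0 + s_jk - s_ij - s_ik
        radicand = 4.0 * (1.0 - s_ij) * (1.0 - s_ik) - k * k
        if radicand < 0.0:
            raise ValueError("correlations are not a valid correlation matrix")
        return 0.25 + math.atan2(k, math.sqrt(radicand)) / (2.0 * math.pi)

    return (prob(s12, s13, s23), prob(s12, s23, s13), prob(s13, s23, s12))

if __name__ == "__main__":
    # Independent alternatives: each is chosen with probability 1/3.
    print(standard_probit_three(0.0, 0.0, 0.0))
    # One configuration consistent with the 0.4 / 0.3 / 0.3 example.
    print(standard_probit_three(0.0, 0.0, (math.sqrt(5.0) - 1.0) / 2.0))
```

With $\sigma_{12} = \sigma_{13} = 0$ and $\sigma_{23} = (\sqrt{5} - 1)/2 \approx 0.618$ the sketch returns approximately (0.4, 0.3, 0.3). This is one of many correlation configurations that reproduce the example on the substitutability slide.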

Page 14

Proof

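Since arctan is strictly increasing,

$$\ddot{\rho}^{\mu\sigma}_t(1, \{1, 2, 3\}) > \ddot{\rho}^{\mu\sigma}_t(2, \{1, 2, 3\})$$

if and only if

$$\frac{1 + \sigma_{23} - \sigma_{12} - \sigma_{13}}{\sqrt{4(1 - \sigma_{12})(1 - \sigma_{13}) - (1 + \sigma_{23} - \sigma_{12} - \sigma_{13})^2}} > \frac{1 + \sigma_{13} - \sigma_{12} - \sigma_{23}}{\sqrt{4(1 - \sigma_{12})(1 - \sigma_{23}) - (1 + \sigma_{13} - \sigma_{12} - \sigma_{23})^2}}$$

if and only if (the two denominators are equal, since each radicand expands to $3 - 2(\sigma_{12} + \sigma_{13} + \sigma_{23}) + 2(\sigma_{12}\sigma_{13} + \sigma_{12}\sigma_{23} + \sigma_{13}\sigma_{23}) - \sigma_{12}^2 - \sigma_{13}^2 - \sigma_{23}^2$)

$$1 + \sigma_{23} - \sigma_{12} - \sigma_{13} > 1 + \sigma_{13} - \sigma_{12} - \sigma_{23}$$

if and only if

$$\sigma_{23} > \sigma_{13}$$

Q.E.D.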

Page 15

Similarity in Bayesian Probit

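Let $\rho^{\mu\sigma}_0(i, B) = \lim_{t \to 0^+} \rho^{\mu\sigma}_t(i, B)$

Proposition

$\rho^{\mu\sigma}_0(i, \{i, j, k\}) \ge \rho^{\mu\sigma}_0(j, \{i, j, k\})$ if and only if $\sigma_{ik} \ge \sigma_{jk}$.

Example

$B = \{1, 2, 3\}$

$\rho^{\mu\sigma}_0(1, B) = 0.4$
$\rho^{\mu\sigma}_0(2, B) = 0.3$
$\rho^{\mu\sigma}_0(3, B) = 0.3$

$\Longrightarrow \sigma_{23} < \sigma_{12} = \sigma_{13}$

Alternatives 2 and 3 have a lower degree of similarity.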
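A Python sketch of the limit $t \to 0^+$ under an assumption that is not on the slides. As $t$ shrinks, $m = (\Sigma + tI)^{-1}X$ approaches $\Sigma^{-1}X$ and the noise in $X$ dominates the means, so the ranking of posterior means is treated here as the ranking of a mean-zero normal vector with covariance proportional to $\Sigma^{-1}$. The probability that one coordinate is the largest then follows from the standard arcsine formula for a bivariate normal orthant. All function names are illustrative.

```python
import math

def inverse_3x3(s):
    """Inverse of a symmetric 3x3 matrix via cofactors."""
    (a, b, c), (_, d, e), (_, _, f) = s
    det = a * (d * f - e * e) - b * (b * f - c * e) + c * (b * e - c * d)
    if det <= 0.0:
        raise ValueError("expected a positive definite Sigma")
    cof = [
        [d * f - e * e, c * e - b * f, b * e - c * d],
        [c * e - b * f, a * f - c * c, b * c - a * e],
        [b * e - c * d, b * c - a * e, a * d - b * b],
    ]
    return [[x / det for x in row] for row in cof]

def top_choice_prob(q, i):
    """P{Y_i > Y_j and Y_i > Y_k} for Y ~ N(0, q), via the arcsine formula."""
    j, k = [x for x in range(3) if x != i]
    var_j = q[j][j] + q[i][i] - 2.0 * q[i][j]   # Var(Y_j - Y_i)
    var_k = q[k][k] + q[i][i] - 2.0 * q[i][k]   # Var(Y_k - Y_i)
    cov = q[j][k] - q[i][j] - q[i][k] + q[i][i]  # Cov(Y_j - Y_i, Y_k - Y_i)
    corr = max(-1.0, min(1.0, cov / math.sqrt(var_j * var_k)))
    return 0.25 + math.asin(corr) / (2.0 * math.pi)

def bayesian_probit_limit(s12, s13, s23):
    """Approximate limit choice probabilities (assumed covariance Sigma^{-1})."""
    sigma = [[1.0, s12, s13], [s12, 1.0, s23], [s13, s23, 1.0]]
    q = inverse_3x3(sigma)
    return tuple(top_choice_prob(q, i) for i in range(3))

if __name__ == "__main__":
    # Alternative 1 is equally correlated with 2 and 3; 2 and 3 are uncorrelated.
    print(bayesian_probit_limit(0.5, 0.5, 0.0))
```

With $\sigma_{12} = \sigma_{13} = 0.5$ and $\sigma_{23} = 0$ the sketch gives roughly (0.40, 0.30, 0.30), in line with the example: the pair with the lower correlation, alternatives 2 and 3, is chosen less often.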

Page 16

Similarity in Bayesian Probit: intuition

Take any symmetric, absolutely continuous prior

1 1 2 2 3 3
2 3 1 3 1 2
3 2 3 1 2 1

If you only learn that

(2 ≻ 3)

1 2 2
2 1 3
3 3 1

or

(3 ≻ 2)

1 3 3
3 1 2
2 2 1

then you never choose alternative 1.
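The counting argument can be reproduced in a few lines of Python. Each column in the tables above is read as one strict ranking, best alternative on top. The sketch assumes all six rankings are equally likely a priori, which is what a symmetric, absolutely continuous prior implies for the induced ranking. It counts how often each alternative is ranked first once a single comparison is learned. The slides phrase the choice in terms of posterior beliefs, so the "ranked first" count is a simpler illustrative stand-in, and the function name `best_given` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def best_given(better, worse, alternatives=(1, 2, 3)):
    """Probability each alternative is ranked first, given `better` beats `worse`.

    Rankings are tuples ordered from best to worst; all strict rankings are
    assumed equally likely before the comparison is learned.
    """
    consistent = [r for r in permutations(alternatives)
                  if r.index(better) < r.index(worse)]
    return {a: Fraction(sum(r[0] == a for r in consistent), len(consistent))
            for a in alternatives}

if __name__ == "__main__":
    print(best_given(2, 3))  # after learning 2 is better than 3
    print(best_given(3, 2))  # after learning 3 is better than 2
```

In both cases alternative 1 is ranked first in one of the three remaining rankings, while the winner of the comparison is ranked first in two of them, so alternative 1 is never the most likely best option.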

Page 17

Conclusion

• Goal: useful parametrizations of stochastic data

• Important to understand the behavioral content of each parameter

• Two Thurstonian models

• Standard Probit and Bayesian Probit: X ∼ N(µ, Σ)

• Correlation matters under low precision and close to indifference

• Substitutability in the Standard Probit

• Similarity in the Bayesian Probit