Decision making
Blaise Pascal (1623–1662)
Probability in games of chance
How much should I bet on ’20’?
E[gain] = Σₓ gain(x)·Pr(x)
Decisions under uncertainty
Maximize expected value (Pascal)
Bets should be assessed according to
Σₓ p(x)·gain(x)
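As a worked illustration of Pascal's rule, here is a minimal Python sketch evaluating the bet on '20'. It assumes a European roulette wheel (37 equally likely pockets, a 35-to-1 payout on a single number); the wheel and payout are my assumptions, not stated on the slides.

```python
# Expected gain of a $1 straight-up bet on '20', assuming European
# roulette: 37 equally likely pockets, 35-to-1 payout on a win.
# E[gain] = sum over outcomes x of gain(x) * Pr(x)

def expected_gain(outcomes):
    """outcomes: list of (gain, probability) pairs."""
    return sum(gain * p for gain, p in outcomes)

bet_on_20 = [
    (35.0, 1 / 37),   # '20' comes up: win 35 times the stake
    (-1.0, 36 / 37),  # any other pocket: lose the stake
]

print(expected_gain(bet_on_20))  # ≈ -0.027: the bet loses ~2.7 cents per dollar
```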
Decisions under uncertainty
The value of an alternative is a monotonic function of:
• Probability of reward
• Magnitude of reward
Do Classical Decision Variables Influence Brain Activity in LIP?
[Figure: location of area LIP (lateral intraparietal area)]
Varying Movement Value
Platt and Glimcher 1999
What Influences LIP?
Related to Movement Desirability
• Value/Utility of Reward
• Probability of Reward
Varying Movement Probability
Decisions under uncertainty
Neural activity in area LIP depends on:
• Probability of reward
• Magnitude of reward
Dorris and Glimcher 2004
Relative or absolute reward?
[Figure: choice targets worth $X, $Y, $Z compared with targets worth $A, $B, $C, $D, $E]
Consider a set of alternatives X and a binary relation ≽ on it, interpreted as "preferred at least as".
Consider the following three axioms:
C1. Completeness: for every x, y ∈ X, x ≽ y or y ≽ x
C2. Transitivity: for every x, y, z ∈ X, x ≽ y and y ≽ z imply x ≽ z
C3. Separability
Maximization of utility
Theorem: a binary relation ≽ can be represented by a real-valued function u (that is, x ≽ y if and only if u(x) ≥ u(y)) if and only if it satisfies C1–C3
Under these conditions, the function u is unique up to increasing transformation (Cantor 1915)
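For a finite set of alternatives the representation can be built directly: score each alternative by how many alternatives it is weakly preferred to. A minimal sketch; the function names and the snack example are mine, not from the slides.

```python
# For a finite set X, a complete and transitive relation "weakly preferred"
# can be represented by a utility function u: count, for each x, how many
# alternatives it is weakly preferred to. Then x ≽ y  iff  u(x) >= u(y).

def utility(X, weakly_prefers):
    """X: list of alternatives; weakly_prefers(x, y) -> bool encodes x ≽ y."""
    return {x: sum(weakly_prefers(x, y) for y in X) for x in X}

# Example: preferences over snacks induced by a hidden score.
score = {"apple": 2, "cake": 5, "kale": 1}
prefers = lambda x, y: score[x] >= score[y]

u = utility(["apple", "cake", "kale"], prefers)
print(u)  # {'apple': 2, 'cake': 3, 'kale': 1} -- any increasing transform works too
```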
A face utility function?
Is there an explicit representation of the 'value' of a choice in the brain?
Neurons in the orbitofrontal cortex encode value
Padoa-Schioppa and Assad, 2006
Examples of neurons encoding the chosen value
A neuron encoding the value of A
A neuron encoding the value of B
A neuron encoding the chosen juice taste
Encoding takes place at different times: post-offer (a, d, e, blue), pre-juice (b, cyan), post-juice (c, f, black)
How does the brain learn the values?
The computational problem
The goal is to maximize the expected sum of rewards:
V = E[Σₜ rₜ], summed until the end of the trial
The value of the state S1 depends on the policy: if the animal chooses 'right' at S1, then
V(S1) = R_ice cream + V(S2)
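To make the recursion concrete, here is a toy sketch in Python. The state names, rewards, and downstream values are invented for illustration; only the rule r + V(S′) comes from the slide.

```python
# V(S1) depends on the policy: choosing an action at S1 yields its
# immediate reward plus the value of the next state, r + V(S').
# The numbers below are illustrative, not from the slides.

R = {("S1", "left"): 0.0, ("S1", "right"): 1.0}    # immediate rewards
next_state = {("S1", "left"): "S2a", ("S1", "right"): "S2b"}
V = {"S2a": 0.5, "S2b": 0.2}                       # known downstream values

def action_value(s, a):
    return R[(s, a)] + V[next_state[(s, a)]]

# The optimal policy at S1 picks the action with the larger r + V(S'):
best = max(["left", "right"], key=lambda a: action_value("S1", a))
print(best, action_value("S1", best))  # right 1.2
```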
How to find the optimal policy in a complicated world?
• If the values of the different states are known, then this task is easy:
V(Sₜ) = rₜ + V(Sₜ₊₁)
where
V(Sₜ) = the value of the state at time t
rₜ = the (average) reward delivered at time t
V(Sₜ₊₁) = the value of the state at time t+1
How can the values of the different states be learned?
Update each visited state by
V(Sₜ) ← V(Sₜ) + α·δₜ
where
δₜ = rₜ + V(Sₜ₊₁) − V(Sₜ)
is the TD error and α is the learning rate.
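A one-step version of this update as a Python sketch; `alpha` stands for the learning rate α, a notation choice of mine.

```python
# One step of TD(0) learning: delta_t = r_t + V(S_{t+1}) - V(S_t),
# then V(S_t) <- V(S_t) + alpha * delta_t.

def td_update(V, s, r, s_next, alpha=0.1):
    delta = r + V[s_next] - V[s]   # the TD (reward prediction) error
    V[s] += alpha * delta
    return delta

V = {"S7": 0.0, "S8": 0.0, "S9": 0.0}
print(td_update(V, "S8", r=1.0, s_next="S9"))  # delta = 1.0; V["S8"] becomes 0.1
```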
The TD (temporal difference) learning algorithm
Schultz, Dayan and Montague, Science, 1997
[Schematic: each trial steps through states 1–9; the CS appears at the start of the trial and a reward of size 1 is delivered in state 8]
Before trial 1:
V(S₁) = V(S₂) = … = V(S₉) = 0
In trial 1:
• no reward in states 1–7:
δₜ = rₜ + V(Sₜ₊₁) − V(Sₜ) = 0
V(Sₜ) ← V(Sₜ) + α·δₜ = 0
• reward of size 1 in state 8:
δₜ = rₜ + V(S₉) − V(S₈) = 1
V(S₈) ← V(S₈) + α·δₜ = α
Before trial 2:
V(S₁) = V(S₂) = … = V(S₇) = V(S₉) = 0, V(S₈) = α
In trial 2, for states 1–6:
δₜ = rₜ + V(Sₜ₊₁) − V(Sₜ) = 0
V(Sₜ) ← V(Sₜ) + α·δₜ = 0
For state 7:
δₜ = rₜ + V(S₈) − V(S₇) = α
V(S₇) ← V(S₇) + α·δₜ = α²
For state 8:
δₜ = rₜ + V(S₉) − V(S₈) = 1 − α
V(S₈) ← V(S₈) + α·δₜ = α + α(1 − α) = 1 − (1 − α)²
Before trial 3:
V(S₁) = … = V(S₆) = V(S₉) = 0, V(S₇) = α², V(S₈) = 2α − α² = 1 − (1 − α)²
In trial 3, for states 1–5:
δₜ = rₜ + V(Sₜ₊₁) − V(Sₜ) = 0
V(Sₜ) ← V(Sₜ) + α·δₜ = 0
For state 6:
δₜ = rₜ + V(S₇) − V(S₆) = α²
V(S₆) ← V(S₆) + α·δₜ = α³
For state 7:
δₜ = rₜ + V(S₈) − V(S₇) = (2α − α²) − α² = 2α − 2α²
V(S₇) ← V(S₇) + α·δₜ = 3α² − 2α³
For state 8:
δₜ = rₜ + V(S₉) − V(S₈) = 1 − (2α − α²) = (1 − α)²
V(S₈) ← V(S₈) + α·δₜ = 1 − (1 − α)³
After many trials:
V(S₁) = … = V(S₈) = 1, V(S₉) = 0
so that in every state
δₜ = rₜ + V(Sₜ₊₁) − V(Sₜ) = 0,
except at the CS, whose time is unknown: because the trial onset cannot be predicted, a positive TD error survives at the CS.
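The trial-by-trial numbers above can be reproduced with a short simulation: states 1–9, a reward of size 1 in state 8, and sequential TD updates within each trial. A sketch, assuming α = 0.1.

```python
# TD(0) on the 9-state trial from the slides: states 1..9, reward of
# size 1 in state 8, V(S9) fixed at 0, learning rate alpha.

alpha = 0.1
V = [0.0] * 10          # V[1]..V[9]; index 0 unused

def run_trial():
    for s in range(1, 9):             # states 1..8; state 9 ends the trial
        r = 1.0 if s == 8 else 0.0    # reward of size 1 in state 8
        delta = r + V[s + 1] - V[s]   # TD error
        V[s] += alpha * delta

run_trial()
print(V[8], alpha)                    # after trial 1: V(S8) = alpha
run_trial()
print(V[7], alpha**2)                 # after trial 2: V(S7) = alpha^2
print(V[8], 1 - (1 - alpha)**2)       # and V(S8) = 1 - (1 - alpha)^2

for _ in range(10_000):
    run_trial()
print([round(v, 3) for v in V[1:]])   # V(S1..S8) -> 1, V(S9) stays 0
```

After convergence δₜ = 0 at every within-trial transition; the remaining prediction error sits at the unpredictable CS onset, which is what shifts the response from the reward to the CS (Schultz, Dayan and Montague, 1997).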
Schultz, 1998
"We found that these neurons encoded the difference between the current reward and a weighted average of previous rewards, a reward prediction error, but only for outcomes that were better than expected."
Bayer and Glimcher, 2005