Transcript of: High-Confidence Predictions Under Adversarial Uncertainty

Andrew Drucker, IAS

Slides: people.csail.mit.edu/andyd/prediction_web_slides.pdf

High-Confidence Predictions Under Adversarial Uncertainty

Andrew Drucker

IAS

Andrew Drucker, IAS High-Confidence Predictions 1/47

Setting: prediction on binary sequences

x = (x1, x2, x3, . . .) ∈ {0,1}^ω

Bits of x revealed sequentially.

Goal: make some nontrivial prediction about unseen bits of sequence x, given bits seen so far.

Setting: prediction on binary sequences

x = (x1, x2, x3, . . .)

Question: What kinds of assumptions on x are needed to make interesting predictions?

Our message: Surprisingly weak ones.

Modeling questions

Prediction: a game between the Predictor and Nature.

What kind of opponent is Nature?

Probabilistic models

x = (x1, x2, x3, . . .)

x ∼ D,

where D is some known probability distribution.

Problem: how to choose correct D for realistic applications?

Classes of assumptions

x = (x1, x2, x3, . . .)

Adversarial models: x ∈ A,

where A ⊆ {0,1}^ω is some known set.

Interested in worst-case performance.

These assumptions can be quite “safe”...

Our focus today.

Prior work on adversarial prediction

Gales and fractal dimension [Lutz ‘03; Athreya, Hitchcock, Lutz, Mayordomo ‘07]

Gales: a class of betting strategies, to bet on unseen bits of x ∈ A.

Goal: reach a fortune of ∞, on any x ∈ A.

The “handicap” we need can be related to measures of fractal dimension for A...

Prior work on adversarial prediction

Ignorant forecasting

What is the chance of rain tomorrow?

Basic test of a meteorologist: “calibration.”

If governing distribution D is known, easy to achieve with Bayes’ rule...

But: calibration can also be achieved by an ignorant forecaster! [Foster, Vohra ’98]

Prior work on adversarial prediction

x = (x1, x2, x3, . . .)

These works’ goal: long-term, overall predictive success.

Our focus: make a single prediction with high confidence.

0-prediction

Our main scenario: want to predict a single 0 among the bits of x.

(We lose if prediction fails OR if we wait forever!)

Interpretation: choose a time to “safely” perform some action;

[xt = 0] means “time t is safe.”

Possible assumptions

ε-biased arrivals assumption: bits of x independent, with

Pr[xt = 1] = ε.

Best strategy succeeds with prob. 1− ε.
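This baseline can be sanity-checked by simulation (illustrative code, not from the talk): under ε-biased arrivals the Predictor may as well predict immediately, succeeding exactly when x1 = 0.

```python
import random

def biased_bit(eps, rng):
    """One bit of the eps-biased model: Pr[bit = 1] = eps, independently."""
    return 1 if rng.random() < eps else 0

# Predict "x_1 = 0" at time 1; success iff the first bit is 0.
rng = random.Random(0)
eps, trials = 0.1, 20000
wins = sum(biased_bit(eps, rng) == 0 for _ in range(trials))
print(wins / trials)  # concentrates near 1 - eps = 0.9
```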

Possible assumptions

Very strong assumption...

Idea (not new): study adversarial “relaxations” of the ε-biased model.

Possible assumptions

Let Nt := x1 + . . . + xt.

ε-sparsity assumption: say that x is ε-sparse if

lim sup_{t→∞} Nt/t ≤ ε.

Possible assumptions

Let Nt := x1 + . . . + xt.

ε-weak sparsity assumption: say that x is ε-weakly sparse if

lim inf_{t→∞} Nt/t ≤ ε.
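A finite-prefix illustration of the gap between the two notions (a sketch only: the definitions are asymptotic, and the example sequence is ours, not the talk's):

```python
def running_densities(bits):
    """N_t / t for t = 1..len(bits), where N_t = x_1 + ... + x_t."""
    out, total = [], 0
    for t, b in enumerate(bits, start=1):
        total += b
        out.append(total / t)
    return out

# Long 0-blocks drive N_t/t down (small lim inf); long 1-blocks drive it
# back up (large lim sup). A weakly sparse x may oscillate like this forever,
# while an eps-sparse x must keep N_t/t below eps in the limit.
x = [0] * 100 + [1] * 100 + [0] * 400
d = running_densities(x)
print(min(d), max(d), d[-1])
```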

Our main result

Theorem

For any ε, γ > 0, there is a (randomized) 0-prediction strategy Sε,γ that succeeds with prob. ≥ 1 − ε − γ,

on any ε-weakly sparse sequence.

Can do nearly as well as under ε-biased arrivals!

(Adversary’s sequence gets fixed before randomness in Sε,γ ...)

Proof ideas

Divide sequence into “epochs”:

(r-th epoch of length Kr = Θ(r²).)

Run a separate 0-prediction algorithm for each individual epoch.

Proof ideas

Easy claim: x is ε-weakly sparse

⟹ ∃ a subsequence of “nice” epochs, whose 1-densities are at most ε + γ/3.

Let ε′ = ε + γ/2.

Proof ideas

Idea: give an algorithm S with the properties:

1 Makes a 0-prediction with noticeable prob. on each nice epoch;

2 On every epoch,

Pr[true prediction] ≥ ((1 − ε′)/ε′) · Pr[false prediction].

(Would achieve our goal!)

Proof ideas

Pr[true prediction] ≥? ((1 − ε′)/ε′) · Pr[false prediction]

Whoops—can’t achieve this!

Modified goal: an upper bound

((1 − ε′)/ε′) · Pr[false prediction] − Pr[true prediction] ≤ (small)

Proof ideas

Pr[true prediction] ≥? ((1 − ε′)/ε′) · Pr[false prediction]

Whoops—can’t achieve this!

Modified goal: an upper bound

((1 − ε′)/ε′) · Pr[false prediction] − Pr[true prediction] ≤ O(1/Kr)

Proof ideas

During the r-th epoch, alg. maintains a stack of “chips” (initially empty);

stack’s height ⟷ algorithm’s “courage” to predict that the next bit of x will be a 0.

Stack dynamics

Assume

ε′ = p/d = 1 − q/d.

Observe a 0: add p “courage chips.”

Observe a 1: remove q chips.

e.g., p = 1:

Making predictions

Let Ht = stack height after observing first t bits of the r-th epoch.

Overall algorithm for epoch r:

1 Choose t∗ uniformly from {1, 2, . . . , Kr};
2 Observe first t∗ − 1 bits;
3 Predict a 0 on step t∗ with probability Ht∗−1 / (d · Kr), else make no prediction this epoch.
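The epoch algorithm is concrete enough to run. The sketch below (our reconstruction from the slides; the particular p, q and the test epoch are illustrative choices) computes the success and failure probabilities of one epoch exactly and checks them against the per-epoch bound from the analysis.

```python
def epoch_prediction_odds(x, p, q):
    """Exact success/failure probabilities of the chip-stack 0-predictor
    on one epoch x of length K (reconstruction of the slides' algorithm).

    Dynamics: a 0 adds p chips, a 1 removes q chips (height never negative).
    Prediction: t* is uniform on {1,...,K}; after seeing t*-1 bits, predict
    "x_{t*} = 0" with probability H_{t*-1} / (d*K), where d = p + q.
    """
    d, K = p + q, len(x)
    H = 0                                   # stack height H_{t-1}
    p_success = p_failure = 0.0
    for bit in x:
        prob_predict = (1.0 / K) * (H / (d * K))
        if bit == 0:
            p_success += prob_predict
            H += p
        else:
            p_failure += prob_predict
            H = max(0, H - q)
    return p_success, p_failure

# Epoch with 1-density 0.2, below p/d = 1/4 for p = 1, q = 3:
x = [0, 0, 0, 0, 1] * 40
s, f = epoch_prediction_odds(x, p=1, q=3)
# Per-epoch bound from the analysis: (q/p)*Pr[failure] - Pr[success] <= q/(d*K).
print(s, f, 3 * f - s, 3 / (4 * len(x)))
```

Here 3·f − s comes out negative, comfortably inside the q/(d·Kr) slack.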

Analysis ideas

1 If fraction of 1s in r-th epoch is ≤ ε + γ/3 < p/d,

a 0-prediction is made in epoch r with Ω(1) prob.

=⇒ Eventually (in some epoch), a prediction is made.

Analysis ideas

2 To compare odds of correct and incorrect 0-predictions,

analyze each chip’s contribution.

Analysis ideas

Intuition:

If a chip remains on the stack long enough, the fraction of 1s while it’s on is ≲ p/d = ε′.

GOOD! (Contributes mostly to successful predictions.)

Total contribution to failure probability of other (“bad”) chips is small.

We can analyze all chips in a simple, unified way...

Analysis ideas

Fix attention to a chip c on input x.

Let zeros(c) (ones(c)) denote the number of zeros (ones) appearing at steps where c is on the stack.

Analysis ideas

Chip c’s contribution to success and failure probabilities:

zeros(c) / (d · Kr²),   ones(c) / (d · Kr²).

To compare: show that zeros(c) and ones(c) obey a linear inequality...

Analysis ideas

Claim: q · ones(c) − p · zeros(c) ≤ q.

Proof: LHS bounded by the net loss in stack height between first appearance of c and (possible) removal...

c is removed along with ≤ q other chips!

Let’s sum over all c ...

Analysis ideas

Summing over all c (at most p · Kr chips total):

q · Σ_c ones(c) − p · Σ_c zeros(c) ≤ p · q · Kr.

Q.E.D.

Analysis ideas

Summing over all c (at most p · Kr chips total):

(q/p) · Σ_c ones(c) / (d · Kr²) − Σ_c zeros(c) / (d · Kr²) ≤ q / (d · Kr).

(q/p) · Pr[failure in epoch r] − Pr[success in epoch r] ≤ O(Kr⁻¹) = O(r⁻²).

Q.E.D.

Analysis ideas

Since ε′ = p/d and 1 − ε′ = q/d, we have (1 − ε′)/ε′ = q/p, so:

((1 − ε′)/ε′) · Pr[failure in epoch r] − Pr[success in epoch r] ≤ O(Kr⁻¹) = O(r⁻²).

Q.E.D.

Also in the paper

Bit prediction for broader classes of assumptions:

E.g., predict a bit (0 or 1), under the assumption that a certain word appears only rarely.

General statement involves finite automata.

Also in the paper

Also: high-confidence predictions under no assumptions on x!

Ignorant interval-forecasting

Sequence x ∈ {0,1}^ω: now completely arbitrary.

Goal: predict the fraction of 1s in an unseen interval, with high accuracy and high confidence.

(Huh?)

Ignorant interval-forecasting

Our “hook”—we get to choose the position and size of the prediction-interval.

Interval-forecaster alg.: makes a prediction of the form:

“A p fraction of the next N bits will be 1s.”

Ignorant interval-forecasting

Theorem

For any ε, δ > 0, there is an ignorant interval-forecaster Sε,δ that is accurate to ±ε, with success probability 1 − δ.

Runtime of Sε,δ is finite: 2^O(ε⁻²δ⁻¹).

The approach

Consider x ∈ {0,1}^(2^n), n = ⌈4/(ε²δ)⌉.

Arrange bits of x on leaves of a binary tree T .

The approach

Forecasting algorithm:

1 Choose a random walk W from the root in T of length n − 1;
2 Pick a uniform t∗ ∈ {0, 1, . . . , n − 1}, and select the t∗th vertex along W;
3 Predict that (fraction of 1s in right subtree) = (fraction of 1s in left subtree).
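A small-n sketch of this forecaster (our reconstruction: it enumerates the algorithm's randomness exactly instead of sampling, and only tabulates the error |left-fraction − right-fraction| that the martingale analysis bounds; function names are ours):

```python
def ones_fraction(x, lo, hi):
    """Fraction of 1s in x[lo:hi]."""
    return sum(x[lo:hi]) / (hi - lo)

def forecast_error_distribution(x, n):
    """Errors of the tree forecaster on x in {0,1}^(2^n), small n.

    A vertex at depth t of the complete binary tree T owns an interval of
    2^(n-t) leaves; its left half is the left subtree, its right half the
    right subtree. The forecaster walks randomly from the root and picks a
    uniform depth t* in {0,...,n-1}; its error there is |left - right|.
    Returns that error for every (walk, t*) pair, uniformly weighted.
    """
    errors = []
    for walk in range(2 ** (n - 1)):           # each walk = n-1 left/right steps
        lo, size = 0, 2 ** n
        for t in range(n):                     # vertex at depth t owns x[lo:lo+size]
            half = size // 2
            left = ones_fraction(x, lo, lo + half)
            right = ones_fraction(x, lo + half, lo + size)
            errors.append(abs(left - right))   # error if t* = t on this walk
            if t < n - 1:                      # follow the walk one level down
                if (walk >> t) & 1:
                    lo += half
                size = half
    return errors

x = [1, 0, 1, 1, 0, 0, 1, 0]                   # 2^3 bits, n = 3
errs = forecast_error_distribution(x, 3)
print(sum(errs) / len(errs))                   # average error over the randomness
```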

Analysis idea

For 0 ≤ t ≤ n let

Xt ∈ [0, 1]

denote the fraction of 1s below the t-th vertex of the walk.

Fact: For any fixed bit-sequence x,

X0, X1, . . . , Xn

is a martingale, from which:

E[(Xt+1 − Xt)(Xs+1 − Xs)] = 0,

for all s < t. Thus:

1 ≥ E[(Xn − X0)²] = Σ_{0≤t<n} E[(Xt+1 − Xt)²].
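This identity can be checked exactly on a small tree by enumerating every root-to-leaf walk (the 8-bit x below is an arbitrary example of ours):

```python
from itertools import product

def walk_values(x, n, steps):
    """X_0..X_n along one root-to-leaf walk: X_t is the 1-fraction of the
    leaves below the depth-t vertex reached by following `steps`."""
    lo, size, vals = 0, 2 ** n, []
    for t in range(n + 1):
        vals.append(sum(x[lo:lo + size]) / size)
        if t < n:
            half = size // 2
            if steps[t]:                      # step right
                lo += half
            size = half
    return vals

n = 3
x = [1, 0, 1, 1, 0, 0, 1, 0]
walks = list(product([0, 1], repeat=n))       # uniform walk = uniform leaf
lhs = sum((walk_values(x, n, w)[-1] - walk_values(x, n, w)[0]) ** 2
          for w in walks) / len(walks)        # E[(X_n - X_0)^2]
rhs = sum(sum((v[t + 1] - v[t]) ** 2 for t in range(n))
          for v in (walk_values(x, n, w) for w in walks)) / len(walks)
print(lhs, rhs)   # equal, and both at most 1
```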

Analysis idea

Σ_{0≤t<n} E[(Xt+1 − Xt)²] ≤ 1

(Xt+1 − Xt)² small ⟹ t is a good choice for t∗!

(i.e., left and right subtrees have similar 1-densities).

So w.h.p. over walk W, most choices for t∗ are good!

Q.E.D.

Questions

Characterize the sets A ⊂ {0,1}^ω for which confident 0-prediction is possible?

Connection with fractal dimension, à la Lutz et al.?

Questions

For which distributions D on {0,1}^∞ can we extend to a “supporting set” A, preserving easiness of prediction?

Questions

Is there a minimax theorem for 0-prediction?

Hard set A for 0-prediction ⇒ hard distribution D over A?

Would give alternate (non-constructive) proof of our main result...

More examples of surprisingly confident prediction?

Thanks!

Andrew Drucker, IAS High-Confidence Predictions 47/47