An Information Complexity Approach to Extended Formulations

Ankur Moitra, Institute for Advanced Study

joint work with Mark Braverman

The Permutahedron

Let t = [1, 2, 3, …, n], P = conv{π(t) | π is a permutation}

How many facets does P have? Exponentially many!

e.g. for every S ⊆ [n]: Σ_{i in S} x_i ≥ 1 + 2 + … + |S| = |S|(|S|+1)/2

Let Q = {A | A is doubly stochastic}

Then P is the projection of Q: P = {At | A in Q}

Yet Q has only O(n^2) facets
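To make the projection concrete, here is a small numerical sketch (my illustration, not from the talk): build a random doubly stochastic A as a convex combination of permutation matrices and check that At satisfies the facet inequalities above.

```python
import itertools
import numpy as np

n = 4
t = np.arange(1, n + 1)
rng = np.random.default_rng(0)

# A random point of Q: a convex combination of permutation matrices
# is doubly stochastic (Birkhoff-von Neumann).
perms = list(itertools.permutations(range(n)))
weights = rng.dirichlet(np.ones(len(perms)))
A = sum(w * np.eye(n)[list(p)] for w, p in zip(weights, perms))

x = A @ t  # its projection

# Facet inequalities of the permutahedron from the slide.
assert np.isclose(x.sum(), n * (n + 1) / 2)
for k in range(1, n):
    for S in itertools.combinations(range(n), k):
        assert x[list(S)].sum() >= k * (k + 1) / 2 - 1e-9
print("A t =", x, "lies in the permutahedron")
```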

Extended Formulations

The extension complexity xc(P) of a polytope P is the minimum number of facets of a polytope Q such that P = proj(Q)

e.g. xc(P) = Θ(n log n) for the permutahedron

xc(P) = Θ(log n) for a regular n-gon, but Ω(√n) for a perturbation of it

In general, P = {x | ∃ y, (x,y) ∈ Q}

…an analogy with quantifiers in Boolean formulae

Applications of EFs

In general, P = {x | ∃ y, (x,y) ∈ Q}

Through EFs, we can reduce the number of facets exponentially!

Hence, we can run standard LP solvers instead of the ellipsoid algorithm

EFs often give, or are based on, new combinatorial insights

e.g. the Birkhoff-von Neumann theorem and the permutahedron

e.g. proving there is a low-cost object, through its polytope

Explicit, Hard Polytopes?

Definition: The TSP polytope is P = conv{1_F | F is the set of edges of a tour of K_n}

(If we could optimize over this polytope, then P = NP)

Can we prove unconditionally that there is no small EF?

Caveat: this is unrelated to proving complexity lower bounds

Theorem [Yannakakis ’90]: Any symmetric LP for TSP or matching has size 2^Ω(n)

Theorem [Fiorini et al ’12]: Any LP for TSP has size 2^Ω(√n) (based on a 2^Ω(n) lower bound for clique)

Theorem [Braun et al ’12]: Any LP that approximates clique within n^(1/2−ε) has size exp(n^ε)

Håstad proved an n^(1−o(1)) hardness of approximation for clique; can we prove the analogue for EFs?

Theorem [Braverman, Moitra ’13]: Any LP that approximates clique within n^(1−ε) has size exp(n^ε)

Outline

Part I: Tools for Extended Formulations
Yannakakis’s Factorization Theorem
The Rectangle Bound
A Sampling Argument

Part II: Applications
Correlation Polytope
Approximating the Correlation Polytope
A Better Lower Bound for Disjointness

The Factorization Theorem

How can we prove lower bounds on EFs?

[Yannakakis ’90]: a geometric parameter equals an algebraic parameter

Definition of the slack matrix…

The Slack Matrix

[Figure: a polytope P, with a vertex v_j and a facet ⟨a_i, x⟩ ≤ b_i, shown next to its slack matrix S(P), whose rows are indexed by faces and whose columns are indexed by vertices]

The entry in row i, column j is how slack the j-th vertex is on the i-th constraint:

S(P)_{i,j} = b_i − ⟨a_i, v_j⟩
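For instance, a minimal sketch (my example) computing the slack matrix of the unit square:

```python
import numpy as np

# Unit square: vertices v_j (rows of V) and facets <a_i, x> <= b_i.
V = np.array([[0, 0], [1, 0], [1, 1], [0, 1]])
A = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])
b = np.array([1, 0, 1, 0])

# S[i, j] = b_i - <a_i, v_j>: nonnegative, zero iff vertex j lies on facet i.
S = b[:, None] - A @ V.T
print(S)
```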


Nonnegative Rank

S = M_1 + … + M_r, where each M_i is rank one and nonnegative

Definition: rank_+(S) is the smallest r s.t. S can be written as the sum of r rank-one, nonnegative matrices

Note: rank_+(S) ≥ rank(S), but it can be much larger too!
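An explicit factorization certifies the upper bound, while ordinary rank gives the lower bound; a minimal numerical sketch (mine):

```python
import numpy as np

rng = np.random.default_rng(0)
r, m, n = 3, 5, 6

# r rank-one, nonnegative pieces M_i = u_i v_i^T.
U, V = rng.random((m, r)), rng.random((n, r))
S = sum(np.outer(U[:, i], V[:, i]) for i in range(r))

# The factorization certifies rank_+(S) <= r; rank(S) lower-bounds it.
print("rank(S) =", np.linalg.matrix_rank(S), "<= rank_+(S) <=", r)
```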

The Factorization Theorem

[Yannakakis ’90]: the geometric parameter (extension complexity) equals the algebraic parameter (nonnegative rank of the slack matrix):

xc(P) = rank_+(S(P))

Intuition: the factorization gives a change of variables that preserves the slack matrix!

We will give a new way to lower bound nonnegative rank via information theory…

Outline

Part I: Tools for Extended Formulations
Yannakakis’s Factorization Theorem
The Rectangle Bound
A Sampling Argument

Part II: Applications
Correlation Polytope
Approximating the Correlation Polytope
A Better Lower Bound for Disjointness

The Rectangle Bound

S = M_1 + … + M_r (each M_i rank one, nonnegative)

The support of each M_i is a combinatorial rectangle

rank_+(S) is at least the number of rectangles needed to cover the support of S

…which is exactly the non-deterministic communication complexity of the support
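Why a rectangle: if M = uv^T with u, v ≥ 0, then M(a,b) > 0 exactly when u_a > 0 and v_b > 0. A tiny sketch (my illustration):

```python
import numpy as np

u = np.array([0.0, 2.0, 1.0, 0.0])
v = np.array([3.0, 0.0, 0.5])
M = np.outer(u, v)  # rank one, nonnegative

rows = {a for a in range(len(u)) if u[a] > 0}
cols = {b for b in range(len(v)) if v[b] > 0}
support = {(a, b) for a, b in np.argwhere(M > 0)}
assert support == {(a, b) for a in rows for b in cols}  # a combinatorial rectangle
```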

Outline

Part I: Tools for Extended Formulations
Yannakakis’s Factorization Theorem
The Rectangle Bound
A Sampling Argument

Part II: Applications
Correlation Polytope
Approximating the Correlation Polytope
A Better Lower Bound for Disjointness

A Sampling Argument

S = M_1 + … + M_r

T = a set of entries on which S takes the same value

Choose M_i proportional to its total value on T

Choose (a,b) in T proportional to its relative value in M_i

If r is too small, this procedure uses too little entropy!
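A sanity check (my sketch): the two-stage procedure samples (a,b) uniformly from T, since the marginal is Σ_i (R_i/R)·(M_i(a,b)/R_i) = S(a,b)/R and S is constant on T. (The M_i below are only nonnegative; rank-oneness matters for the entropy bound, not for uniformity.)

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 4, 4, 5

# S is 1 on every entry (so T is all entries); split it into r nonnegative pieces.
shares = rng.dirichlet(np.ones(r), size=(m, n))
Ms = [shares[:, :, i] for i in range(r)]

R_i = np.array([M.sum() for M in Ms])
marginal = sum((Ri / R_i.sum()) * (M / Ri) for M, Ri in zip(Ms, R_i))
assert np.allclose(marginal, 1.0 / (m * n))  # uniform on T
```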

Outline

Part I: Tools for Extended Formulations
Yannakakis’s Factorization Theorem
The Rectangle Bound
A Sampling Argument

Part II: Applications
Correlation Polytope
Approximating the Correlation Polytope
A Better Lower Bound for Disjointness

The Construction of [Fiorini et al]

correlation polytope: P_corr = conv{aa^T | a ∈ {0,1}^n}

S: columns (vertices) indexed by a ∈ {0,1}^n, rows (constraints) indexed by b ∈ {0,1}^n, with entries S_{a,b} = (1 − a^T b)^2

This encodes UNIQUE DISJOINTNESS: output ‘YES’ if a and b, as sets, are disjoint, and ‘NO’ if a and b have one index in common
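A small sketch (mine) of this matrix for n = 3: entries are 1 on disjoint pairs and 0 on uniquely intersecting pairs, and the disjoint pairs number exactly 3^n (as used on the next slide).

```python
import itertools
import numpy as np

n = 3
cube = list(itertools.product([0, 1], repeat=n))
S = np.array([[(1 - np.dot(a, b)) ** 2 for a in cube] for b in cube])

T = [(a, b) for a in cube for b in cube if np.dot(a, b) == 0]
assert len(T) == 3 ** n                      # |T| = 3^n
assert all(S[cube.index(b), cube.index(a)] == 1 for a, b in T)
```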

The Reconstruction Principle

Let T = {(a,b) | a^T b = 0}; |T| = 3^n

Recall: S_{a,b} = (1 − a^T b)^2, so S_{a,b} = 1 for all pairs in T

How does the sampling procedure specialize to this case? (Recall it generates (a,b) uniformly from T)

Sampling Procedure:
Let R_i be the sum of M_i(a,b) over (a,b) in T, and let R be the sum of the R_i
Choose i with probability R_i/R
Choose (a,b) with probability M_i(a,b)/R_i

Entropy Accounting 101

Sampling Procedure:
Let R_i be the sum of M_i(a,b) over (a,b) in T, and let R be the sum of the R_i
Choose i with probability R_i/R
Choose (a,b) with probability M_i(a,b)/R_i

Total Entropy: the procedure generates a uniform (a,b) from T, so its entropy must account for H(a,b) = n log₂ 3:

log₂ r [choose i] + H((a,b) | i) [choose (a,b) conditioned on i] ≥ n log₂ 3

Can the conditional term be bounded by (1 − δ) n log₂ 3 (?)

Suppose that a_{-j} and b_{-j} are fixed (with a_{-j}^T b_{-j} = 0)

M_i restricted to (a_{-j}, b_{-j}) is a 2×2 block:

                                  (…b_j=0…)    (…b_j=1…)
(a_{1..j-1}, a_j=0, a_{j+1..n})   M_i(a,b)     M_i(a,b)
(a_{1..j-1}, a_j=1, a_{j+1..n})   M_i(a,b)     M_i(a,b)

If a_j=1 and b_j=1 then a^T b = 1, hence S(a,b) = 0 and so M_i(a,b) = 0

But rank(M_i) = 1, hence there must be another zero in either the same row or column

So, conditioned on i and (a_{-j}, b_{-j}), the pair (a_j, b_j) is supported on at most two of the three values (0,0), (0,1), (1,0):

H(a_j, b_j | i, a_{-j}, b_{-j}) ≤ 1 < log₂ 3

Entropy Accounting 101

Generate a uniformly random (a,b) in T:
Let R_i be the sum of M_i(a,b) over (a,b) in T, and let R be the sum of the R_i
Choose i with probability R_i/R
Choose (a,b) with probability M_i(a,b)/R_i

Total Entropy:

log₂ r [choose i] + n [choose (a,b) conditioned on i] ≥ n log₂ 3
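Putting the accounting together (a worked version of the slide’s inequality):

```latex
\underbrace{\log_2 r}_{\text{choose } i} + \underbrace{n}_{\text{choose } (a,b)\,\mid\, i}
  \;\ge\; H(a,b) \;=\; n \log_2 3
\;\Longrightarrow\;
\log_2 r \;\ge\; n(\log_2 3 - 1) \approx 0.58\, n,
\quad\text{so}\quad \operatorname{rank}_+(S) = r \ge 2^{\Omega(n)}.
```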

Outline

Part I: Tools for Extended Formulations
Yannakakis’s Factorization Theorem
The Rectangle Bound
A Sampling Argument

Part II: Applications
Correlation Polytope
Approximating the Correlation Polytope
A Better Lower Bound for Disjointness

Approximate EFs [Braun et al]

S: columns (vertices) indexed by a ∈ {0,1}^n, rows (constraints) indexed by b ∈ {0,1}^n, with entries (1 − a^T b)^2 + C

Is there a K (with small xc) s.t. P_corr ⊆ K ⊆ (C+1) P_corr?

New Goal: output the answer to UDISJ with probability at least ½ + 1/(2(C+1))

Is the correlation polytope hard to approximate for large values of C?

Analogy: Is UDISJ hard to compute with probability ½ + 1/(2(C+1)) for large values of C?

There is a natural barrier at C = √n for proving lower bounds:

Claim: If UDISJ can be computed with probability ½ + 1/(2(C+1)) using o(n/C²) bits, then UDISJ can be computed with probability ¾ using o(n) bits

Proof: Run the protocol O(C²) times and take the majority vote
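A quick simulation sketch (my own) of the majority-vote amplification: Θ(C²) independent runs with advantage 1/C yield a constant advantage.

```python
import numpy as np

rng = np.random.default_rng(0)
C = 10
p_single = 0.5 + 1 / (2 * (C + 1))  # success probability of one run
k = 8 * C * C + 1                   # odd number of repetitions, Theta(C^2)

wins = rng.random((10000, k)) < p_single
maj = wins.sum(axis=1) > k / 2
print("majority success rate ~", maj.mean())  # comfortably above 3/4
```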

Information Complexity

In fact, a more technical barrier is:

[Bar-Yossef et al ’04]: # bits exchanged ≥ information revealed ≥ n × information revealed for a one-bit problem (a Direct Sum Theorem)

Problem: AND has a protocol with advantage 1/C that reveals only 1/C² bits…

Alice: send her bit with probability ½ + 1/C, else send the complement
Bob: send his bit with probability ½ + 1/C, else send the complement

Is there a stricter one-bit problem that we could reduce to instead?
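To see the 1/C² figure: a bit sent truthfully with probability ½ + 1/C reveals 1 − h(½ + 1/C) bits about a uniform input bit, where h is binary entropy, and this is Θ(1/C²). A small check (mine):

```python
import math

def h(p):  # binary entropy in bits
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for C in [5, 10, 20, 40]:
    revealed = 1 - h(0.5 + 1 / C)  # I(bit; noisy copy) for a uniform input bit
    print(C, revealed, revealed * C * C)  # last column ~ constant, i.e. Theta(1/C^2)
```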

Outline

Part I: Tools for Extended Formulations
Yannakakis’s Factorization Theorem
The Rectangle Bound
A Sampling Argument

Part II: Applications
Correlation Polytope
Approximating the Correlation Polytope
A Better Lower Bound for Disjointness

Definition: The output matrix records the probability of outputting “one” for each pair of inputs

For the previous protocol:

           B = 0      B = 1
A = 0    ½ + 5/C    ½ + 1/C
A = 1    ½ + 1/C    ½ − 3/C

The constraint that a protocol achieves advantage at least 1/C is a set of linear constraints on this matrix

(Using Hellinger): bits revealed ≥ (max − min)² = Ω(1/C²)

(New): bits revealed ≥ |diagonal − anti-diagonal| = 0

What if we also require the output distribution to be the same for inputs (0,0), (0,1), (1,0)?

For the new protocol:

           B = 0      B = 1
A = 0    ½ + 1/C    ½ + 1/C
A = 1    ½ + 1/C    ½ − 3/C

(Using Hellinger): bits revealed ≥ (max − min)² = Ω(1/C²)

(New): bits revealed ≥ |diagonal − anti-diagonal| = Ω(1/C)
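Worked arithmetic behind the two evaluations of the new bound:

```latex
\text{old: } \underbrace{(\tfrac12{+}\tfrac5C) + (\tfrac12{-}\tfrac3C)}_{\text{diagonal}} = 1 + \tfrac2C
  = \underbrace{(\tfrac12{+}\tfrac1C) + (\tfrac12{+}\tfrac1C)}_{\text{anti-diagonal}}
  \;\Rightarrow\; \text{gap } 0,
\qquad
\text{new: } 1 - \tfrac2C \text{ vs } 1 + \tfrac2C
  \;\Rightarrow\; \text{gap } \tfrac4C = \Omega(1/C).
```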

How can we reduce to such a one-bit problem?

[Figure: the inputs a and b, each split into two blocks of n/2 bits; the second block is filled with (0,0), (0,1), and (1,0) pairs]

Symmetrized Protocol T’:
Generate a random partition (x, y, z) of n/2 and fill in x (0,0)-pairs, y (0,1)-pairs, and z (1,0)-pairs
Permute the n bits uniformly at random
Run T on the n-bit string and return the output

Claim: The protocol T’ has bias 1/C for DISJ.

Claim: Yet flipping a pair (a_i, b_i) between (0,0), (0,1), or (1,0) results in two distributions p, q (on inputs to T) with |p − q|₁ ≤ 1/n
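A toy sketch of my reading of the symmetrization (the protocol stub T below is a hypothetical stand-in, and the split (x, y, z) is drawn only approximately uniformly):

```python
import numpy as np

rng = np.random.default_rng(0)

def symmetrized(a, b, T):
    """Pad the n/2-bit inputs a, b with n/2 random (0,0)/(0,1)/(1,0) pairs,
    permute all n coordinates uniformly, and run the n-bit protocol T."""
    m = len(a)
    x = rng.integers(0, m + 1)          # random split (x, y, z) of the m padding slots
    y = rng.integers(0, m - x + 1)
    pad = [(0, 0)] * x + [(0, 1)] * y + [(1, 0)] * (m - x - y)
    pairs = list(zip(a, b)) + pad
    rng.shuffle(pairs)                  # note: padding never changes disjointness
    aa, bb = (np.array(col) for col in zip(*pairs))
    return T(aa, bb)

# Hypothetical stand-in for an n-bit protocol's output bit.
T = lambda aa, bb: int(np.dot(aa, bb) == 0)
print(symmetrized([1, 0, 1, 0], [0, 1, 0, 0], T))  # disjoint inputs -> 1
```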

Theorem: Any protocol for UDISJ with advantage 1/C must reveal Ω(n/C) bits (because it can be symmetrized)

Similarly, sampling a pair (a_j, b_j) needs entropy log₂ 3 − δ, where δ is the difference of the diagonals

Theorem: For any K with P_corr ⊆ K ⊆ (C+1) P_corr, the extension complexity of K is at least exp(Ω(n/C))

Can our framework be used to prove further lower bounds on extension complexity? For average-case instances? For SDPs?

Thanks!

Any Questions?
