Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By...

42
Computational aspects in Statistics: Sparse PCA & Ising blockmodel Quentin Berthet Cargese Summer School - 2018

Transcript of Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By...

Page 1: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Computational aspects in Statistics:

Sparse PCA & Ising blockmodel

Quentin Berthet

Cargese Summer School - 2018

Page 2: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Objectives & Jargon

• Inference problems: detection/estimation, with theoretical guarantees.

• Statement about a parameter of size p, based on an i.i.d. sample of size n.

• Finite sample results, with fixed probability δ, up to universal constants:

e.g. for X1, . . . , Xn i.i.d. from N (µ, Ip),

‖Xn−µ‖22 ≤ C p+log(1/δ)n w.p. 1− δ.

• High-dimensional setting: large p.

• No prior, structural assumptions.

• Important notions:

Optimal statistical rates.

Algorithmic efficiency.

Asterix en Corse, Goscinny, Uderzo (73)

Q.Berthet - Cargese Summer School - 2018 2/41

Page 3: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Sparse PCA

Page 4: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

P. Rigollet (MIT) R. Samworth (Cambridge) T. Wang (Cambridge)

• Optimal Detection of Sparse Principal Components in High Dimension

Q.B., P. Rigollet, Ann. Statis. (2013)

• Computational Lower Bounds for Sparse PCA

Q.B., P. Rigollet, COLT (2013)

• Statistical and Computational Trade-offs in Estimation of Sparse Principal Components

T. Wang, Q.B., R. Samworth, Ann. Statis. (2016)

Page 5: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Sparse principal component detection

X1, . . . , Xn ∈ Rp independent, centered Gaussian with unknown covariance.

Testing problem between two hypotheses, Johnstone (01), & Lu (09).

H0 : IpH1 : Ip + θvv>, v ∈ B0(k)

v is a k-sparse unit vector. B0(k) = v ∈ Rp : |v|2 = 1 , |v|0 ≤ k.v

Isotropy: N (0, Ip) Sparse PC: N (0, Ip + θ vv>)

Q.Berthet - Cargese Summer School - 2018 5/41

Page 6: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Importance of sparsity assumption

General problem of principal component detection, Onatski et al. (13)

H0 : IpH1 : Ip + θvv>, v ∈ Sp−1

v is a unit vector. Sp−1 = v ∈ Rp : |v|2 = 1.

v

Isotropy: N (0, Ip) Principal Component: N (0, Ip + θ vv>)

BBP transition instuition, Baik, Ben Arous, Peche (05): Strong signal: θ >√

pn.

Q.Berthet - Cargese Summer School - 2018 6/41

Page 7: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Sparse principal component detection

Testing problem between two hypotheses.

H0 : IpH1 : Ip + θvv>, v ∈ B0(k)

v is a k-sparse unit vector. B0(k) = v ∈ Rp : |v|2 = 1 , |v|0 ≤ k.v

Isotropy: N (0, Ip) Sparse PC: N (0, Ip + θ vv>)

The signal strength is θ, quantifies the distance between the distributions.

Q.Berthet - Cargese Summer School - 2018 7/41

Page 8: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Sparse principal component detection

Testing problem between two hypotheses.

H0 : IpH1 : Ip + θvv>, v ∈ B0(k)

v is a k-sparse unit vector. B0(k) = v ∈ Rp : |v|2 = 1 , |v|0 ≤ k.v

Isotropy: N (0, Ip) Sparse PC: N (0, Ip + θ vv>), small θ

The signal strength is θ, quantifies the distance between the distributions.

Q.Berthet - Cargese Summer School - 2018 8/41

Page 9: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Sparse principal component detection

Testing problem between two hypotheses.

H0 : IpH1 : Ip + θvv>, v ∈ B0(k)

v is a k-sparse unit vector. B0(k) = v ∈ Rp : |v|2 = 1 , |v|0 ≤ k.v

Isotropy: N (0, Ip) Sparse PC: N (0, Ip + θ vv>), large θ

The signal strength is θ, quantifies the distance between the distributions.

Q.Berthet - Cargese Summer School - 2018 9/41

Page 10: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Results - Detection

Statistics

• Maximizing x>Σx over unit sparse vectors, λkmax(Σ). Test powerful in regime

θ &

√k log(p/k)

n.

• Lower bounds of the same order, optimal result.

Computations

• SDP relaxation of λkmax(Σ) by d’Aspremont et al (07), requires

θ &

√k2 log(p)

n.

• Can be better than λmax(Σ), but still a suboptimal result.

Q.Berthet - Cargese Summer School - 2018 10/41

Page 11: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Overall picture - Testing

Computationally efficient tests seem to require

θ ≈√k2 log(p)

n

Rate observed for several explicit testing methods.

Tests: Diagonal method - Johnstone (01), SDP - d’Aspremont et al. (07),MDP - Berthet and Rigollet (13), other heuristics.

qk log(d/k)

n

qk2 log(d)

n

no detection combinatorial method polynomial-time method

Q.Berthet - Cargese Summer School - 2018 11/41

Page 12: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

The actual situation?

Situation suggested by those results

qk log(d/k)

n

qk2 log(d)

n

no detectioncombinatorial method

- - -no polynomial-time detection

polynomial-time method

So far, only upper bounds, suggestions.

Need for Complexity Theoretic Lower Bounds

Q.Berthet - Cargese Summer School - 2018 12/41

Page 13: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Reduction

We employ the following strategy:

no detection

polynomial-time method

qk↵

n

qk log(d/k)

n

combinatorial method

If detection at θ ≤ θα =

√kα

n, α ∈ [1, 2) with ψ ∈ P . . .

. . . then we can solve a computationally hard problem in polynomial time.

Q.Berthet - Cargese Summer School - 2018 13/41

Page 14: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Reduction in statistics: Le Cam deficiency

• Notion of similarity between statistical problems on P = Pγ : γ ∈ Γ andQ = Qγ : γ ∈ Γ.

δ(P,Q) = infT :Markov

supγ∈Γ

dTV(Pγ, TQγ)

“if you can solve one problem, you can solve the other up to probability δ(P,Q).”

• Given an estimator γP, one can construct an estimator γQ such that

Qγ(γQ 6= γ) ≤ Pγ(γP 6= γ) + δ(P,Q) .

• Done by transforming the instances, using γP as a blackbox.

• Similar to the notion of worst-case hardness in computer science.

Q.Berthet - Cargese Summer School - 2018 14/41

Page 15: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Reduction in statistics: Le Cam deficiency

• Notion of computational similarity between statistical problems onP = Pγ : γ ∈ Γ and Q = Qγ : γ ∈ Γ.

δcomp(P,Q) = infT :Markov comp.

supγ∈Γ

dTV(Pγ, TQγ)

“if you can computationally solve one problem, you can solve the other up toprobability δcomp(P,Q).”

• Given a computationally tractable estimator γP, one can construct acomputationally tractable estimator γQ such that

Qγ(γQ 6= γ) ≤ Pγ(γP 6= γ) + δcomp(P,Q) .

• Done by transforming the instances, using γP as a blackbox.

• Similar to the notion of worst-case hardness in computer science.

Q.Berthet - Cargese Summer School - 2018 15/41

Page 16: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Reasoning by contradiction

• What is the consequence of a test that works for sparse PCA at regime

θα =√

n , α ∈ [1, 2)? (hypothetical blackbox)

• Theorem: One can transform in polynomial time an instance of size m of

planted clique of size k = m1

4−α in an instance of sparse PCA at regime θα.

• Corrollary: Computational limit to solve Sparse PCA in regime θα, α ∈ [1, 2)if computational limit to solve planted clique in regime k = mc, c < 1/2(conjectured).

Q.Berthet - Cargese Summer School - 2018 16/41

Page 17: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Computational hardness in statistics

• Optimal signal level: information-theoretic barriers, combinatorial method.

• Linking sparse PCA and planted clique B., Rigollet (13), Wang et al. (16).

no detection

combinatorial method

---no polynomial-time detection

polynomial-time methods

qk log(d)

n

qk↵

n

Take-home message: gap of√k for computationally efficient methods.

• Line of work on establishing computational limits by reduction.

• Approach related to “algorithm-independence” point of view.

Q.Berthet - Cargese Summer School - 2018 17/41

Page 18: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Ising blockmodel

Page 19: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

P. Rigollet (MIT) P. Srivastava (Caltech → Tata Inst.)

• Exact recovery in the Ising blockmodel

Q. B., P.Rigollet, and P. Srivastava

Ann. Statis. (to appear)

Page 20: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Motivation

• Finding communities in populations, based on similar behavior and influence.

• One of the justifications for stochastic blockmodels

• What if we observe the behavior, not the graph?

Q.Berthet - Cargese Summer School - 2018 20/41

Page 21: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Motivation

Model with p individuals, observation σ ∈ −1, 1p, balanced communities (S, S).

SS SS

For a given question: members of the community influence an individual’s answer.

Example: A metal sheet has temperature at point x, y given byT (x, y) = T0 + k(x2 + y2). What is T (r, θ)?

+1 ) T (r, θ) = T0 + kr2

−1 ) T (r, θ) = T0 + k(r2 + θ2)

Q.Berthet - Cargese Summer School - 2018 21/41

Page 22: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Motivation

SS SS

Model with p individuals, σ ∈ −1, 1p and balanced communities (S, S).

PS(σ) = ︸ ︷︷ ︸?

.

Q.Berthet - Cargese Summer School - 2018 22/41

Page 23: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Motivation

SS SS

/p/p

/p/p

↵/p↵/p

Model with p individuals, σ ∈ −1, 1p and balanced communities (S, S).

PS(σ) =1

Zα,βexp

[ β2p

i∼jσiσj +

α

2p

ij

σiσj

].

Q.Berthet - Cargese Summer School - 2018 23/41

Page 24: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Motivation

SS SS

/p/p

/p/p

↵/p↵/p

Model with p individuals, σ ∈ −1, 1p and balanced communities (S, S).

PS(σ) =1

Zα,βexp

[ β2p

i∼jσiσj +

α

2p

ij

σiσj

].

Q.Berthet - Cargese Summer School - 2018 24/41

Page 25: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Problem description

Ising blockmodel:

PS(σ) =1

Zα,βexp

[ β2p

i∼jσiσj +

α

2p

ij

σiσj

]=

1

Zα,βexp

(−HS,α,β(σ)

).

Energy decreases (probability increases) with more agreement inside each block.

• Blockmodel: PS(σi = σj) =

b for all i ∼ ja for all i j

• Balance: |S| = |S| = p/2 ,

• Homophily: β > 0⇔ b > 1/2 ,

• Assortativity: β > α⇔ b > a .

• Relationship with Hopfield model, and posterior for the SBM.

Observations: σ(1), . . . , σ(n) ∈ −1, 1p i.i.d. from PS.

Objective: recover the balanced partition (S, S) from observations.

Q.Berthet - Cargese Summer School - 2018 25/41

Page 26: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Stochastic blockmodels

• one observation of random graph onp vertices

P(i↔ j) =

b for all i ∼ ja for all i j

• Exact recovery using SDP iff

a = alog p

p, b = b

log p

p

and (a + b)/2 > 1 +√

ab

Abbe, Bandeira, Hall ’14Mossel, Neeman, Sly ’14Xu, Lelarge, Massouliee ’14Hajek, Wu ’16

Wigner matrices

Graphical models / MRF

• n observations σ(1), . . . , σ(n) i.i.d.

P(σ) ∝ exp[ β2p

i,j

Jijσiσj

]

• Goal estimating sparse J = Jij(max degree d)

• Sample complexity n 2d log p

Chow-Liu ’68Bresler, Mossel, Sly ’08Santhanam, Wainwright ’12Bresler ’15Vuffray, Misra, Lokhov, Chertkov ’16

Wishart matrices

Q.Berthet - Cargese Summer School - 2018 26/41

Page 27: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Problem overview

• By symmetry E[σi] = 0, what of the second moment?

• Structure of the problem visible in the covariance matrix Σ

Σ = E[σσ>] =

(∆ ΩΩ ∆

)+ (1−∆)Ip .

• Difficulty of the problem related with the values of quantities ∆,Ω ∈ (−1, 1)

∆ = 2b− 1 , Ω = 2a− 1 .

• Parallel with the stochastic block model on graphs with independent edges

• Summary - Analysis

Deviations: Behavior of Σ around Σ, sample size guarantees.

Population: Results function of ∆− Ω, scaling in p and α, β.

Q.Berthet - Cargese Summer School - 2018 27/41

Page 28: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Maximum likelihood estimation

• Log-likelihood Ln(S) = −n logZα,β +n

2Tr[ΣQS]

• Maximum likelihood estimator:

V ∈ argmaxV ∈P

Tr[ΣV ] , where P = vv> : v ∈ −1, 1p, v>1[p] = 0 .

• Define Γ = PΣP and Γ = P ΣP , for a projector P on the orthogonal of 1:

Γ = (1−∆)P + p∆− Ω

2uSu

>S , uS =

1√p

(1S − 1S)

• For all V ∈ P, Tr[ΓV ] = Tr[ΣV ], so equivalently

V ∈ argmaxV ∈P

Tr[ΓV ]

Q.Berthet - Cargese Summer School - 2018 28/41

Page 29: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

SDP relaxation

V ∈ argmaxV ∈P

Tr[ΓV ] , where P = vv> : v ∈ −1, 1p, v>1[p] = 0 .

NP-Hard (Min bisection)

• Semidefinite convex relaxation of P

E = V : diag(V ) = 1, V 0 .

Change of variable V = vv>

MAXCUT Goemans-Williamson (95)

• Point V solution of maxV ∈CTr[ΓV ]equivalent to Γ ∈ NC(V )

• Relaxation is tight for populationmatrix Γ: V = VS if n =∞.

VSVS

PP

NP(VS)NP(VS)

Q.Berthet - Cargese Summer School - 2018 29/41

Page 30: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

SDP relaxation

V ∈ argmaxV ∈P

Tr[ΓV ] , where P = vv> : v ∈ −1, 1p, v>1[p] = 0 .

NP-Hard (Min bisection)

• Semidefinite convex relaxation of P

E = V : diag(V ) = 1, V 0 .

Change of variable V = vv>

MAXCUT Goemans-Williamson (95)

• Point V solution of maxV ∈CTr[ΓV ]equivalent to Γ ∈ NC(V )

• Relaxation is tight for populationmatrix Γ: V = VS if n =∞.

VSVS

EE

PP

NE(VS)NE(VS)NP(VS)NP(VS)

Q.Berthet - Cargese Summer School - 2018 29/41

Page 31: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Exact recovery

• Upper bound: we have V = VS with probability 1− δ for

n &1

Cα,β

log(p/δ)

∆− Ω,

by bounding function of Γ− Γ, a sum of independent matrices. Tropp (12)

• Matching lower bound: Information theoretic argument yields

n ≤ γ

β − αlog(p/4)

∆− Ω=⇒ P(recovery) . γ

• Full understanding of the scaling of ∆− Ω needed.

Q.Berthet - Cargese Summer School - 2018 30/41

Page 32: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

The Curie–Weiss model (α = β) Σ =

(∆ ∆∆ ∆

)+ (1−∆)Ip

• Mean magnetization: µ = 1>σp ∈ [−1, 1]. Observe that

∆ =1

p2

p∑

i,j=1

E[σiσj]−1

p= E[µ2]− 1

p≈ E[µ2]

• Free energy: µ is a sufficient statistic

Pβ(µ) ≈ 1

Zβexp

(− p

4gCWβ (µ)

), gCW

β (µ) = −2βµ2 + 4h(1 + µ

2

)

• Ground states: Minimizers G ⊂ [−1, 1] of gCWβ (µ).

• Concentration: µ ≈ ground state with exponentially large probability so

∆ ≈ E[µ2] ≈ 1

|G|∑

s∈Gs2 .

Q.Berthet - Cargese Summer School - 2018 31/41

Page 33: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Free energy of the Curie-Weiss model

Ground states G = µ(β),−µ(β), µ(β) ≥ 0:

-1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0

β = 0 β = 1 β = 1.2 β = 1.8

0.0 0.5 1.0 1.5 2.0 2.50.0

0.2

0.4

0.6

0.8

1.0

∆ ≈ 1

|G|∑

s∈Gs2 = µ(β)2

β 7→ µ(β)

Q.Berthet - Cargese Summer School - 2018 32/41

Page 34: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Free energy of the Ising blockmodel

• Energy function of the block magnetizations: (µS, µS) =2

p

(1>Sσ,1

>Sσ)

PS(σ) =1

Zα,βexp

(− p

8

(− βµ2

S − βµ2S − 2αµS µS

))

• Marginal: number of configurations with magnetizations µ is( (p/2)

1+µ2 (p/2)

)

PS(µS, µS) ≈ 1

Zα,βexp

(− p

8gα,β(µS, µS)

)

where gα,β is the free energy defined by

gα,β(µS, µS) = −βµ2S − βµ2

S − 2αµS µS + 4h(1 + µS

2

)+ 4h

(1 + µS2

).

Q.Berthet - Cargese Summer School - 2018 33/41

Page 35: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

The Ising blockmodel model (α < β) Σ =

(∆ ΩΩ ∆

)+ (1−∆)Ip

• Block magnetizations: µS = 1S>σp/2 , µS =

1S>σp/2 ∈ [−1, 1]. Observe that

∆ ≈ 2

p2

i∼jE[σiσj] ≈

1

2E[µ2

S + µ2S] and Ω =

2

p2

ij

E[σiσj] = E[µS µS]

• Free energy: (µS, µS) ∈ [−1, 1]2 is a sufficient statistic

PS(µS, µS) ≈ 1

Zα,βexp

(− p

8gα,β(µS, µS)

)

• Ground states: Minimizers G ⊂ [−1, 1]2 of gCWα,β(µS, µS).

• Concentration: (µS, µS) ≈ ground states with exp. large probability so

∆− Ω ≈ 1

2E[(µS − µS)2] ≈ 1

|G|∑

s∈G(s1 − s2)

2.

Q.Berthet - Cargese Summer School - 2018 34/41

Page 36: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Ground states for the Ising blockmodel

gα,β(µS, µS) = −βµ2S − βµ2

S − 2αµS µS + 4h(1 + µS

2

)+ 4h

(1 + µS2

)

α = −6 α = −2.5 α = −0.5 α = 0

Ground states on the skew-diagonal (µS = −µS) for α ≤ 0 and fixed β = 1.5 < 2

-3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0

0.2

0.4

0.6

0.8

1.0

α 7→ µS(α, β = 1.5)

Q.Berthet - Cargese Summer School - 2018 35/41

Page 37: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Ground states for the Ising blockmodel

gα,β(µS, µS) = −βµ2S − βµ2

S − 2αµS µS + 4h(1 + µS

2

)+ 4h

(1 + µS2

)

α = −4 α = −0.9 α = −0.2 α = 0

Ground states on the skew-diagonal (µS = −µS) for α ≤ 0 and fixed β = 2.5 > 2

-3.0 -2.5 -2.0 -1.5 -1.0 -0.5 0.0

0.2

0.4

0.6

0.8

1.0

α 7→ µS(α, β) β = 2.5

Q.Berthet - Cargese Summer School - 2018 36/41

Page 38: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Phase diagram

Full understanding of the position of the ground states for β > 0, α < β6 BERTHET, RIGOLLET AND SRIVASTAVA

2 24 41 1

2

4

(III)(I)

(II)

Figure 1: The phase diagram of the Ising block model, with three regions for theparameters ↵ and > 0. In region (I), where ↵ < 0 and + |↵| > 2, there aretwo ground states of the form (x,x) and (x, x), x > 0. In region (II), where + |↵| < 2, there is one ground state at (0, 0). In region (III), where ↵ > 0 and + |↵| > 2, there are two ground states of the form (x, x) and (x,x), x > 0.At the boundary between regions (I) and (III), there are four ground states. Thedotted line has equation ↵ = , we only consider parameters in the region to itsleft, where > ↵.

Proof. Throughout this proof, for any b 2 IR, we denote by gcwb (x), x 2

[1, 1], the free energy of the Curie-Weiss model with inverse temperature b. Wewrite g := g↵, for simplicity to denote the free energy of the IBM.

Note that

(2.7) g(x, y) = gcw+↵

2

(x) + gcw+↵

2

(y) + ↵(x y)2 .

We split our analysis according to the sign of ↵. Note first that if ↵ = 0, we have

g(x, y) = gcw2

(x) + gcw2

(y) .

It yields that:

• If 2, then gcw2

has a unique local minimum at x = 0 which implies that g

has a unique minimum at (0, 0)• If > 2, then gcw

2

has exactly two minima at x(/2) and x(/2), where

x(/2) 2 (1, 1). It implies that g has four minima at (±x(/2), ±x(/2)).

Next, if ↵ > 0, in view of (2.7) we have

g(x, y) gcw+↵

2

(x) + gcw+↵

2

(y)

with equality i↵ x = y. It implies that:

• If ↵+ 2, then g has a unique minimum at (0, 0)

• Phase diagram for all the parameter regions

Region (I): Two ground states (µS, µS) = ±(x,−x).

Region (II): One ground state at (0, 0).

Region (III): Two ground states (µS, µS) = ±(x, x).

Q.Berthet - Cargese Summer School - 2018 37/41

Page 39: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Concentration

• Quantities of interest as expectations of the mean block magnetizations

∆ ≈ 1

2E[µ2

S + µ2S] , Ω ≈ E[µSµS] and ∆− Ω ≈ 1

2E[(µS − µS)2] .

• Gaussian approximation of the discrete distribution with Z ∼ N (0, I2).

Eα,β[ϕ(µ)] 'p1

|G|∑

s∈GE[ϕ(s+ 2

√2

pH−1/2Z)

]∀ϕ .

• Approximation of the gap ∆− Ω:

∆− Ω 'p

2x2 in region (I)

Cα,βp

in region (II)

C ′α,βp

in region (III)

Q.Berthet - Cargese Summer School - 2018 38/41

Page 40: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Naive estimation

• Covariance matrix:

Σ = E[σσ>] =

∆ Ω

Ω ∆

+ (1−∆)Ip .

• Empirical covariance matrix:

Σ =1

n

n∑

t=1

σ(t)σ(t)> = Σ±√

log p

nentrywise

• Threshold off-diagonal entries of Σ at (∆ + Ω)/2

• Exact recovery if

n &

log p in region (I)

p2 log p in region (II)

p2 log p in region (III)

Q.Berthet - Cargese Summer School - 2018 39/41

Page 41: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Exact recovery

• Upper bound: we have V = VS with probability 1− δ for

n &1

Cα,β

log(p/δ)

∆− Ω,

by bounding function of Γ− Γ, a sum of independent matrices. Tropp 12

• Matching lower bound: Fano’s inequality yields

n ≤ γ

β − αlog(p/4)

∆− Ω=⇒ P(recovery) . γ

• Full understanding of the scaling of ∆− Ω gives optimal rates.

n &

log p in region (I)

p log p in regions (II) and (III)

with constant factors illustrating further these transitions.

Q.Berthet - Cargese Summer School - 2018 40/41

Page 42: Quentin Berthet - GitHub PagesQ.Berthet - Cargese Summer School - 2018 26/41 Problem overview By symmetry E ˙i] = 0, what of the second moment? Structure of the problem visible in

Conclusion

• Contributions

Model for interactions between individuals in different communities.

Analysis from statistical physics to understand parameters of the problem.

Study of convex relaxations with an analysis on normal cones.

• Open questions

Exact recovery threshold, conjecture that n∗ = C∗ log(p)(β−α)(∆−Ω).

Rates for partial recovery in Hamming distance.

Generalization to multiple blocks, more complex structures.

THANK YOU

Work supported by the Isaac Newton Trust and the Alan Turing Institute.

Q.Berthet - Cargese Summer School - 2018 41/41