Conditional Central Limit Theorems for Gaussian Projections

Galen Reeves
Department of ECE and Department of Statistical Science
Duke University

ISIT, June 2017

Transcript of the presentation slides.

Table of Contents

• Motivation
  • Statistics
  • Information Theory
  • Random linear estimation
• Results
• Proof outline
• Conclusion

Conditional CLT for random projections

Random vector $X \in \mathbb{R}^n$; low-dimensional projection $Z = \Theta X$, $\Theta \in \mathbb{R}^{k \times n}$.

Let $\Theta$ be an IID Gaussian projection matrix, and let $G_Z$ be Gaussian with the same mean and covariance as $Z$:

$$P_Z \approx G_Z \qquad \text{(CLT)}$$

$$P_{Z \mid \Theta}(\,\cdot \mid \Theta) \approx G_Z \qquad \text{(Conditional CLT)}$$
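To make the conditional statement concrete, here is a minimal numerical sketch (my illustration, not from the talk): it fixes a single draw of $\Theta$, samples a strongly non-Gaussian $X$ with Rademacher coordinates, and checks that the low moments of $Z$ given that one $\Theta$ already look Gaussian. The source distribution and sample sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 400, 2, 10_000      # ambient dimension, projection dimension, samples

# One fixed IID Gaussian projection matrix with N(0, 1/n) entries.
Theta = rng.normal(0.0, 1.0 / np.sqrt(n), size=(k, n))

# A non-Gaussian source: X uniform on the hypercube vertices {-1, +1}^n,
# so that E[(1/n) ||X||^2] = 1.
X = rng.choice([-1.0, 1.0], size=(m, n))
Z = X @ Theta.T               # samples from P_{Z | Theta}, same Theta throughout

# G_Z has mean 0 and covariance close to I_k; the conditional law should match.
print("mean:", Z.mean(axis=0))                 # ~ (0, 0)
print("cov:\n", np.cov(Z.T))                   # ~ identity
print("3rd moment:", np.mean(Z[:, 0] ** 3))    # ~ 0, as for a Gaussian
print("4th moment:", np.mean(Z[:, 0] ** 4))    # ~ 3, the Gaussian kurtosis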

Projection pursuit in statistics

• Dataset is a collection of high-dimensional vectors $x_1, x_2, \ldots, x_M$.

• Compute low-dimensional projections in direction $\Theta$: $\Theta x_1, \Theta x_2, \ldots, \Theta x_M$.

• Huber 1985 suggests that a direction $\Theta$ is 'interesting' if the empirical distribution of the projected data is far from Gaussian.

• Diaconis & Freedman 1984 provide a negative result: almost all projections are close to Gaussian provided that

$$\frac{1}{M}\sum_{i=1}^{M}\Big|\tfrac{1}{n}\|x_i\|^2 - \gamma\Big| \approx 0 \qquad \text{(deviation of second moments)}$$

$$\frac{1}{M^2}\sum_{i \ne j} \tfrac{1}{n}\,|x_i \cdot x_j| \approx 0 \qquad \text{(measure of correlation)}$$

A numerical sketch of these two diagnostics follows below.
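The sketch below is my own illustration (with the pairwise sum taken over distinct pairs, as one common form of the condition): it evaluates both diagnostics for an IID Gaussian cloud, where both are small, and for data concentrated on a line, where the correlation diagnostic is large.

```python
import numpy as np

def df_diagnostics(x):
    """Empirical Diaconis-Freedman diagnostics for data x of shape (M, n):
    mean deviation of (1/n)||x_i||^2 from its average gamma, and the mean
    of (1/n)|x_i . x_j| over distinct pairs (i, j)."""
    M, n = x.shape
    sq = (x ** 2).sum(axis=1) / n
    gamma = sq.mean()
    dev = np.abs(sq - gamma).mean()
    G = np.abs(x @ x.T) / n
    np.fill_diagonal(G, 0.0)         # exclude i = j terms
    corr = G.sum() / (M * (M - 1))
    return dev, corr

rng = np.random.default_rng(1)
M, n = 400, 5000
good = rng.standard_normal((M, n))                  # near-orthogonal cloud
bad = np.outer(rng.standard_normal(M), np.ones(n))  # all mass along one line
print("iid Gaussian :", df_diagnostics(good))       # both ~ 0
print("rank-one     :", df_diagnostics(bad))        # large correlation term
```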

Illustration: Projections of cubes

[Figure: one-dimensional projections of cubes of dimensions 3–9; Buja et al. 1996]

Measure-theoretic Dvoretzky Theorem

Theorem (Dvoretzky–Milman)
Random $d$-dimensional sections of a symmetric convex body in $\mathbb{R}^n$ are approximately spherical if $d = O(\log n)$.

[Figure: random section of a high-dimensional convex set; Vershynin 2014]

Related work

• CLT for low-dimensional projections: Maxwell 1875, Borel 1914, Poincaré 1912, Diaconis & Freedman 1984, Johnson 2003.

• Conditional CLT: Sudakov 1978, Diaconis & Freedman 1984, Hall & Li 1993, von Weizsäcker 1997, Anttila et al. 2003, Bobkov 2003, Naor & Romik 2003, Dasgupta et al. 2006, Klartag 2007, Meckes 2010, Meckes 2012, Dümbgen & Conte-Zerial 2013, Leeb 2013.

• Meckes 2012 provides explicit convergence rates in terms of the bounded-Lipschitz metric. Under second-moment constraints on $X$, she shows that

$$k \le \frac{(2-\epsilon)\log n}{\log\log n} \implies d_{BL}(P_{Z\mid\Theta}, G_Z) \to 0.$$

• Dümbgen & Conte-Zerial 2013 prove necessary conditions matching the sufficient conditions of Diaconis & Freedman 1984.
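For a sense of scale in Meckes' condition (my arithmetic, with $\epsilon = 0.1$ as an arbitrary choice):

$$n = 10^{6}: \quad \log n \approx 13.8, \quad \log\log n \approx 2.6, \quad \frac{(2-\epsilon)\log n}{\log\log n} \approx 10,$$

so projections of dimension up to about $k = 10$ remain close to Gaussian in the bounded-Lipschitz sense.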

Random coding arguments in information theory

• Random vector $X$ is supported on the standard basis vectors $e_1, \ldots, e_n$.

• Random vector $Z = \Theta X$ is supported on the columns of the matrix: $\Theta_1, \ldots, \Theta_n$.

This is an IID Gaussian codebook of rate $R = \frac{1}{k}\log n$. A small sketch of the correspondence follows below.
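A sketch of this correspondence (illustrative sizes, not from the talk): selecting the $i$-th basis vector selects the $i$-th column of $\Theta$, so the image of $X$ is exactly an IID Gaussian codebook with $n$ codewords in $\mathbb{R}^k$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 1024, 8
R = np.log(n) / k                    # codebook rate R = (1/k) log n, in nats

# IID Gaussian codebook: the n columns of Theta are the codewords.
# (The entry scaling is immaterial for the correspondence itself.)
Theta = rng.standard_normal((k, n))

i = rng.integers(n)                  # "message" i selects basis vector e_i
e_i = np.zeros(n)
e_i[i] = 1.0
Z = Theta @ e_i                      # transmitted point Z = Theta e_i ...

assert np.allclose(Z, Theta[:, i])   # ... is exactly the i-th codeword
print(f"rate R = {R:.3f} nats per dimension")
```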

Source coding with random codebook

• Optimal distortion for a Gaussian source $\mathcal{N}(0, \sigma^2)$ is given by the distortion-rate function

$$D(R) = \sigma^2 \exp(-2R).$$

• The quadratic Wasserstein distance between $Z \mid \Theta$ and $W \sim \mathcal{N}(0, \sigma^2 I_k)$ corresponds to the distortion of the codebook:

$$W_2^2\big(P_{Z\mid\Theta}, P_W\big) \ge \mathbb{E}\Big[\min_{i\in[n]} \|\Theta_i - W\|^2 \,\Big|\, \Theta\Big] \qquad \text{(optimal coupling)}$$
$$\ge D(R) \qquad \text{(optimality of DRF)}$$
$$= k\,\sigma^2\, n^{-2/k}, \qquad R = \tfrac{1}{k}\log n$$

• The inequalities are tight up to $o(k)$ for large $(k, n)$.
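Spelling out the last line, as I read it: $D(R)$ in the chain denotes the total distortion summed over the $k$ coordinates, i.e. $k$ times the per-letter distortion-rate function, so

$$k\,\sigma^2 e^{-2R}\Big|_{R = \frac{1}{k}\log n} = k\,\sigma^2\, e^{-\frac{2}{k}\log n} = k\,\sigma^2\, n^{-2/k}.$$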

Channel coding with random codebook

• Gaussian channel:

$$Y_i = Z_i + \sqrt{t}\, N_i, \qquad 1 \le i \le k.$$

• The expected relative entropy between $Y \mid \Theta$ and $G_Y$ corresponds to the gap between the channel capacity $C = \frac{1}{2}\log(1 + \mathrm{snr})$ and the mutual information:

$$\mathbb{E}_\Theta\big[D(P_{Y\mid\Theta}\,\big\|\,G_Y)\big] = kC - I(X; Y \mid \Theta) = k\,(C - R)_+ + o(k).$$
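One way to see the first equality: since $G_Y$ matches the second moments of $Y$, the cross entropy $-\mathbb{E}[\log g(Y)]$ equals $h(G_Y)$, and the channel gives $h(Y \mid X, \Theta) = h(\sqrt{t}\,N)$, so

$$\mathbb{E}_\Theta\big[D(P_{Y\mid\Theta}\,\|\,G_Y)\big] = h(G_Y) - h(Y\mid\Theta) = \underbrace{\big[h(G_Y) - h(\sqrt{t}\,N)\big]}_{kC} - \underbrace{\big[h(Y\mid\Theta) - h(Y\mid X,\Theta)\big]}_{I(X;\,Y\mid\Theta)}.$$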

Random linear estimation (compressed sensing, CDMA)

• $X$ is an unknown signal with second moment $\gamma = \mathbb{E}\big[\tfrac{1}{n}\|X\|^2\big]$.

• $\Theta$ is a measurement matrix with sampling rate $\delta = k/n$.

• Noisy measurements:

$$Y = Z + \sqrt{t}\, N, \qquad Z = \Theta X.$$

• The mutual information satisfies

$$I(X; Y \mid \Theta) = \frac{k}{2}\log\Big(1 + \frac{\gamma}{t}\Big) - \mathbb{E}_\Theta\big[D(P_{Y\mid\Theta}\,\big\|\,G_Y)\big].$$

Hence compressed sensing is optimal in the sense of mutual information if and only if the CCLT holds.

Fundamental limits of random linear estimation

• Guo & Verdú 2005 derive formulas for the mutual information and MMSE using the heuristic replica method from statistical physics.

• Rigorous results for special cases: Verdú & Shamai 1999, Tse & Hanly 1999, Montanari & Tse 2006, Korada & Macris 2010, Bayati & Montanari 2011, R. & Gastpar 2012, Wu & Verdú 2012, Krzakala et al. 2012, Donoho et al. 2013, Huleihel & Merhav 2017.

• R. & Pfister 2016 prove that the replica-symmetric formulas are correct. The proof focuses on new measurements

$$Y_{m+1} = \Theta_m X + \sqrt{t}\, N_{m+1}, \qquad \Theta_m \in \mathbb{R}^{1\times n}, \quad X \sim P_{X \mid Y^m, \Theta^m}.$$

• Related results were obtained independently by Barbier et al. 2016.


Problem formulation

Assumptions:

• $X$ has finite second moment $\mathbb{E}\big[\tfrac{1}{n}\|X\|^2\big] = 1$.

• $\Theta$ has IID Gaussian entries $\mathcal{N}(0, \tfrac{1}{n})$.

Define

$$\alpha_r(X) = \Big(\mathbb{E}\Big[\big|\tfrac{1}{n}\|X\|^2 - \gamma\big|^r\Big]\Big)^{\frac{1}{r}} \qquad \text{(deviation of second moment)}$$

$$\beta_r(X) = \Big(\mathbb{E}\Big[\big|\tfrac{1}{n}\langle X, X'\rangle\big|^r\Big]\Big)^{\frac{1}{r}} \qquad \text{(measure of correlation)}$$

where $X'$ is an independent copy of $X$. In particular,

$$\beta_2(X) = \frac{1}{n}\sqrt{\sum_{i=1}^{n} \lambda_i^2\big(\mathbb{E}[XX^T]\big)} \;\le\; \frac{1}{\sqrt{n}}\,\lambda_{\max}\big(\mathbb{E}\big[XX^T\big]\big).$$
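A Monte Carlo sketch of these functionals (my own illustration; pairing two halves of the sample stands in for the independent copy $X'$):

```python
import numpy as np

def alpha_r(sample_X, r=1):
    """Estimate alpha_r(X) from iid rows of sample_X."""
    n = sample_X.shape[1]
    sq = (sample_X ** 2).sum(axis=1) / n
    gamma = sq.mean()
    return (np.abs(sq - gamma) ** r).mean() ** (1.0 / r)

def beta_r(sample_X, r=1):
    """Estimate beta_r(X): pair the two halves of the sample as (X, X')."""
    m, n = sample_X.shape
    X1, X2 = sample_X[: m // 2], sample_X[m // 2 : 2 * (m // 2)]
    inner = np.abs((X1 * X2).sum(axis=1) / n)
    return (inner ** r).mean() ** (1.0 / r)

rng = np.random.default_rng(3)
n = 400
X = rng.standard_normal((20_000, n))    # isotropic: E[X X^T] = I_n
# For isotropic X the spectral formula gives beta_2 = (1/n) sqrt(n) = 1/sqrt(n).
print("alpha_1 ~", alpha_r(X, 1))
print("beta_2  ~", beta_r(X, 2), "vs 1/sqrt(n) =", 1 / np.sqrt(n))
```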

Main results

$$Z = \Theta X, \qquad \Theta \in \mathbb{R}^{k \times n}$$

Theorem (CCLT for quadratic Wasserstein distance)
The quadratic Wasserstein distance between the conditional distribution of $Z$ given $\Theta$ and the Gaussian distribution with the same mean and covariance as $Z$ satisfies

$$\mathbb{E}_\Theta\big[W_2^2(P_{Z\mid\Theta}, G_Z)\big] \le C\Big(k\,\alpha(X) + k^{3/4}\,(\beta_1(X))^{1/2} + k\,(\beta_2(X))^{\frac{4}{k+4}}\Big).$$

Main results

$$Y = \Theta X + \sqrt{t}\, N, \qquad \Theta \in \mathbb{R}^{k \times n}$$

Theorem (Conditional CLT for relative entropy)
For all $\epsilon \in (0,1)$, the relative entropy between the conditional distribution of $Y$ given $\Theta$ and the Gaussian distribution with the same mean and covariance as $Y$ satisfies

$$\mathbb{E}_\Theta\big[D(P_{Y\mid\Theta}\,\big\|\,G_Y)\big] \le C\bigg(\frac{k\log\big(1 + \tfrac{1}{t}\big)\,\alpha(X)}{\epsilon} + k^{3/4}\,(\beta_1(X))^{1/2} + k^{1/4}\Big(1 + \frac{2+\epsilon}{t}\Big)^{\frac{k}{4}}\beta_2(X)\bigg).$$

Consequences of main results

• For any $n$-dimensional random vector $X$ satisfying

$$\alpha(X) \le \frac{C}{\sqrt{n}}, \qquad \beta_2(X) \le \frac{C}{\sqrt{n}},$$

the quadratic Wasserstein distance satisfies

$$\mathbb{E}_\Theta\big[W_2^2(P_{Z\mid\Theta}, G_Z)\big] \le C\Big(n^{-\frac{1}{4}} + k\, n^{-\frac{2}{k+4}}\Big).$$

• Rate-distortion lower bound: if $H(X) \le C \log n$, then

$$\mathbb{E}_\Theta\big[W_2^2(P_{Z\mid\Theta}, G_Z)\big] \ge k\, n^{-\frac{2}{k}}.$$

• This recovers the same scaling condition as Meckes 2012, with a stronger metric and weaker assumptions.
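A quick check of the exponents, treating $k$ as fixed: plugging $\alpha(X), \beta_2(X) \le C\,n^{-1/2}$ (and $\beta_1 \le \beta_2$, by monotonicity of $L^r$ norms) into the Wasserstein theorem gives

$$k\,\alpha(X) \lesssim k\,n^{-\frac{1}{2}}, \qquad k^{3/4}(\beta_1(X))^{1/2} \lesssim k^{3/4}\, n^{-\frac{1}{4}}, \qquad k\,(\beta_2(X))^{\frac{4}{k+4}} \lesssim k\, n^{-\frac{2}{k+4}},$$

and the first term is dominated by the second, leaving the stated bound $C\big(n^{-1/4} + k\,n^{-2/(k+4)}\big)$.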


Key steps in proof

1. Talagrand's transportation inequality:

$$W_2^2(P_{Y\mid\Theta}, G_Y) \le 2k(1+t)\, D\big(P_{Y\mid\Theta}\,\big\|\,G_Y\big)$$

2. Decomposition in terms of CLT and mutual information (derivation below):

$$\mathbb{E}_\Theta\big[D(P_{Y\mid\Theta}\,\big\|\,G_Y)\big] = D(P_Y\,\|\,G_Y) + I(Y;\Theta)$$

3. Two-moment inequality for mutual information (R. 2017):

$$I(X;Y) \le \frac{C_\lambda\,\sqrt{\omega(S)}\; V_{np}^{\lambda}(Y\mid X)\, V_{nq}^{1-\lambda}(Y\mid X)}{(q-p)}, \qquad q < 1 < p,$$

where

$$V_s(Y \mid X) = \int \|y\|^s \operatorname{Var}\big(p(y \mid X)\big)\, dy.$$
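The decomposition in step 2 is an add-and-subtract of $\log p(Y)$ inside the expectation:

$$\mathbb{E}_\Theta\big[D(P_{Y\mid\Theta}\,\|\,G_Y)\big] = \mathbb{E}\left[\log\frac{p(Y\mid\Theta)}{g(Y)}\right] = \underbrace{\mathbb{E}\left[\log\frac{p(Y\mid\Theta)}{p(Y)}\right]}_{I(Y;\,\Theta)} + \underbrace{\mathbb{E}\left[\log\frac{p(Y)}{g(Y)}\right]}_{D(P_Y\|G_Y)}.$$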

Moments of variance

$$V_s(Y \mid X) = \int \|y\|^s \operatorname{Var}\big(p(y \mid X)\big)\, dy = \mathbb{E}\Bigg[\bigg(\frac{1}{S_a - R}\bigg)^{\frac{k}{2}}\bigg(\frac{S_g^2 - R^2}{S_a - R}\bigg)^{\frac{s}{2}} - \bigg(\frac{1}{S_a}\bigg)^{\frac{k}{2}}\bigg(\frac{S_g^2}{S_a}\bigg)^{\frac{s}{2}}\Bigg]$$

with

$$S_a = t + \frac{1}{2n}\|X_1\|^2 + \frac{1}{2n}\|X_2\|^2, \qquad S_g = \sqrt{\Big(t + \frac{1}{n}\|X_1\|^2\Big)\Big(t + \frac{1}{n}\|X_2\|^2\Big)}, \qquad R = \frac{1}{n}\langle X_1, X_2\rangle,$$

where $X_1$ and $X_2$ are independent copies of $X$.
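Taking the representation above at face value, $V_s$ can be estimated by Monte Carlo over paired copies of $X$. A sketch with illustrative parameters (note $S_a - R = t + \tfrac{1}{2n}\|X_1 - X_2\|^2 > 0$ and $R^2 \le S_g^2$ by Cauchy-Schwarz, so the powers are well defined):

```python
import numpy as np

def Vs_mc(sample_X, t, k, s, rng):
    """Monte Carlo estimate of V_s(Y|X) via the (S_a, S_g, R) representation."""
    m, n = sample_X.shape
    X1 = sample_X[rng.permutation(m)]     # paired draws standing in for
    X2 = sample_X[rng.permutation(m)]     # independent copies X_1, X_2
    s1 = (X1 ** 2).sum(axis=1) / n
    s2 = (X2 ** 2).sum(axis=1) / n
    Sa = t + 0.5 * s1 + 0.5 * s2
    Sg = np.sqrt((t + s1) * (t + s2))
    R = (X1 * X2).sum(axis=1) / n
    term1 = (Sa - R) ** (-k / 2) * ((Sg ** 2 - R ** 2) / (Sa - R)) ** (s / 2)
    term2 = Sa ** (-k / 2) * (Sg ** 2 / Sa) ** (s / 2)
    return (term1 - term2).mean()

rng = np.random.default_rng(4)
X = rng.standard_normal((50_000, 200))
print(Vs_mc(X, t=1.0, k=4, s=2, rng=rng))
```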


Conclusion

• The conditional CLT has many applications:
  • Projection pursuit; measure-theoretic Dvoretzky theorem
  • Random coding arguments in information theory
  • Phase transitions in compressed sensing (R. & Pfister 2016)
  • Approximate inference methods based on message passing

• The main results are bounds on the quadratic Wasserstein distance and relative entropy in terms of moments of the distribution.

• The proof uses:
  • Talagrand's transportation inequality
  • Decomposition into CLT and mutual information
  • Two-moment inequality for mutual information (R. 2017)

References

G. Reeves, "Conditional central limit theorems for Gaussian projections," Dec. 2016. [Online]. Available: https://arxiv.org/abs/1612.09252

——, "Conditional central limit theorems for Gaussian projections," in Proc. IEEE Int. Symp. Inform. Theory, Aachen, Germany, Jun. 2017.

——, "Two-moment inequalities for Rényi entropy and mutual information," 2017. [Online]. Available: https://arxiv.org/abs/1702.07302

——, "Two-moment inequalities for Rényi entropy and mutual information," in Proc. IEEE Int. Symp. Inform. Theory, Aachen, Germany, Jun. 2017.

P. J. Huber, "Projection pursuit," The Annals of Statistics, vol. 13, no. 2, pp. 435–475, 1985.

P. Diaconis and D. Freedman, "Asymptotics of graphical projection pursuit," The Annals of Statistics, vol. 12, no. 3, pp. 793–815, 1984.

R. Vershynin, "Estimation in high dimensions: A geometric perspective," December 2, 2014. [Online]. Available: http://www-personal.umich.edu/~romanv/papers/estimation-tutorial.pdf

V. N. Sudakov, "Typical distributions of linear functionals in finite-dimensional spaces of high dimension," Soviet Math. Doklady, vol. 16, no. 6, pp. 1578–1582, 1978.

P. Hall and K.-C. Li, "On almost linearity of low dimensional projections from high dimensional data," The Annals of Statistics, vol. 21, no. 2, pp. 867–889, 1993.

H. von Weizsäcker, "Sudakov's typical marginals, random linear functionals and a conditional central limit theorem," Probability Theory and Related Fields, vol. 107, no. 3, pp. 313–324, 1997.

M. Anttila, K. Ball, and I. Perissinaki, "The central limit problem for convex bodies," Transactions of the American Mathematical Society, vol. 355, no. 12, pp. 4723–4735, 2003.

S. G. Bobkov, "On concentration of distributions of random weighted sums," The Annals of Probability, vol. 31, no. 1, pp. 195–215, 2003.

A. Naor and D. Romik, "Projecting the surface measure of the sphere of ℓ_p^n," Annales de l'Institut Henri Poincaré (B) Probability and Statistics, vol. 39, no. 2, pp. 241–246, 2003.

B. Klartag, "A central limit theorem for convex sets," Inventiones mathematicae, vol. 168, no. 1, pp. 91–131, Apr. 2007.

E. Meckes, "Approximation of projections of random vectors," Journal of Theoretical Probability, vol. 25, no. 2, pp. 333–352, 2010.

——, "Projections of probability distributions: A measure-theoretic Dvoretzky theorem," in Geometric Aspects of Functional Analysis, ser. Lecture Notes in Mathematics. Springer, 2012, vol. 2050, pp. 317–326.

L. Dümbgen and P. D. Conte-Zerial, "On low-dimensional projections of high-dimensional distributions," in From Probability to Statistics and Back: High-Dimensional Models and Processes – A Festschrift in Honor of Jon A. Wellner. Institute of Mathematical Statistics Collections, 2013, vol. 9, pp. 91–104.

H. Leeb, "On the conditional distributions of low-dimensional projections from high-dimensional data," The Annals of Statistics, vol. 41, no. 2, pp. 464–483, 2013.

S. Verdú and S. Shamai, "Spectral efficiency of CDMA with random spreading," IEEE Trans. Inform. Theory, vol. 45, pp. 622–640, Mar. 1999.

D. N. C. Tse and S. Hanly, "Linear multiuser receivers: Effective interference, effective bandwidth and user capacity," IEEE Trans. Inform. Theory, vol. 45, pp. 641–657, Mar. 1999.

A. Montanari and D. Tse, "Analysis of belief propagation for non-linear problems: The example of CDMA (or: How to prove Tanaka's formula)," in Proc. IEEE Inform. Theory Workshop, Punta del Este, Uruguay, 2006, pp. 160–164.

S. B. Korada and N. Macris, "Tight bounds on the capacity of binary input random CDMA systems," IEEE Trans. Inform. Theory, vol. 56, no. 11, pp. 5590–5613, Nov. 2010.

M. Bayati and A. Montanari, "The dynamics of message passing on dense graphs, with applications to compressed sensing," IEEE Trans. Inform. Theory, vol. 57, no. 2, pp. 764–785, Feb. 2011.

G. Reeves and M. Gastpar, "The sampling rate-distortion tradeoff for sparsity pattern recovery in compressed sensing," IEEE Trans. Inform. Theory, vol. 58, no. 5, pp. 3065–3092, May 2012.

Y. Wu and S. Verdú, "Optimal phase transitions in compressed sensing," IEEE Trans. Inform. Theory, vol. 58, no. 10, pp. 6241–6263, Oct. 2012.

F. Krzakala, M. Mézard, F. Sausset, Y. F. Sun, and L. Zdeborová, "Statistical-physics-based reconstruction in compressed sensing," Physical Review X, vol. 2, no. 2, May 2012.

D. L. Donoho, A. Javanmard, and A. Montanari, "Information-theoretically optimal compressed sensing via spatial coupling and approximate message passing," IEEE Trans. Inform. Theory, vol. 59, no. 11, pp. 7434–7464, Jul. 2013.

W. Huleihel and N. Merhav, "Asymptotic MMSE analysis under sparse representation modeling," Signal Processing, vol. 131, pp. 320–332, 2017.

G. Reeves and H. D. Pfister, "The replica-symmetric prediction for compressed sensing with Gaussian matrices is exact," Jul. 2016. [Online]. Available: https://arxiv.org/abs/1607.02524

——, "The replica-symmetric prediction for compressed sensing with Gaussian matrices is exact," in Proc. IEEE Int. Symp. Inform. Theory, Barcelona, Spain, Jul. 2016, pp. 665–669.

J. Barbier, M. Dia, N. Macris, and F. Krzakala, "The mutual information in random linear estimation," in Proc. Annual Allerton Conf. on Commun., Control, and Comp., Monticello, IL, 2016. [Online]. Available: http://arxiv.org/abs/1607.02335

Page 2: Conditional Central Limit Theorems for Gaussian Projections

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

2 27

Conditional CLT for random projections

random vectorX isin Rn

low-dimensional projectionZ = ΘX Θ isin Rktimesn

Let Θ be IID Gaussian projection matrix Let GZ be Gaussian withsame mean and covariance as Z

PZ asymp GZ CLT

PZ|Θ(middot |Θ) asymp GZ Conditional CLT

3 27

Projection pursuit in statistics

I Dataset is collection of high-dimensional vectors

x1 x2 middot middot middot xM

I Compute low-dimensional projections in direction Θ

Θx1Θx2 middot middot middot ΘxM

I Huber 1981 suggests that a direction Θ is lsquointerestingrsquo ifempirical distribution of projected data is far from Gaussian

I Diaconis amp Freedman 1984 provide negative result Almost allprojections are close to Gaussian provided that

1M

Msumi=1

∣∣ 1nxi

2 minus γ∣∣ asymp 0

deviation of second moments

1M

Msumi=1

1n |xi middot xj | asymp 0

measure of correlation

4 27

Illustration Projections of cubes

Figure Cubes of dimensions 3ndash9 [Buja et al 1996]

5 27

Measure-theoretic Dvoretzky Theorem

Theorem (Dvoretzky - Milman)

Random d-dimensional sections of a symmetric convex body in Rnare approximately spherical if d = O(log n)

Figure Random section of high-dimensional convex set [Vershynin 2014]

6 27

Related work

I CLT for low-dimensional projections Maxwell 1875 Borel 1914

Poincare 1912 Diaconis amp Freedman 1984 Johnson 2003

I Conditional CLT Sudakov 1978 Diaconis amp Freedman 1984 Hall

amp Li 1993 Weizacker 1997 Anttila et al 2003 Bobkov 2003 Naor

amp Romik 2003 Dasgupta et al 2006 Klartag 2007 Meckes 2010

Meckes 2012 Dumbgen amp Conte-Zerial 2013 Leeb 2013

I Meckes 2012 provides explicit convergence rates in terms ofbounded-Lipschitz metric Under second moment constraintson X she shows that

k le (2minus ε) log n

log log n=rArr dBL(PZ|Θ GZ)rarr 0

I Dumbgen amp Conte-Zerial 2013 prove necessary conditionsmatching sufficient conditions of Diaconis amp Freedman 1984

7 27

Random coding arguments in information theory

I Random vector X is supported on the standard basis vectors

e1 middot middot middot en

I Random vector Z = ΘX is supported on columns of matrix

Θ1 middot middot middotΘn

This is an IID Gaussian codebook of rate R = 1k log n

8 27

Source coding with random codebook

I Optimal distortion for Gaussian source N (0 σ2) given bydistortion rate function

D(R) = σ2 exp(minus2R)

I Quadratic Wasserstein distance between Z | Θ andW sim N (0 σ2Ik) corresponds to distortion of codebook

W 22

(PZ|Θ PW

)ge E

[miniisin[n]Θi minusW2 | Θ

]optimal coupling

ge D(R) optimality of DRF

= k σ2 nminus2k R = 1

k log n

I Inequalities are tight up to o(k) for large (k n)

9 27

Channel coding with random codebook

I Gaussian channel

Yi = Zi +radictNi 1 le i le k

I Expected relative entropy between Z | Θ and GY correspondsto gap between channel capacity C = 1

2 log(1 + snr) andmutual information

[D(PY |Θ

∥∥GY )] = k C minus I(XY |Θ)

= k(C minusR

)+

+ o(k)

10 27

Random linear estimation compressed sensing CDMA

I X is unknown signal with second moment γ = E[

1nX

2]

I Θ is measurement matrix with sampling rate δ = kn

I Noisy measurements

Y = Z +radictN Z = ΘX

I Mutual information satisfies

I(XY |Θ) =k

2log(

1 +γ

t

)minus EΘ

[D(PY |Θ

∥∥GY )]Hence compressed sensing is optimal in the sense of mutualinformation if and only if CCLT holds

11 27

Fundamental limits of random linear estimation

I Guo amp Verdu 2005 derive formulas for mutual information andMMSE using heuristic replica method from statistical physics

I Rigorous results for special cases Verdu amp Shamai 1999 Tse amp

Hanly 1999 Montanari amp Tse 2006 Korada amp Macris 2010 Bayati

amp Montanari 20011 R amp Gastpar 2012 Wu amp Verdu 2012

Krzakala et al 2012 Donoho et al 2013 Huleihel amp Merhav 2017

I R amp Pfister 2016 prove that replica symmetric formulas arecorrect Proof focuses on new measurements

Ym+1 = ΘmX +radictNm+1 Θm isin R1timesn X sim PX|YmΘm

I Related results obtain independently by Barbier et al 2016

12 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions

I X has finite second moment E[

1nX

2]

= 1

I Θ has IID Gaussian entries N (0 1n)

Define

αr(X) =(E[∣∣ 1nX

2 minus γ∣∣r]) 1

r

deviation of second moment

βr(X) =(E[

1n

∣∣X middotX prime∣∣r]) 1r

measure of correlation

β2(X) =1

n

radicradicradicradic nsumi=1

λ2i (E[XXT ]) le 1radic

nλmax(E

[XXT

])

14 27

Main results

Z = ΘX Θ isin Rktimesn

Theorem (CCLT for Quadratic Wasserstein distance)

The quadratic Wasserstein distance satisfies between theconditional distribution of Z given Θ and the Gaussian distributionwith the same mean and covariance as Z satisfies

[W 2

2 (PZ|Θ GZ)]

le C(k α(X) + k

34 (β1(X))

12 + k(β2(X))

4k+4

)

15 27

Main results

Y = ΘX +radictN Θ isin Rktimesn

Theorem (Conditional CLT for relative entropy)

For all ε isin (0 1) the relative entropy between the conditionaldistribution of Y given Θ and the Gaussian distribution with thesame mean and covariance as Y satisfies

[D(PY |Θ

∥∥GY )]le C

(k log

(1+ 1

t

)α(X)

ε+ k

34 (β1(X))

12 + k

14

(1+ (2+ε)

t

) k4β2(X)

)

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying

α(X) le Cradicn β2(X) le Cradic

n

the quadratic Wasserstein distance satisfies

[W 2

2 (PZ|Θ GZ)]le C

(nminus

14 + k nminus

2k+4

)

I Rate-distortion lower bond If H(X) le C log n then

[W 2

2 (PZ|Θ GZ)]ge k nminus

2k

I Recovers same scaling condition as Meckes 2012 with strongermetric and weaker assumptions

17 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1 Talagrandrsquos transportation inequality

W 22 (PY |Θ GY ) le 2k(1 + t)D

(PY |Θ

∥∥GY )2 Decomposition in terms of CLT and mutual information

[D(PY |Θ

∥∥GY )] = D(PY GY ) + I(Y Θ)

3 Two-moment inequality for mutual information R 2017

I(XY ) le Cλ

radicω(S)V λ

np(Y |X)V 1minusλnq (Y |X)

(q minus p) q lt 1 lt p

Vs(Y |X) =

intys Var(p(y |X)) dy

19 27

Moments of variance

Vs(Y |X) =

intys Var(p(y |X)) dy

= E

[(1

SaminusR

) k2(S2gminusR2

Sa minusR

)s2

minus(

1

Sa

)k2(S2g

Sa

)s2

]

with

Sa = t+1

2nX12 +

1

2nX22

Sg =

radic(t+

1

nX12

)(t+

1

nX22

)R =

1

n〈X1 X2〉

where X1 and X2 are independent copies of X

20 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

21 27

Conclusion

I Conditional CLT has many applicationsI Projection pursuit Measure-theoretic Dvoretzky TheoremI Random coding arguments in information theoryI Phase transitions in compressed sensing R amp Pfister 2016I Approximate inference methods based on message passing

I Main results are bounds on quadratic Wasserstein distanceand relative entropy in terms of moments of distribution

I Proof usesI Talagrandrsquos transportation inequalityI Decomposition into CLT and mutual informationI Two-moment inequality for mutual information R 2017

22 27

References I

G Reeves ldquoConditional central limit theorems for Gaussian projectionsrdquo Dec2016 [Online] Available httpsarxivorgabs161209252

mdashmdash ldquoConditional central limit theorems for Gaussian projectionsrdquo in ProcIEEE Int Symp Inform Theory Aachen Germany Jun 2017

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo2017 [Online] Available httpsarxivorgabs170207302

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo inProc IEEE Int Symp Inform Theory Aachen Germany Jun 2017

P J Huber ldquoProjection pursuitrdquo The Annals of Statistics vol 13 no 2 pp435ndash475 1985

P Diaconis and D Freedman ldquoAsymptotics of graphical projection pursuitrdquoThe Annals of Statistics vol 12 no 3 pp 793ndash815 1984

R Vershynin ldquoEstimation in high dimensions A geometric perspectiverdquoDecember 2 2014 [Online] Availablehttpwww-personalumichedusimromanvpapersestimation-tutorialpdf

V N Sudakov ldquoTypical distributions of linear functionals in finite-dimensionalspaces of high dimensionrdquo Soviet Math Doklady vol 16 no 6 pp 1578ndash15821978

23 27

References II

P Hall and K-C Li ldquoOn almost linearity of low dimensional projections fromhigh dimensional datardquo The Annals of Statistics vol 21 no 2 pp 867ndash8891993

H von Weizsacker ldquoSudakovrsquos typical marginals random linear functionals anda conditional central limit theoremrdquo Probability Theory and Related Fields vol107 no 3 pp 313ndash324 1997

M Anttila K Ball and I Perissinaki ldquoThe central limit problem for convexbodiesrdquo Transactions of the American Mathematical Society vol 355 no 12pp 4723ndash4735 2003

S G Bobkov ldquoOn concentration of distributions of random weighted sumsrdquoThe Annals of Probability vol 31 no 1 pp 195ndash215 2003

A Naor and D Romik ldquoProjecting the surface measure of the sphere of `np rdquoAnnales de lrsquoInstitut Henri Poincare (B) Probability and Statistics vol 39 no 2pp 241ndash246 2003

B Klartag ldquoA central limit theorem for convex setsrdquo Inventiones mathematicaevol 168 no 1 pp 91ndash131 Apr 2007

E Meckes ldquoApproximation of projections of random vectorsrdquo Journal ofTheoretical Probability vol 25 no 2 pp 333ndash352 2010

24 27

References III

mdashmdash ldquoProjections of probability distributions A measure-theoretic Dvoretzkytheoremrdquo in Geometric Aspects of Functional Analysis ser Lecture Notes inMathematics Springer 2012 vol 2050 pp 317ndash326

L Dumbgen and P D Conte-Zerial ldquoOn low-dimensional projections ofhigh-dimensional distributionsrdquo in From Probability to Statistics and BackHigh-Dimensional Models and Processes ndash A Festschrift in Honor of Jon AWellner Institute of Mathematical Statistics Collections 2013 vol 9 pp91ndash104

H Leeb ldquoOn the conditional distributions of low-dimensional projections fromhigh-dimensional datardquo The Annals of Statistics vol 41 no 2 pp 464ndash4832013

S Verdu and S Shamai ldquoSpectral efficiency of cdma with random spreadingrdquoIEEE Trans Inform Theory vol 45 pp 622ndash640 Mar 1999

D N C Tse and S Hanly ldquoLinear multiuser receivers Effective interferenceeffective bandwith and user capacityrdquo IEEE Trans Inform Theory vol 45 pp641ndash657 Mar 1999

A Montanari and D Tse ldquoAnalysis of belief propagation for non-linear problemsThe example of CDMA (or How to prove Tanakarsquos formula)rdquo in Proc IEEEInform Theory Workshop Punta del Este Uruguay 2006 pp 160ndash164

25 27

References IV

S B Korada and N Macris ldquoTight bounds on the capicty of binary inputrandom CDMA systemsrdquo IEEE Trans Inform Theory vol 56 no 11 pp5590ndash5613 Nov 2010

M Bayati and A Montanari ldquoThe dynamics of message passing on densegraphs with applications to compressed sensingrdquo IEEE Trans Inform Theoryvol 57 no 2 pp 764ndash785 Feb 2011

G Reeves and M Gastpar ldquoThe sampling rate-distortion tradeoff for sparsitypattern recovery in compressed sensingrdquo IEEE Trans Inform Theory vol 58no 5 pp 3065ndash3092 May 2012

Y Wu and S Verdu ldquoOptimal phase transitions in compressed sensingrdquo IEEETrans Inform Theory vol 58 no 10 pp 6241 ndash 6263 Oct 2012

F Krzakala M Mezard F Sausset Y F Sun and L ZdeborovaldquoStatistical-physics-based reconstruction in compressed sensingrdquo PhysicalReview X vol 2 no 2 May 2012

D L Donoho A Javanmard and A Montanari ldquoInformation-theoreticallyoptimal compressed sensing via spatial coupling and approximate messagepassingrdquo IEEE Trans Inform Theory vol 59 no 11 pp 7434ndash7464 Jul 2013

W Huleihel and N Merhav ldquoAsymptotic MMSE analysis under sparserepresentation modelingrdquo Signal Processing vol 131 pp 320ndash332 2017

26 27

References V

G Reeves and H D Pfister ldquoThe replica-symmetric prediction for compressedsensing with Gaussian matrices is exactrdquo Jul 2016 [Online] Availablehttpsarxivorgabs160702524

mdashmdash ldquoThe replica-symmetric prediction for compressed sensing with Gaussianmatrices is exactrdquo in Proc IEEE Int Symp Inform Theory Barcelona SpainJul 2016 pp 665 ndash 669

J Barbier M Dia N Macris and F Krzakala ldquoThe mutual information inrandom linear estimationrdquo in Proc Annual Allerton Conf on Commun Controland Comp Monticello IL 2016 [Online] Availablehttparxivorgabs160702335

27 27

  • Motivation
    • Statistics
    • Information Theory
    • Random linear estimation
      • Results
      • Proof outline
      • Conclusion
Page 3: Conditional Central Limit Theorems for Gaussian Projections

Conditional CLT for random projections

random vectorX isin Rn

low-dimensional projectionZ = ΘX Θ isin Rktimesn

Let Θ be IID Gaussian projection matrix Let GZ be Gaussian withsame mean and covariance as Z

PZ asymp GZ CLT

PZ|Θ(middot |Θ) asymp GZ Conditional CLT

3 27

Projection pursuit in statistics

I Dataset is collection of high-dimensional vectors

x1 x2 middot middot middot xM

I Compute low-dimensional projections in direction Θ

Θx1Θx2 middot middot middot ΘxM

I Huber 1981 suggests that a direction Θ is lsquointerestingrsquo ifempirical distribution of projected data is far from Gaussian

I Diaconis amp Freedman 1984 provide negative result Almost allprojections are close to Gaussian provided that

1M

Msumi=1

∣∣ 1nxi

2 minus γ∣∣ asymp 0

deviation of second moments

1M

Msumi=1

1n |xi middot xj | asymp 0

measure of correlation

4 27

Illustration Projections of cubes

Figure Cubes of dimensions 3ndash9 [Buja et al 1996]

5 27

Measure-theoretic Dvoretzky Theorem

Theorem (Dvoretzky - Milman)

Random d-dimensional sections of a symmetric convex body in Rnare approximately spherical if d = O(log n)

Figure Random section of high-dimensional convex set [Vershynin 2014]

6 27

Related work

I CLT for low-dimensional projections Maxwell 1875 Borel 1914

Poincare 1912 Diaconis amp Freedman 1984 Johnson 2003

I Conditional CLT Sudakov 1978 Diaconis amp Freedman 1984 Hall

amp Li 1993 Weizacker 1997 Anttila et al 2003 Bobkov 2003 Naor

amp Romik 2003 Dasgupta et al 2006 Klartag 2007 Meckes 2010

Meckes 2012 Dumbgen amp Conte-Zerial 2013 Leeb 2013

I Meckes 2012 provides explicit convergence rates in terms ofbounded-Lipschitz metric Under second moment constraintson X she shows that

k le (2minus ε) log n

log log n=rArr dBL(PZ|Θ GZ)rarr 0

I Dumbgen amp Conte-Zerial 2013 prove necessary conditionsmatching sufficient conditions of Diaconis amp Freedman 1984

7 27

Random coding arguments in information theory

I Random vector X is supported on the standard basis vectors

e1 middot middot middot en

I Random vector Z = ΘX is supported on columns of matrix

Θ1 middot middot middotΘn

This is an IID Gaussian codebook of rate R = 1k log n

8 27

Source coding with random codebook

I Optimal distortion for Gaussian source N (0 σ2) given bydistortion rate function

D(R) = σ2 exp(minus2R)

I Quadratic Wasserstein distance between Z | Θ andW sim N (0 σ2Ik) corresponds to distortion of codebook

W 22

(PZ|Θ PW

)ge E

[miniisin[n]Θi minusW2 | Θ

]optimal coupling

ge D(R) optimality of DRF

= k σ2 nminus2k R = 1

k log n

I Inequalities are tight up to o(k) for large (k n)

9 27

Channel coding with random codebook

I Gaussian channel

Yi = Zi +radictNi 1 le i le k

I Expected relative entropy between Z | Θ and GY correspondsto gap between channel capacity C = 1

2 log(1 + snr) andmutual information

[D(PY |Θ

∥∥GY )] = k C minus I(XY |Θ)

= k(C minusR

)+

+ o(k)

10 27

Random linear estimation compressed sensing CDMA

I X is unknown signal with second moment γ = E[

1nX

2]

I Θ is measurement matrix with sampling rate δ = kn

I Noisy measurements

Y = Z +radictN Z = ΘX

I Mutual information satisfies

I(XY |Θ) =k

2log(

1 +γ

t

)minus EΘ

[D(PY |Θ

∥∥GY )]Hence compressed sensing is optimal in the sense of mutualinformation if and only if CCLT holds

11 27

Fundamental limits of random linear estimation

I Guo amp Verdu 2005 derive formulas for mutual information andMMSE using heuristic replica method from statistical physics

I Rigorous results for special cases Verdu amp Shamai 1999 Tse amp

Hanly 1999 Montanari amp Tse 2006 Korada amp Macris 2010 Bayati

amp Montanari 20011 R amp Gastpar 2012 Wu amp Verdu 2012

Krzakala et al 2012 Donoho et al 2013 Huleihel amp Merhav 2017

I R amp Pfister 2016 prove that replica symmetric formulas arecorrect Proof focuses on new measurements

Ym+1 = ΘmX +radictNm+1 Θm isin R1timesn X sim PX|YmΘm

I Related results obtain independently by Barbier et al 2016

12 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions

I X has finite second moment E[

1nX

2]

= 1

I Θ has IID Gaussian entries N (0 1n)

Define

αr(X) =(E[∣∣ 1nX

2 minus γ∣∣r]) 1

r

deviation of second moment

βr(X) =(E[

1n

∣∣X middotX prime∣∣r]) 1r

measure of correlation

β2(X) =1

n

radicradicradicradic nsumi=1

λ2i (E[XXT ]) le 1radic

nλmax(E

[XXT

])

14 27

Main results

Z = ΘX Θ isin Rktimesn

Theorem (CCLT for Quadratic Wasserstein distance)

The quadratic Wasserstein distance satisfies between theconditional distribution of Z given Θ and the Gaussian distributionwith the same mean and covariance as Z satisfies

[W 2

2 (PZ|Θ GZ)]

le C(k α(X) + k

34 (β1(X))

12 + k(β2(X))

4k+4

)

15 27

Main results

Y = ΘX +radictN Θ isin Rktimesn

Theorem (Conditional CLT for relative entropy)

For all ε isin (0 1) the relative entropy between the conditionaldistribution of Y given Θ and the Gaussian distribution with thesame mean and covariance as Y satisfies

[D(PY |Θ

∥∥GY )]le C

(k log

(1+ 1

t

)α(X)

ε+ k

34 (β1(X))

12 + k

14

(1+ (2+ε)

t

) k4β2(X)

)

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying

α(X) le Cradicn β2(X) le Cradic

n

the quadratic Wasserstein distance satisfies

[W 2

2 (PZ|Θ GZ)]le C

(nminus

14 + k nminus

2k+4

)

I Rate-distortion lower bond If H(X) le C log n then

[W 2

2 (PZ|Θ GZ)]ge k nminus

2k

I Recovers same scaling condition as Meckes 2012 with strongermetric and weaker assumptions

17 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1 Talagrandrsquos transportation inequality

W 22 (PY |Θ GY ) le 2k(1 + t)D

(PY |Θ

∥∥GY )2 Decomposition in terms of CLT and mutual information

[D(PY |Θ

∥∥GY )] = D(PY GY ) + I(Y Θ)

3 Two-moment inequality for mutual information R 2017

I(XY ) le Cλ

radicω(S)V λ

np(Y |X)V 1minusλnq (Y |X)

(q minus p) q lt 1 lt p

Vs(Y |X) =

intys Var(p(y |X)) dy

19 27

Moments of variance

Vs(Y |X) =

intys Var(p(y |X)) dy

= E

[(1

SaminusR

) k2(S2gminusR2

Sa minusR

)s2

minus(

1

Sa

)k2(S2g

Sa

)s2

]

with

Sa = t+1

2nX12 +

1

2nX22

Sg =

radic(t+

1

nX12

)(t+

1

nX22

)R =

1

n〈X1 X2〉

where X1 and X2 are independent copies of X

20 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

21 27

Conclusion

I Conditional CLT has many applicationsI Projection pursuit Measure-theoretic Dvoretzky TheoremI Random coding arguments in information theoryI Phase transitions in compressed sensing R amp Pfister 2016I Approximate inference methods based on message passing

I Main results are bounds on quadratic Wasserstein distanceand relative entropy in terms of moments of distribution

I Proof usesI Talagrandrsquos transportation inequalityI Decomposition into CLT and mutual informationI Two-moment inequality for mutual information R 2017

22 27

References I

G Reeves ldquoConditional central limit theorems for Gaussian projectionsrdquo Dec2016 [Online] Available httpsarxivorgabs161209252

mdashmdash ldquoConditional central limit theorems for Gaussian projectionsrdquo in ProcIEEE Int Symp Inform Theory Aachen Germany Jun 2017

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo2017 [Online] Available httpsarxivorgabs170207302

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo inProc IEEE Int Symp Inform Theory Aachen Germany Jun 2017

P J Huber ldquoProjection pursuitrdquo The Annals of Statistics vol 13 no 2 pp435ndash475 1985

P Diaconis and D Freedman ldquoAsymptotics of graphical projection pursuitrdquoThe Annals of Statistics vol 12 no 3 pp 793ndash815 1984

R Vershynin ldquoEstimation in high dimensions A geometric perspectiverdquoDecember 2 2014 [Online] Availablehttpwww-personalumichedusimromanvpapersestimation-tutorialpdf

V N Sudakov ldquoTypical distributions of linear functionals in finite-dimensionalspaces of high dimensionrdquo Soviet Math Doklady vol 16 no 6 pp 1578ndash15821978

23 27

References II

P Hall and K-C Li ldquoOn almost linearity of low dimensional projections fromhigh dimensional datardquo The Annals of Statistics vol 21 no 2 pp 867ndash8891993

H von Weizsacker ldquoSudakovrsquos typical marginals random linear functionals anda conditional central limit theoremrdquo Probability Theory and Related Fields vol107 no 3 pp 313ndash324 1997

M Anttila K Ball and I Perissinaki ldquoThe central limit problem for convexbodiesrdquo Transactions of the American Mathematical Society vol 355 no 12pp 4723ndash4735 2003

S G Bobkov ldquoOn concentration of distributions of random weighted sumsrdquoThe Annals of Probability vol 31 no 1 pp 195ndash215 2003

A Naor and D Romik ldquoProjecting the surface measure of the sphere of `np rdquoAnnales de lrsquoInstitut Henri Poincare (B) Probability and Statistics vol 39 no 2pp 241ndash246 2003

B Klartag ldquoA central limit theorem for convex setsrdquo Inventiones mathematicaevol 168 no 1 pp 91ndash131 Apr 2007

E Meckes ldquoApproximation of projections of random vectorsrdquo Journal ofTheoretical Probability vol 25 no 2 pp 333ndash352 2010

24 27

References III

mdashmdash ldquoProjections of probability distributions A measure-theoretic Dvoretzkytheoremrdquo in Geometric Aspects of Functional Analysis ser Lecture Notes inMathematics Springer 2012 vol 2050 pp 317ndash326

L Dumbgen and P D Conte-Zerial ldquoOn low-dimensional projections ofhigh-dimensional distributionsrdquo in From Probability to Statistics and BackHigh-Dimensional Models and Processes ndash A Festschrift in Honor of Jon AWellner Institute of Mathematical Statistics Collections 2013 vol 9 pp91ndash104

H Leeb ldquoOn the conditional distributions of low-dimensional projections fromhigh-dimensional datardquo The Annals of Statistics vol 41 no 2 pp 464ndash4832013

S Verdu and S Shamai ldquoSpectral efficiency of cdma with random spreadingrdquoIEEE Trans Inform Theory vol 45 pp 622ndash640 Mar 1999

D N C Tse and S Hanly ldquoLinear multiuser receivers Effective interferenceeffective bandwith and user capacityrdquo IEEE Trans Inform Theory vol 45 pp641ndash657 Mar 1999

A Montanari and D Tse ldquoAnalysis of belief propagation for non-linear problemsThe example of CDMA (or How to prove Tanakarsquos formula)rdquo in Proc IEEEInform Theory Workshop Punta del Este Uruguay 2006 pp 160ndash164

25 27

References IV

S B Korada and N Macris ldquoTight bounds on the capicty of binary inputrandom CDMA systemsrdquo IEEE Trans Inform Theory vol 56 no 11 pp5590ndash5613 Nov 2010

M Bayati and A Montanari ldquoThe dynamics of message passing on densegraphs with applications to compressed sensingrdquo IEEE Trans Inform Theoryvol 57 no 2 pp 764ndash785 Feb 2011

G Reeves and M Gastpar ldquoThe sampling rate-distortion tradeoff for sparsitypattern recovery in compressed sensingrdquo IEEE Trans Inform Theory vol 58no 5 pp 3065ndash3092 May 2012

Y Wu and S Verdu ldquoOptimal phase transitions in compressed sensingrdquo IEEETrans Inform Theory vol 58 no 10 pp 6241 ndash 6263 Oct 2012

F Krzakala M Mezard F Sausset Y F Sun and L ZdeborovaldquoStatistical-physics-based reconstruction in compressed sensingrdquo PhysicalReview X vol 2 no 2 May 2012

D L Donoho A Javanmard and A Montanari ldquoInformation-theoreticallyoptimal compressed sensing via spatial coupling and approximate messagepassingrdquo IEEE Trans Inform Theory vol 59 no 11 pp 7434ndash7464 Jul 2013

W Huleihel and N Merhav ldquoAsymptotic MMSE analysis under sparserepresentation modelingrdquo Signal Processing vol 131 pp 320ndash332 2017

26 27

References V

G Reeves and H D Pfister ldquoThe replica-symmetric prediction for compressedsensing with Gaussian matrices is exactrdquo Jul 2016 [Online] Availablehttpsarxivorgabs160702524

mdashmdash ldquoThe replica-symmetric prediction for compressed sensing with Gaussianmatrices is exactrdquo in Proc IEEE Int Symp Inform Theory Barcelona SpainJul 2016 pp 665 ndash 669

J Barbier M Dia N Macris and F Krzakala ldquoThe mutual information inrandom linear estimationrdquo in Proc Annual Allerton Conf on Commun Controland Comp Monticello IL 2016 [Online] Availablehttparxivorgabs160702335

27 27

  • Motivation
    • Statistics
    • Information Theory
    • Random linear estimation
      • Results
      • Proof outline
      • Conclusion
Page 4: Conditional Central Limit Theorems for Gaussian Projections

Projection pursuit in statistics

I Dataset is collection of high-dimensional vectors

x1 x2 middot middot middot xM

I Compute low-dimensional projections in direction Θ

Θx1Θx2 middot middot middot ΘxM

I Huber 1981 suggests that a direction Θ is lsquointerestingrsquo ifempirical distribution of projected data is far from Gaussian

I Diaconis amp Freedman 1984 provide negative result Almost allprojections are close to Gaussian provided that

1M

Msumi=1

∣∣ 1nxi

2 minus γ∣∣ asymp 0

deviation of second moments

1M

Msumi=1

1n |xi middot xj | asymp 0

measure of correlation

4 27

Illustration Projections of cubes

Figure Cubes of dimensions 3ndash9 [Buja et al 1996]

5 27

Measure-theoretic Dvoretzky Theorem

Theorem (Dvoretzky - Milman)

Random d-dimensional sections of a symmetric convex body in Rnare approximately spherical if d = O(log n)

Figure Random section of high-dimensional convex set [Vershynin 2014]

6 27

Related work

I CLT for low-dimensional projections Maxwell 1875 Borel 1914

Poincare 1912 Diaconis amp Freedman 1984 Johnson 2003

I Conditional CLT Sudakov 1978 Diaconis amp Freedman 1984 Hall

amp Li 1993 Weizacker 1997 Anttila et al 2003 Bobkov 2003 Naor

amp Romik 2003 Dasgupta et al 2006 Klartag 2007 Meckes 2010

Meckes 2012 Dumbgen amp Conte-Zerial 2013 Leeb 2013

I Meckes 2012 provides explicit convergence rates in terms ofbounded-Lipschitz metric Under second moment constraintson X she shows that

k le (2minus ε) log n

log log n=rArr dBL(PZ|Θ GZ)rarr 0

I Dumbgen amp Conte-Zerial 2013 prove necessary conditionsmatching sufficient conditions of Diaconis amp Freedman 1984

7 27

Random coding arguments in information theory

I Random vector X is supported on the standard basis vectors

e1 middot middot middot en

I Random vector Z = ΘX is supported on columns of matrix

Θ1 middot middot middotΘn

This is an IID Gaussian codebook of rate R = 1k log n

8 27

Source coding with random codebook

I Optimal distortion for Gaussian source N (0 σ2) given bydistortion rate function

D(R) = σ2 exp(minus2R)

I Quadratic Wasserstein distance between Z | Θ andW sim N (0 σ2Ik) corresponds to distortion of codebook

W 22

(PZ|Θ PW

)ge E

[miniisin[n]Θi minusW2 | Θ

]optimal coupling

ge D(R) optimality of DRF

= k σ2 nminus2k R = 1

k log n

I Inequalities are tight up to o(k) for large (k n)

9 27

Channel coding with random codebook

I Gaussian channel

Yi = Zi +radictNi 1 le i le k

I Expected relative entropy between Z | Θ and GY correspondsto gap between channel capacity C = 1

2 log(1 + snr) andmutual information

[D(PY |Θ

∥∥GY )] = k C minus I(XY |Θ)

= k(C minusR

)+

+ o(k)

10 27

Random linear estimation compressed sensing CDMA

I X is unknown signal with second moment γ = E[

1nX

2]

I Θ is measurement matrix with sampling rate δ = kn

I Noisy measurements

Y = Z +radictN Z = ΘX

I Mutual information satisfies

I(XY |Θ) =k

2log(

1 +γ

t

)minus EΘ

[D(PY |Θ

∥∥GY )]Hence compressed sensing is optimal in the sense of mutualinformation if and only if CCLT holds

11 27

Fundamental limits of random linear estimation

I Guo amp Verdu 2005 derive formulas for mutual information andMMSE using heuristic replica method from statistical physics

I Rigorous results for special cases Verdu amp Shamai 1999 Tse amp

Hanly 1999 Montanari amp Tse 2006 Korada amp Macris 2010 Bayati

amp Montanari 20011 R amp Gastpar 2012 Wu amp Verdu 2012

Krzakala et al 2012 Donoho et al 2013 Huleihel amp Merhav 2017

I R amp Pfister 2016 prove that replica symmetric formulas arecorrect Proof focuses on new measurements

Ym+1 = ΘmX +radictNm+1 Θm isin R1timesn X sim PX|YmΘm

I Related results obtain independently by Barbier et al 2016

12 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions

I X has finite second moment E[

1nX

2]

= 1

I Θ has IID Gaussian entries N (0 1n)

Define

αr(X) =(E[∣∣ 1nX

2 minus γ∣∣r]) 1

r

deviation of second moment

βr(X) =(E[

1n

∣∣X middotX prime∣∣r]) 1r

measure of correlation

β2(X) =1

n

radicradicradicradic nsumi=1

λ2i (E[XXT ]) le 1radic

nλmax(E

[XXT

])

14 27

Main results

Z = ΘX Θ isin Rktimesn

Theorem (CCLT for Quadratic Wasserstein distance)

The quadratic Wasserstein distance satisfies between theconditional distribution of Z given Θ and the Gaussian distributionwith the same mean and covariance as Z satisfies

[W 2

2 (PZ|Θ GZ)]

le C(k α(X) + k

34 (β1(X))

12 + k(β2(X))

4k+4

)

15 27

Main results

Y = ΘX +radictN Θ isin Rktimesn

Theorem (Conditional CLT for relative entropy)

For all ε isin (0 1) the relative entropy between the conditionaldistribution of Y given Θ and the Gaussian distribution with thesame mean and covariance as Y satisfies

[D(PY |Θ

∥∥GY )]le C

(k log

(1+ 1

t

)α(X)

ε+ k

34 (β1(X))

12 + k

14

(1+ (2+ε)

t

) k4β2(X)

)

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying

α(X) le Cradicn β2(X) le Cradic

n

the quadratic Wasserstein distance satisfies

[W 2

2 (PZ|Θ GZ)]le C

(nminus

14 + k nminus

2k+4

)

I Rate-distortion lower bond If H(X) le C log n then

[W 2

2 (PZ|Θ GZ)]ge k nminus

2k

I Recovers same scaling condition as Meckes 2012 with strongermetric and weaker assumptions

17 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1 Talagrandrsquos transportation inequality

W 22 (PY |Θ GY ) le 2k(1 + t)D

(PY |Θ

∥∥GY )2 Decomposition in terms of CLT and mutual information

[D(PY |Θ

∥∥GY )] = D(PY GY ) + I(Y Θ)

3 Two-moment inequality for mutual information R 2017

I(XY ) le Cλ

radicω(S)V λ

np(Y |X)V 1minusλnq (Y |X)

(q minus p) q lt 1 lt p

Vs(Y |X) =

intys Var(p(y |X)) dy

19 27

Moments of variance

Vs(Y |X) =

intys Var(p(y |X)) dy

= E

[(1

SaminusR

) k2(S2gminusR2

Sa minusR

)s2

minus(

1

Sa

)k2(S2g

Sa

)s2

]

with

Sa = t+1

2nX12 +

1

2nX22

Sg =

radic(t+

1

nX12

)(t+

1

nX22

)R =

1

n〈X1 X2〉

where X1 and X2 are independent copies of X

20 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

21 27

Conclusion

I Conditional CLT has many applicationsI Projection pursuit Measure-theoretic Dvoretzky TheoremI Random coding arguments in information theoryI Phase transitions in compressed sensing R amp Pfister 2016I Approximate inference methods based on message passing

I Main results are bounds on quadratic Wasserstein distanceand relative entropy in terms of moments of distribution

I Proof usesI Talagrandrsquos transportation inequalityI Decomposition into CLT and mutual informationI Two-moment inequality for mutual information R 2017

22 27

References I

G Reeves ldquoConditional central limit theorems for Gaussian projectionsrdquo Dec2016 [Online] Available httpsarxivorgabs161209252

mdashmdash ldquoConditional central limit theorems for Gaussian projectionsrdquo in ProcIEEE Int Symp Inform Theory Aachen Germany Jun 2017

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo2017 [Online] Available httpsarxivorgabs170207302

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo inProc IEEE Int Symp Inform Theory Aachen Germany Jun 2017

P J Huber ldquoProjection pursuitrdquo The Annals of Statistics vol 13 no 2 pp435ndash475 1985

P Diaconis and D Freedman ldquoAsymptotics of graphical projection pursuitrdquoThe Annals of Statistics vol 12 no 3 pp 793ndash815 1984

R Vershynin ldquoEstimation in high dimensions A geometric perspectiverdquoDecember 2 2014 [Online] Availablehttpwww-personalumichedusimromanvpapersestimation-tutorialpdf

V N Sudakov ldquoTypical distributions of linear functionals in finite-dimensionalspaces of high dimensionrdquo Soviet Math Doklady vol 16 no 6 pp 1578ndash15821978

23 27

References II

P Hall and K-C Li ldquoOn almost linearity of low dimensional projections fromhigh dimensional datardquo The Annals of Statistics vol 21 no 2 pp 867ndash8891993

H von Weizsacker ldquoSudakovrsquos typical marginals random linear functionals anda conditional central limit theoremrdquo Probability Theory and Related Fields vol107 no 3 pp 313ndash324 1997

M Anttila K Ball and I Perissinaki ldquoThe central limit problem for convexbodiesrdquo Transactions of the American Mathematical Society vol 355 no 12pp 4723ndash4735 2003

S G Bobkov ldquoOn concentration of distributions of random weighted sumsrdquoThe Annals of Probability vol 31 no 1 pp 195ndash215 2003

A Naor and D Romik ldquoProjecting the surface measure of the sphere of `np rdquoAnnales de lrsquoInstitut Henri Poincare (B) Probability and Statistics vol 39 no 2pp 241ndash246 2003

B Klartag ldquoA central limit theorem for convex setsrdquo Inventiones mathematicaevol 168 no 1 pp 91ndash131 Apr 2007

E Meckes ldquoApproximation of projections of random vectorsrdquo Journal ofTheoretical Probability vol 25 no 2 pp 333ndash352 2010

24 27

References III

mdashmdash ldquoProjections of probability distributions A measure-theoretic Dvoretzkytheoremrdquo in Geometric Aspects of Functional Analysis ser Lecture Notes inMathematics Springer 2012 vol 2050 pp 317ndash326

L Dumbgen and P D Conte-Zerial ldquoOn low-dimensional projections ofhigh-dimensional distributionsrdquo in From Probability to Statistics and BackHigh-Dimensional Models and Processes ndash A Festschrift in Honor of Jon AWellner Institute of Mathematical Statistics Collections 2013 vol 9 pp91ndash104

H Leeb ldquoOn the conditional distributions of low-dimensional projections fromhigh-dimensional datardquo The Annals of Statistics vol 41 no 2 pp 464ndash4832013

S Verdu and S Shamai ldquoSpectral efficiency of cdma with random spreadingrdquoIEEE Trans Inform Theory vol 45 pp 622ndash640 Mar 1999

D N C Tse and S Hanly ldquoLinear multiuser receivers Effective interferenceeffective bandwith and user capacityrdquo IEEE Trans Inform Theory vol 45 pp641ndash657 Mar 1999

A Montanari and D Tse ldquoAnalysis of belief propagation for non-linear problemsThe example of CDMA (or How to prove Tanakarsquos formula)rdquo in Proc IEEEInform Theory Workshop Punta del Este Uruguay 2006 pp 160ndash164

25 27

References IV

S B Korada and N Macris ldquoTight bounds on the capicty of binary inputrandom CDMA systemsrdquo IEEE Trans Inform Theory vol 56 no 11 pp5590ndash5613 Nov 2010

M Bayati and A Montanari ldquoThe dynamics of message passing on densegraphs with applications to compressed sensingrdquo IEEE Trans Inform Theoryvol 57 no 2 pp 764ndash785 Feb 2011

G Reeves and M Gastpar ldquoThe sampling rate-distortion tradeoff for sparsitypattern recovery in compressed sensingrdquo IEEE Trans Inform Theory vol 58no 5 pp 3065ndash3092 May 2012

Y Wu and S Verdu ldquoOptimal phase transitions in compressed sensingrdquo IEEETrans Inform Theory vol 58 no 10 pp 6241 ndash 6263 Oct 2012

F Krzakala M Mezard F Sausset Y F Sun and L ZdeborovaldquoStatistical-physics-based reconstruction in compressed sensingrdquo PhysicalReview X vol 2 no 2 May 2012

D L Donoho A Javanmard and A Montanari ldquoInformation-theoreticallyoptimal compressed sensing via spatial coupling and approximate messagepassingrdquo IEEE Trans Inform Theory vol 59 no 11 pp 7434ndash7464 Jul 2013

W Huleihel and N Merhav ldquoAsymptotic MMSE analysis under sparserepresentation modelingrdquo Signal Processing vol 131 pp 320ndash332 2017

26 27

References V

G Reeves and H D Pfister ldquoThe replica-symmetric prediction for compressedsensing with Gaussian matrices is exactrdquo Jul 2016 [Online] Availablehttpsarxivorgabs160702524

mdashmdash ldquoThe replica-symmetric prediction for compressed sensing with Gaussianmatrices is exactrdquo in Proc IEEE Int Symp Inform Theory Barcelona SpainJul 2016 pp 665 ndash 669

J Barbier M Dia N Macris and F Krzakala ldquoThe mutual information inrandom linear estimationrdquo in Proc Annual Allerton Conf on Commun Controland Comp Monticello IL 2016 [Online] Availablehttparxivorgabs160702335

27 27

  • Motivation
    • Statistics
    • Information Theory
    • Random linear estimation
      • Results
      • Proof outline
      • Conclusion
Page 5: Conditional Central Limit Theorems for Gaussian Projections

Illustration Projections of cubes

Figure Cubes of dimensions 3ndash9 [Buja et al 1996]

5 27

Measure-theoretic Dvoretzky Theorem

Theorem (Dvoretzky - Milman)

Random d-dimensional sections of a symmetric convex body in Rnare approximately spherical if d = O(log n)

Figure Random section of high-dimensional convex set [Vershynin 2014]

6 27

Related work

I CLT for low-dimensional projections Maxwell 1875 Borel 1914

Poincare 1912 Diaconis amp Freedman 1984 Johnson 2003

I Conditional CLT Sudakov 1978 Diaconis amp Freedman 1984 Hall

amp Li 1993 Weizacker 1997 Anttila et al 2003 Bobkov 2003 Naor

amp Romik 2003 Dasgupta et al 2006 Klartag 2007 Meckes 2010

Meckes 2012 Dumbgen amp Conte-Zerial 2013 Leeb 2013

I Meckes 2012 provides explicit convergence rates in terms ofbounded-Lipschitz metric Under second moment constraintson X she shows that

k le (2minus ε) log n

log log n=rArr dBL(PZ|Θ GZ)rarr 0

I Dumbgen amp Conte-Zerial 2013 prove necessary conditionsmatching sufficient conditions of Diaconis amp Freedman 1984

7 27

Random coding arguments in information theory

I Random vector X is supported on the standard basis vectors

e1 middot middot middot en

I Random vector Z = ΘX is supported on columns of matrix

Θ1 middot middot middotΘn

This is an IID Gaussian codebook of rate R = 1k log n

8 27

Source coding with random codebook

I Optimal distortion for Gaussian source N (0 σ2) given bydistortion rate function

D(R) = σ2 exp(minus2R)

I Quadratic Wasserstein distance between Z | Θ andW sim N (0 σ2Ik) corresponds to distortion of codebook

W 22

(PZ|Θ PW

)ge E

[miniisin[n]Θi minusW2 | Θ

]optimal coupling

ge D(R) optimality of DRF

= k σ2 nminus2k R = 1

k log n

I Inequalities are tight up to o(k) for large (k n)

9 27

Channel coding with random codebook

I Gaussian channel

Yi = Zi +radictNi 1 le i le k

I Expected relative entropy between Z | Θ and GY correspondsto gap between channel capacity C = 1

2 log(1 + snr) andmutual information

[D(PY |Θ

∥∥GY )] = k C minus I(XY |Θ)

= k(C minusR

)+

+ o(k)

10 27

Random linear estimation compressed sensing CDMA

I X is unknown signal with second moment γ = E[

1nX

2]

I Θ is measurement matrix with sampling rate δ = kn

I Noisy measurements

Y = Z +radictN Z = ΘX

I Mutual information satisfies

I(XY |Θ) =k

2log(

1 +γ

t

)minus EΘ

[D(PY |Θ

∥∥GY )]Hence compressed sensing is optimal in the sense of mutualinformation if and only if CCLT holds

11 27

Fundamental limits of random linear estimation

I Guo amp Verdu 2005 derive formulas for mutual information andMMSE using heuristic replica method from statistical physics

I Rigorous results for special cases Verdu amp Shamai 1999 Tse amp

Hanly 1999 Montanari amp Tse 2006 Korada amp Macris 2010 Bayati

amp Montanari 20011 R amp Gastpar 2012 Wu amp Verdu 2012

Krzakala et al 2012 Donoho et al 2013 Huleihel amp Merhav 2017

I R amp Pfister 2016 prove that replica symmetric formulas arecorrect Proof focuses on new measurements

Ym+1 = ΘmX +radictNm+1 Θm isin R1timesn X sim PX|YmΘm

I Related results obtain independently by Barbier et al 2016

12 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions

I X has finite second moment E[

1nX

2]

= 1

I Θ has IID Gaussian entries N (0 1n)

Define

αr(X) =(E[∣∣ 1nX

2 minus γ∣∣r]) 1

r

deviation of second moment

βr(X) =(E[

1n

∣∣X middotX prime∣∣r]) 1r

measure of correlation

β2(X) =1

n

radicradicradicradic nsumi=1

λ2i (E[XXT ]) le 1radic

nλmax(E

[XXT

])

14 27

Main results

Z = ΘX Θ isin Rktimesn

Theorem (CCLT for Quadratic Wasserstein distance)

The quadratic Wasserstein distance satisfies between theconditional distribution of Z given Θ and the Gaussian distributionwith the same mean and covariance as Z satisfies

[W 2

2 (PZ|Θ GZ)]

le C(k α(X) + k

34 (β1(X))

12 + k(β2(X))

4k+4

)

15 27

Main results

Y = ΘX +radictN Θ isin Rktimesn

Theorem (Conditional CLT for relative entropy)

For all ε isin (0 1) the relative entropy between the conditionaldistribution of Y given Θ and the Gaussian distribution with thesame mean and covariance as Y satisfies

[D(PY |Θ

∥∥GY )]le C

(k log

(1+ 1

t

)α(X)

ε+ k

34 (β1(X))

12 + k

14

(1+ (2+ε)

t

) k4β2(X)

)

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying

α(X) le Cradicn β2(X) le Cradic

n

the quadratic Wasserstein distance satisfies

[W 2

2 (PZ|Θ GZ)]le C

(nminus

14 + k nminus

2k+4

)

I Rate-distortion lower bond If H(X) le C log n then

[W 2

2 (PZ|Θ GZ)]ge k nminus

2k

I Recovers same scaling condition as Meckes 2012 with strongermetric and weaker assumptions

17 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1 Talagrandrsquos transportation inequality

W 22 (PY |Θ GY ) le 2k(1 + t)D

(PY |Θ

∥∥GY )2 Decomposition in terms of CLT and mutual information

[D(PY |Θ

∥∥GY )] = D(PY GY ) + I(Y Θ)

3 Two-moment inequality for mutual information R 2017

I(XY ) le Cλ

radicω(S)V λ

np(Y |X)V 1minusλnq (Y |X)

(q minus p) q lt 1 lt p

Vs(Y |X) =

intys Var(p(y |X)) dy

19 27

Moments of variance

Vs(Y |X) =

intys Var(p(y |X)) dy

= E

[(1

SaminusR

) k2(S2gminusR2

Sa minusR

)s2

minus(

1

Sa

)k2(S2g

Sa

)s2

]

with

Sa = t+1

2nX12 +

1

2nX22

Sg =

radic(t+

1

nX12

)(t+

1

nX22

)R =

1

n〈X1 X2〉

where X1 and X2 are independent copies of X

20 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

21 27

Conclusion

• Conditional CLT has many applications:
  – Projection pursuit; measure-theoretic Dvoretzky theorem
  – Random coding arguments in information theory
  – Phase transitions in compressed sensing (R & Pfister 2016)
  – Approximate inference methods based on message passing

• Main results are bounds on the quadratic Wasserstein distance and the relative entropy in terms of moments of the distribution.

• Proof uses:
  – Talagrand's transportation inequality
  – Decomposition into CLT and mutual information
  – Two-moment inequality for mutual information (R 2017)

References I

G. Reeves, "Conditional central limit theorems for Gaussian projections," Dec. 2016. [Online]. Available: https://arxiv.org/abs/1612.09252

——, "Conditional central limit theorems for Gaussian projections," in Proc. IEEE Int. Symp. Inform. Theory, Aachen, Germany, Jun. 2017.

——, "Two-moment inequalities for Rényi entropy and mutual information," 2017. [Online]. Available: https://arxiv.org/abs/1702.07302

——, "Two-moment inequalities for Rényi entropy and mutual information," in Proc. IEEE Int. Symp. Inform. Theory, Aachen, Germany, Jun. 2017.

P. J. Huber, "Projection pursuit," The Annals of Statistics, vol. 13, no. 2, pp. 435–475, 1985.

P. Diaconis and D. Freedman, "Asymptotics of graphical projection pursuit," The Annals of Statistics, vol. 12, no. 3, pp. 793–815, 1984.

R. Vershynin, "Estimation in high dimensions: A geometric perspective," December 2, 2014. [Online]. Available: http://www-personal.umich.edu/~romanv/papers/estimation-tutorial.pdf

V. N. Sudakov, "Typical distributions of linear functionals in finite-dimensional spaces of high dimension," Soviet Math. Doklady, vol. 16, no. 6, pp. 1578–1582, 1978.

References II

P. Hall and K.-C. Li, "On almost linearity of low dimensional projections from high dimensional data," The Annals of Statistics, vol. 21, no. 2, pp. 867–889, 1993.

H. von Weizsäcker, "Sudakov's typical marginals, random linear functionals and a conditional central limit theorem," Probability Theory and Related Fields, vol. 107, no. 3, pp. 313–324, 1997.

M. Anttila, K. Ball, and I. Perissinaki, "The central limit problem for convex bodies," Transactions of the American Mathematical Society, vol. 355, no. 12, pp. 4723–4735, 2003.

S. G. Bobkov, "On concentration of distributions of random weighted sums," The Annals of Probability, vol. 31, no. 1, pp. 195–215, 2003.

A. Naor and D. Romik, "Projecting the surface measure of the sphere of ℓ_p^n," Annales de l'Institut Henri Poincaré (B) Probability and Statistics, vol. 39, no. 2, pp. 241–246, 2003.

B. Klartag, "A central limit theorem for convex sets," Inventiones mathematicae, vol. 168, no. 1, pp. 91–131, Apr. 2007.

E. Meckes, "Approximation of projections of random vectors," Journal of Theoretical Probability, vol. 25, no. 2, pp. 333–352, 2010.

References III

——, "Projections of probability distributions: A measure-theoretic Dvoretzky theorem," in Geometric Aspects of Functional Analysis, ser. Lecture Notes in Mathematics. Springer, 2012, vol. 2050, pp. 317–326.

L. Dümbgen and P. D. Conte-Zerial, "On low-dimensional projections of high-dimensional distributions," in From Probability to Statistics and Back: High-Dimensional Models and Processes – A Festschrift in Honor of Jon A. Wellner. Institute of Mathematical Statistics Collections, 2013, vol. 9, pp. 91–104.

H. Leeb, "On the conditional distributions of low-dimensional projections from high-dimensional data," The Annals of Statistics, vol. 41, no. 2, pp. 464–483, 2013.

S. Verdú and S. Shamai, "Spectral efficiency of CDMA with random spreading," IEEE Trans. Inform. Theory, vol. 45, pp. 622–640, Mar. 1999.

D. N. C. Tse and S. Hanly, "Linear multiuser receivers: Effective interference, effective bandwidth and user capacity," IEEE Trans. Inform. Theory, vol. 45, pp. 641–657, Mar. 1999.

A. Montanari and D. Tse, "Analysis of belief propagation for non-linear problems: The example of CDMA (or: How to prove Tanaka's formula)," in Proc. IEEE Inform. Theory Workshop, Punta del Este, Uruguay, 2006, pp. 160–164.

References IV

S. B. Korada and N. Macris, "Tight bounds on the capacity of binary input random CDMA systems," IEEE Trans. Inform. Theory, vol. 56, no. 11, pp. 5590–5613, Nov. 2010.

M. Bayati and A. Montanari, "The dynamics of message passing on dense graphs, with applications to compressed sensing," IEEE Trans. Inform. Theory, vol. 57, no. 2, pp. 764–785, Feb. 2011.

G. Reeves and M. Gastpar, "The sampling rate-distortion tradeoff for sparsity pattern recovery in compressed sensing," IEEE Trans. Inform. Theory, vol. 58, no. 5, pp. 3065–3092, May 2012.

Y. Wu and S. Verdú, "Optimal phase transitions in compressed sensing," IEEE Trans. Inform. Theory, vol. 58, no. 10, pp. 6241–6263, Oct. 2012.

F. Krzakala, M. Mézard, F. Sausset, Y. F. Sun, and L. Zdeborová, "Statistical-physics-based reconstruction in compressed sensing," Physical Review X, vol. 2, no. 2, May 2012.

D. L. Donoho, A. Javanmard, and A. Montanari, "Information-theoretically optimal compressed sensing via spatial coupling and approximate message passing," IEEE Trans. Inform. Theory, vol. 59, no. 11, pp. 7434–7464, Jul. 2013.

W. Huleihel and N. Merhav, "Asymptotic MMSE analysis under sparse representation modeling," Signal Processing, vol. 131, pp. 320–332, 2017.

References V

G. Reeves and H. D. Pfister, "The replica-symmetric prediction for compressed sensing with Gaussian matrices is exact," Jul. 2016. [Online]. Available: https://arxiv.org/abs/1607.02524

——, "The replica-symmetric prediction for compressed sensing with Gaussian matrices is exact," in Proc. IEEE Int. Symp. Inform. Theory, Barcelona, Spain, Jul. 2016, pp. 665–669.

J. Barbier, M. Dia, N. Macris, and F. Krzakala, "The mutual information in random linear estimation," in Proc. Annual Allerton Conf. on Commun., Control, and Comp., Monticello, IL, 2016. [Online]. Available: http://arxiv.org/abs/1607.02335

  • Motivation
    • Statistics
    • Information Theory
    • Random linear estimation
      • Results
      • Proof outline
      • Conclusion
Page 6: Conditional Central Limit Theorems for Gaussian Projections

Measure-theoretic Dvoretzky Theorem

Theorem (Dvoretzky - Milman)

Random d-dimensional sections of a symmetric convex body in Rnare approximately spherical if d = O(log n)

Figure Random section of high-dimensional convex set [Vershynin 2014]

6 27

Related work

I CLT for low-dimensional projections Maxwell 1875 Borel 1914

Poincare 1912 Diaconis amp Freedman 1984 Johnson 2003

I Conditional CLT Sudakov 1978 Diaconis amp Freedman 1984 Hall

amp Li 1993 Weizacker 1997 Anttila et al 2003 Bobkov 2003 Naor

amp Romik 2003 Dasgupta et al 2006 Klartag 2007 Meckes 2010

Meckes 2012 Dumbgen amp Conte-Zerial 2013 Leeb 2013

I Meckes 2012 provides explicit convergence rates in terms ofbounded-Lipschitz metric Under second moment constraintson X she shows that

k le (2minus ε) log n

log log n=rArr dBL(PZ|Θ GZ)rarr 0

I Dumbgen amp Conte-Zerial 2013 prove necessary conditionsmatching sufficient conditions of Diaconis amp Freedman 1984

7 27

Random coding arguments in information theory

I Random vector X is supported on the standard basis vectors

e1 middot middot middot en

I Random vector Z = ΘX is supported on columns of matrix

Θ1 middot middot middotΘn

This is an IID Gaussian codebook of rate R = 1k log n

8 27

Source coding with random codebook

I Optimal distortion for Gaussian source N (0 σ2) given bydistortion rate function

D(R) = σ2 exp(minus2R)

I Quadratic Wasserstein distance between Z | Θ andW sim N (0 σ2Ik) corresponds to distortion of codebook

W 22

(PZ|Θ PW

)ge E

[miniisin[n]Θi minusW2 | Θ

]optimal coupling

ge D(R) optimality of DRF

= k σ2 nminus2k R = 1

k log n

I Inequalities are tight up to o(k) for large (k n)

9 27

Channel coding with random codebook

I Gaussian channel

Yi = Zi +radictNi 1 le i le k

I Expected relative entropy between Z | Θ and GY correspondsto gap between channel capacity C = 1

2 log(1 + snr) andmutual information

[D(PY |Θ

∥∥GY )] = k C minus I(XY |Θ)

= k(C minusR

)+

+ o(k)

10 27

Random linear estimation compressed sensing CDMA

I X is unknown signal with second moment γ = E[

1nX

2]

I Θ is measurement matrix with sampling rate δ = kn

I Noisy measurements

Y = Z +radictN Z = ΘX

I Mutual information satisfies

I(XY |Θ) =k

2log(

1 +γ

t

)minus EΘ

[D(PY |Θ

∥∥GY )]Hence compressed sensing is optimal in the sense of mutualinformation if and only if CCLT holds

11 27

Fundamental limits of random linear estimation

I Guo amp Verdu 2005 derive formulas for mutual information andMMSE using heuristic replica method from statistical physics

I Rigorous results for special cases Verdu amp Shamai 1999 Tse amp

Hanly 1999 Montanari amp Tse 2006 Korada amp Macris 2010 Bayati

amp Montanari 20011 R amp Gastpar 2012 Wu amp Verdu 2012

Krzakala et al 2012 Donoho et al 2013 Huleihel amp Merhav 2017

I R amp Pfister 2016 prove that replica symmetric formulas arecorrect Proof focuses on new measurements

Ym+1 = ΘmX +radictNm+1 Θm isin R1timesn X sim PX|YmΘm

I Related results obtain independently by Barbier et al 2016

12 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions

I X has finite second moment E[

1nX

2]

= 1

I Θ has IID Gaussian entries N (0 1n)

Define

αr(X) =(E[∣∣ 1nX

2 minus γ∣∣r]) 1

r

deviation of second moment

βr(X) =(E[

1n

∣∣X middotX prime∣∣r]) 1r

measure of correlation

β2(X) =1

n

radicradicradicradic nsumi=1

λ2i (E[XXT ]) le 1radic

nλmax(E

[XXT

])

14 27

Main results

Z = ΘX Θ isin Rktimesn

Theorem (CCLT for Quadratic Wasserstein distance)

The quadratic Wasserstein distance satisfies between theconditional distribution of Z given Θ and the Gaussian distributionwith the same mean and covariance as Z satisfies

[W 2

2 (PZ|Θ GZ)]

le C(k α(X) + k

34 (β1(X))

12 + k(β2(X))

4k+4

)

15 27

Main results

Y = ΘX +radictN Θ isin Rktimesn

Theorem (Conditional CLT for relative entropy)

For all ε isin (0 1) the relative entropy between the conditionaldistribution of Y given Θ and the Gaussian distribution with thesame mean and covariance as Y satisfies

[D(PY |Θ

∥∥GY )]le C

(k log

(1+ 1

t

)α(X)

ε+ k

34 (β1(X))

12 + k

14

(1+ (2+ε)

t

) k4β2(X)

)

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying

α(X) le Cradicn β2(X) le Cradic

n

the quadratic Wasserstein distance satisfies

[W 2

2 (PZ|Θ GZ)]le C

(nminus

14 + k nminus

2k+4

)

I Rate-distortion lower bond If H(X) le C log n then

[W 2

2 (PZ|Θ GZ)]ge k nminus

2k

I Recovers same scaling condition as Meckes 2012 with strongermetric and weaker assumptions

17 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1 Talagrandrsquos transportation inequality

W 22 (PY |Θ GY ) le 2k(1 + t)D

(PY |Θ

∥∥GY )2 Decomposition in terms of CLT and mutual information

[D(PY |Θ

∥∥GY )] = D(PY GY ) + I(Y Θ)

3 Two-moment inequality for mutual information R 2017

I(XY ) le Cλ

radicω(S)V λ

np(Y |X)V 1minusλnq (Y |X)

(q minus p) q lt 1 lt p

Vs(Y |X) =

intys Var(p(y |X)) dy

19 27

Moments of variance

Vs(Y |X) =

intys Var(p(y |X)) dy

= E

[(1

SaminusR

) k2(S2gminusR2

Sa minusR

)s2

minus(

1

Sa

)k2(S2g

Sa

)s2

]

with

Sa = t+1

2nX12 +

1

2nX22

Sg =

radic(t+

1

nX12

)(t+

1

nX22

)R =

1

n〈X1 X2〉

where X1 and X2 are independent copies of X

20 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

21 27

Conclusion

I Conditional CLT has many applicationsI Projection pursuit Measure-theoretic Dvoretzky TheoremI Random coding arguments in information theoryI Phase transitions in compressed sensing R amp Pfister 2016I Approximate inference methods based on message passing

I Main results are bounds on quadratic Wasserstein distanceand relative entropy in terms of moments of distribution

I Proof usesI Talagrandrsquos transportation inequalityI Decomposition into CLT and mutual informationI Two-moment inequality for mutual information R 2017

22 27

References I

G Reeves ldquoConditional central limit theorems for Gaussian projectionsrdquo Dec2016 [Online] Available httpsarxivorgabs161209252

mdashmdash ldquoConditional central limit theorems for Gaussian projectionsrdquo in ProcIEEE Int Symp Inform Theory Aachen Germany Jun 2017

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo2017 [Online] Available httpsarxivorgabs170207302

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo inProc IEEE Int Symp Inform Theory Aachen Germany Jun 2017

P J Huber ldquoProjection pursuitrdquo The Annals of Statistics vol 13 no 2 pp435ndash475 1985

P Diaconis and D Freedman ldquoAsymptotics of graphical projection pursuitrdquoThe Annals of Statistics vol 12 no 3 pp 793ndash815 1984

R Vershynin ldquoEstimation in high dimensions A geometric perspectiverdquoDecember 2 2014 [Online] Availablehttpwww-personalumichedusimromanvpapersestimation-tutorialpdf

V N Sudakov ldquoTypical distributions of linear functionals in finite-dimensionalspaces of high dimensionrdquo Soviet Math Doklady vol 16 no 6 pp 1578ndash15821978

23 27

References II

P Hall and K-C Li ldquoOn almost linearity of low dimensional projections fromhigh dimensional datardquo The Annals of Statistics vol 21 no 2 pp 867ndash8891993

H von Weizsacker ldquoSudakovrsquos typical marginals random linear functionals anda conditional central limit theoremrdquo Probability Theory and Related Fields vol107 no 3 pp 313ndash324 1997

M Anttila K Ball and I Perissinaki ldquoThe central limit problem for convexbodiesrdquo Transactions of the American Mathematical Society vol 355 no 12pp 4723ndash4735 2003

S G Bobkov ldquoOn concentration of distributions of random weighted sumsrdquoThe Annals of Probability vol 31 no 1 pp 195ndash215 2003

A Naor and D Romik ldquoProjecting the surface measure of the sphere of `np rdquoAnnales de lrsquoInstitut Henri Poincare (B) Probability and Statistics vol 39 no 2pp 241ndash246 2003

B Klartag ldquoA central limit theorem for convex setsrdquo Inventiones mathematicaevol 168 no 1 pp 91ndash131 Apr 2007

E Meckes ldquoApproximation of projections of random vectorsrdquo Journal ofTheoretical Probability vol 25 no 2 pp 333ndash352 2010

24 27

References III

mdashmdash ldquoProjections of probability distributions A measure-theoretic Dvoretzkytheoremrdquo in Geometric Aspects of Functional Analysis ser Lecture Notes inMathematics Springer 2012 vol 2050 pp 317ndash326

L Dumbgen and P D Conte-Zerial ldquoOn low-dimensional projections ofhigh-dimensional distributionsrdquo in From Probability to Statistics and BackHigh-Dimensional Models and Processes ndash A Festschrift in Honor of Jon AWellner Institute of Mathematical Statistics Collections 2013 vol 9 pp91ndash104

H Leeb ldquoOn the conditional distributions of low-dimensional projections fromhigh-dimensional datardquo The Annals of Statistics vol 41 no 2 pp 464ndash4832013

S Verdu and S Shamai ldquoSpectral efficiency of cdma with random spreadingrdquoIEEE Trans Inform Theory vol 45 pp 622ndash640 Mar 1999

D N C Tse and S Hanly ldquoLinear multiuser receivers Effective interferenceeffective bandwith and user capacityrdquo IEEE Trans Inform Theory vol 45 pp641ndash657 Mar 1999

A Montanari and D Tse ldquoAnalysis of belief propagation for non-linear problemsThe example of CDMA (or How to prove Tanakarsquos formula)rdquo in Proc IEEEInform Theory Workshop Punta del Este Uruguay 2006 pp 160ndash164

25 27

References IV

S B Korada and N Macris ldquoTight bounds on the capicty of binary inputrandom CDMA systemsrdquo IEEE Trans Inform Theory vol 56 no 11 pp5590ndash5613 Nov 2010

M Bayati and A Montanari ldquoThe dynamics of message passing on densegraphs with applications to compressed sensingrdquo IEEE Trans Inform Theoryvol 57 no 2 pp 764ndash785 Feb 2011

G Reeves and M Gastpar ldquoThe sampling rate-distortion tradeoff for sparsitypattern recovery in compressed sensingrdquo IEEE Trans Inform Theory vol 58no 5 pp 3065ndash3092 May 2012

Y Wu and S Verdu ldquoOptimal phase transitions in compressed sensingrdquo IEEETrans Inform Theory vol 58 no 10 pp 6241 ndash 6263 Oct 2012

F Krzakala M Mezard F Sausset Y F Sun and L ZdeborovaldquoStatistical-physics-based reconstruction in compressed sensingrdquo PhysicalReview X vol 2 no 2 May 2012

D L Donoho A Javanmard and A Montanari ldquoInformation-theoreticallyoptimal compressed sensing via spatial coupling and approximate messagepassingrdquo IEEE Trans Inform Theory vol 59 no 11 pp 7434ndash7464 Jul 2013

W Huleihel and N Merhav ldquoAsymptotic MMSE analysis under sparserepresentation modelingrdquo Signal Processing vol 131 pp 320ndash332 2017

26 27

References V

G Reeves and H D Pfister ldquoThe replica-symmetric prediction for compressedsensing with Gaussian matrices is exactrdquo Jul 2016 [Online] Availablehttpsarxivorgabs160702524

mdashmdash ldquoThe replica-symmetric prediction for compressed sensing with Gaussianmatrices is exactrdquo in Proc IEEE Int Symp Inform Theory Barcelona SpainJul 2016 pp 665 ndash 669

J Barbier M Dia N Macris and F Krzakala ldquoThe mutual information inrandom linear estimationrdquo in Proc Annual Allerton Conf on Commun Controland Comp Monticello IL 2016 [Online] Availablehttparxivorgabs160702335

27 27

  • Motivation
    • Statistics
    • Information Theory
    • Random linear estimation
      • Results
      • Proof outline
      • Conclusion
Page 7: Conditional Central Limit Theorems for Gaussian Projections

Related work

I CLT for low-dimensional projections Maxwell 1875 Borel 1914

Poincare 1912 Diaconis amp Freedman 1984 Johnson 2003

I Conditional CLT Sudakov 1978 Diaconis amp Freedman 1984 Hall

amp Li 1993 Weizacker 1997 Anttila et al 2003 Bobkov 2003 Naor

amp Romik 2003 Dasgupta et al 2006 Klartag 2007 Meckes 2010

Meckes 2012 Dumbgen amp Conte-Zerial 2013 Leeb 2013

I Meckes 2012 provides explicit convergence rates in terms ofbounded-Lipschitz metric Under second moment constraintson X she shows that

k le (2minus ε) log n

log log n=rArr dBL(PZ|Θ GZ)rarr 0

I Dumbgen amp Conte-Zerial 2013 prove necessary conditionsmatching sufficient conditions of Diaconis amp Freedman 1984

7 27

Random coding arguments in information theory

I Random vector X is supported on the standard basis vectors

e1 middot middot middot en

I Random vector Z = ΘX is supported on columns of matrix

Θ1 middot middot middotΘn

This is an IID Gaussian codebook of rate R = 1k log n

8 27

Source coding with random codebook

I Optimal distortion for Gaussian source N (0 σ2) given bydistortion rate function

D(R) = σ2 exp(minus2R)

I Quadratic Wasserstein distance between Z | Θ andW sim N (0 σ2Ik) corresponds to distortion of codebook

W 22

(PZ|Θ PW

)ge E

[miniisin[n]Θi minusW2 | Θ

]optimal coupling

ge D(R) optimality of DRF

= k σ2 nminus2k R = 1

k log n

I Inequalities are tight up to o(k) for large (k n)

9 27

Channel coding with random codebook

I Gaussian channel

Yi = Zi +radictNi 1 le i le k

I Expected relative entropy between Z | Θ and GY correspondsto gap between channel capacity C = 1

2 log(1 + snr) andmutual information

[D(PY |Θ

∥∥GY )] = k C minus I(XY |Θ)

= k(C minusR

)+

+ o(k)

10 27

Random linear estimation compressed sensing CDMA

I X is unknown signal with second moment γ = E[

1nX

2]

I Θ is measurement matrix with sampling rate δ = kn

I Noisy measurements

Y = Z +radictN Z = ΘX

I Mutual information satisfies

I(XY |Θ) =k

2log(

1 +γ

t

)minus EΘ

[D(PY |Θ

∥∥GY )]Hence compressed sensing is optimal in the sense of mutualinformation if and only if CCLT holds

11 27

Fundamental limits of random linear estimation

I Guo amp Verdu 2005 derive formulas for mutual information andMMSE using heuristic replica method from statistical physics

I Rigorous results for special cases Verdu amp Shamai 1999 Tse amp

Hanly 1999 Montanari amp Tse 2006 Korada amp Macris 2010 Bayati

amp Montanari 20011 R amp Gastpar 2012 Wu amp Verdu 2012

Krzakala et al 2012 Donoho et al 2013 Huleihel amp Merhav 2017

I R amp Pfister 2016 prove that replica symmetric formulas arecorrect Proof focuses on new measurements

Ym+1 = ΘmX +radictNm+1 Θm isin R1timesn X sim PX|YmΘm

I Related results obtain independently by Barbier et al 2016

12 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions

I X has finite second moment E[

1nX

2]

= 1

I Θ has IID Gaussian entries N (0 1n)

Define

αr(X) =(E[∣∣ 1nX

2 minus γ∣∣r]) 1

r

deviation of second moment

βr(X) =(E[

1n

∣∣X middotX prime∣∣r]) 1r

measure of correlation

β2(X) =1

n

radicradicradicradic nsumi=1

λ2i (E[XXT ]) le 1radic

nλmax(E

[XXT

])

14 27

Main results

Z = ΘX Θ isin Rktimesn

Theorem (CCLT for Quadratic Wasserstein distance)

The quadratic Wasserstein distance satisfies between theconditional distribution of Z given Θ and the Gaussian distributionwith the same mean and covariance as Z satisfies

[W 2

2 (PZ|Θ GZ)]

le C(k α(X) + k

34 (β1(X))

12 + k(β2(X))

4k+4

)

15 27

Main results

Y = ΘX +radictN Θ isin Rktimesn

Theorem (Conditional CLT for relative entropy)

For all ε isin (0 1) the relative entropy between the conditionaldistribution of Y given Θ and the Gaussian distribution with thesame mean and covariance as Y satisfies

[D(PY |Θ

∥∥GY )]le C

(k log

(1+ 1

t

)α(X)

ε+ k

34 (β1(X))

12 + k

14

(1+ (2+ε)

t

) k4β2(X)

)

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying

α(X) le Cradicn β2(X) le Cradic

n

the quadratic Wasserstein distance satisfies

[W 2

2 (PZ|Θ GZ)]le C

(nminus

14 + k nminus

2k+4

)

I Rate-distortion lower bond If H(X) le C log n then

[W 2

2 (PZ|Θ GZ)]ge k nminus

2k

I Recovers same scaling condition as Meckes 2012 with strongermetric and weaker assumptions

17 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1 Talagrandrsquos transportation inequality

W 22 (PY |Θ GY ) le 2k(1 + t)D

(PY |Θ

∥∥GY )2 Decomposition in terms of CLT and mutual information

[D(PY |Θ

∥∥GY )] = D(PY GY ) + I(Y Θ)

3 Two-moment inequality for mutual information R 2017

I(XY ) le Cλ

radicω(S)V λ

np(Y |X)V 1minusλnq (Y |X)

(q minus p) q lt 1 lt p

Vs(Y |X) =

intys Var(p(y |X)) dy

19 27

Moments of variance

Vs(Y |X) =

intys Var(p(y |X)) dy

= E

[(1

SaminusR

) k2(S2gminusR2

Sa minusR

)s2

minus(

1

Sa

)k2(S2g

Sa

)s2

]

with

Sa = t+1

2nX12 +

1

2nX22

Sg =

radic(t+

1

nX12

)(t+

1

nX22

)R =

1

n〈X1 X2〉

where X1 and X2 are independent copies of X

20 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

21 27

Conclusion

I Conditional CLT has many applicationsI Projection pursuit Measure-theoretic Dvoretzky TheoremI Random coding arguments in information theoryI Phase transitions in compressed sensing R amp Pfister 2016I Approximate inference methods based on message passing

I Main results are bounds on quadratic Wasserstein distanceand relative entropy in terms of moments of distribution

I Proof usesI Talagrandrsquos transportation inequalityI Decomposition into CLT and mutual informationI Two-moment inequality for mutual information R 2017

22 27

References I

G Reeves ldquoConditional central limit theorems for Gaussian projectionsrdquo Dec2016 [Online] Available httpsarxivorgabs161209252

mdashmdash ldquoConditional central limit theorems for Gaussian projectionsrdquo in ProcIEEE Int Symp Inform Theory Aachen Germany Jun 2017

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo2017 [Online] Available httpsarxivorgabs170207302

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo inProc IEEE Int Symp Inform Theory Aachen Germany Jun 2017

P J Huber ldquoProjection pursuitrdquo The Annals of Statistics vol 13 no 2 pp435ndash475 1985

P Diaconis and D Freedman ldquoAsymptotics of graphical projection pursuitrdquoThe Annals of Statistics vol 12 no 3 pp 793ndash815 1984

R Vershynin ldquoEstimation in high dimensions A geometric perspectiverdquoDecember 2 2014 [Online] Availablehttpwww-personalumichedusimromanvpapersestimation-tutorialpdf

V N Sudakov ldquoTypical distributions of linear functionals in finite-dimensionalspaces of high dimensionrdquo Soviet Math Doklady vol 16 no 6 pp 1578ndash15821978

23 27

References II

P Hall and K-C Li ldquoOn almost linearity of low dimensional projections fromhigh dimensional datardquo The Annals of Statistics vol 21 no 2 pp 867ndash8891993

H von Weizsacker ldquoSudakovrsquos typical marginals random linear functionals anda conditional central limit theoremrdquo Probability Theory and Related Fields vol107 no 3 pp 313ndash324 1997

M Anttila K Ball and I Perissinaki ldquoThe central limit problem for convexbodiesrdquo Transactions of the American Mathematical Society vol 355 no 12pp 4723ndash4735 2003

S G Bobkov ldquoOn concentration of distributions of random weighted sumsrdquoThe Annals of Probability vol 31 no 1 pp 195ndash215 2003

A Naor and D Romik ldquoProjecting the surface measure of the sphere of `np rdquoAnnales de lrsquoInstitut Henri Poincare (B) Probability and Statistics vol 39 no 2pp 241ndash246 2003

B Klartag ldquoA central limit theorem for convex setsrdquo Inventiones mathematicaevol 168 no 1 pp 91ndash131 Apr 2007

E Meckes ldquoApproximation of projections of random vectorsrdquo Journal ofTheoretical Probability vol 25 no 2 pp 333ndash352 2010

24 27

References III

mdashmdash ldquoProjections of probability distributions A measure-theoretic Dvoretzkytheoremrdquo in Geometric Aspects of Functional Analysis ser Lecture Notes inMathematics Springer 2012 vol 2050 pp 317ndash326

L Dumbgen and P D Conte-Zerial ldquoOn low-dimensional projections ofhigh-dimensional distributionsrdquo in From Probability to Statistics and BackHigh-Dimensional Models and Processes ndash A Festschrift in Honor of Jon AWellner Institute of Mathematical Statistics Collections 2013 vol 9 pp91ndash104

H Leeb ldquoOn the conditional distributions of low-dimensional projections fromhigh-dimensional datardquo The Annals of Statistics vol 41 no 2 pp 464ndash4832013

S Verdu and S Shamai ldquoSpectral efficiency of cdma with random spreadingrdquoIEEE Trans Inform Theory vol 45 pp 622ndash640 Mar 1999

D N C Tse and S Hanly ldquoLinear multiuser receivers Effective interferenceeffective bandwith and user capacityrdquo IEEE Trans Inform Theory vol 45 pp641ndash657 Mar 1999

A Montanari and D Tse ldquoAnalysis of belief propagation for non-linear problemsThe example of CDMA (or How to prove Tanakarsquos formula)rdquo in Proc IEEEInform Theory Workshop Punta del Este Uruguay 2006 pp 160ndash164

25 27

References IV

S B Korada and N Macris ldquoTight bounds on the capicty of binary inputrandom CDMA systemsrdquo IEEE Trans Inform Theory vol 56 no 11 pp5590ndash5613 Nov 2010

M Bayati and A Montanari ldquoThe dynamics of message passing on densegraphs with applications to compressed sensingrdquo IEEE Trans Inform Theoryvol 57 no 2 pp 764ndash785 Feb 2011

G Reeves and M Gastpar ldquoThe sampling rate-distortion tradeoff for sparsitypattern recovery in compressed sensingrdquo IEEE Trans Inform Theory vol 58no 5 pp 3065ndash3092 May 2012

Y Wu and S Verdu ldquoOptimal phase transitions in compressed sensingrdquo IEEETrans Inform Theory vol 58 no 10 pp 6241 ndash 6263 Oct 2012

F Krzakala M Mezard F Sausset Y F Sun and L ZdeborovaldquoStatistical-physics-based reconstruction in compressed sensingrdquo PhysicalReview X vol 2 no 2 May 2012

D L Donoho A Javanmard and A Montanari ldquoInformation-theoreticallyoptimal compressed sensing via spatial coupling and approximate messagepassingrdquo IEEE Trans Inform Theory vol 59 no 11 pp 7434ndash7464 Jul 2013

W Huleihel and N Merhav ldquoAsymptotic MMSE analysis under sparserepresentation modelingrdquo Signal Processing vol 131 pp 320ndash332 2017

26 27

References V

G Reeves and H D Pfister ldquoThe replica-symmetric prediction for compressedsensing with Gaussian matrices is exactrdquo Jul 2016 [Online] Availablehttpsarxivorgabs160702524

mdashmdash ldquoThe replica-symmetric prediction for compressed sensing with Gaussianmatrices is exactrdquo in Proc IEEE Int Symp Inform Theory Barcelona SpainJul 2016 pp 665 ndash 669

J Barbier M Dia N Macris and F Krzakala ldquoThe mutual information inrandom linear estimationrdquo in Proc Annual Allerton Conf on Commun Controland Comp Monticello IL 2016 [Online] Availablehttparxivorgabs160702335

27 27

  • Motivation
    • Statistics
    • Information Theory
    • Random linear estimation
      • Results
      • Proof outline
      • Conclusion
Page 8: Conditional Central Limit Theorems for Gaussian Projections

Random coding arguments in information theory

I Random vector X is supported on the standard basis vectors

e1 middot middot middot en

I Random vector Z = ΘX is supported on columns of matrix

Θ1 middot middot middotΘn

This is an IID Gaussian codebook of rate R = 1k log n

8 27

Source coding with random codebook

I Optimal distortion for Gaussian source N (0 σ2) given bydistortion rate function

D(R) = σ2 exp(minus2R)

I Quadratic Wasserstein distance between Z | Θ andW sim N (0 σ2Ik) corresponds to distortion of codebook

W 22

(PZ|Θ PW

)ge E

[miniisin[n]Θi minusW2 | Θ

]optimal coupling

ge D(R) optimality of DRF

= k σ2 nminus2k R = 1

k log n

I Inequalities are tight up to o(k) for large (k n)

9 27

Channel coding with random codebook

I Gaussian channel

Yi = Zi +radictNi 1 le i le k

I Expected relative entropy between Z | Θ and GY correspondsto gap between channel capacity C = 1

2 log(1 + snr) andmutual information

[D(PY |Θ

∥∥GY )] = k C minus I(XY |Θ)

= k(C minusR

)+

+ o(k)

10 27

Random linear estimation compressed sensing CDMA

I X is unknown signal with second moment γ = E[

1nX

2]

I Θ is measurement matrix with sampling rate δ = kn

I Noisy measurements

Y = Z +radictN Z = ΘX

I Mutual information satisfies

I(XY |Θ) =k

2log(

1 +γ

t

)minus EΘ

[D(PY |Θ

∥∥GY )]Hence compressed sensing is optimal in the sense of mutualinformation if and only if CCLT holds

11 27

Fundamental limits of random linear estimation

I Guo amp Verdu 2005 derive formulas for mutual information andMMSE using heuristic replica method from statistical physics

I Rigorous results for special cases Verdu amp Shamai 1999 Tse amp

Hanly 1999 Montanari amp Tse 2006 Korada amp Macris 2010 Bayati

amp Montanari 20011 R amp Gastpar 2012 Wu amp Verdu 2012

Krzakala et al 2012 Donoho et al 2013 Huleihel amp Merhav 2017

I R amp Pfister 2016 prove that replica symmetric formulas arecorrect Proof focuses on new measurements

Ym+1 = ΘmX +radictNm+1 Θm isin R1timesn X sim PX|YmΘm

I Related results obtain independently by Barbier et al 2016

12 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions

I X has finite second moment E[

1nX

2]

= 1

I Θ has IID Gaussian entries N (0 1n)

Define

αr(X) =(E[∣∣ 1nX

2 minus γ∣∣r]) 1

r

deviation of second moment

βr(X) =(E[

1n

∣∣X middotX prime∣∣r]) 1r

measure of correlation

β2(X) =1

n

radicradicradicradic nsumi=1

λ2i (E[XXT ]) le 1radic

nλmax(E

[XXT

])

14 27

Main results

Z = ΘX Θ isin Rktimesn

Theorem (CCLT for Quadratic Wasserstein distance)

The quadratic Wasserstein distance satisfies between theconditional distribution of Z given Θ and the Gaussian distributionwith the same mean and covariance as Z satisfies

[W 2

2 (PZ|Θ GZ)]

le C(k α(X) + k

34 (β1(X))

12 + k(β2(X))

4k+4

)

15 27

Main results

Y = ΘX +radictN Θ isin Rktimesn

Theorem (Conditional CLT for relative entropy)

For all ε isin (0 1) the relative entropy between the conditionaldistribution of Y given Θ and the Gaussian distribution with thesame mean and covariance as Y satisfies

[D(PY |Θ

∥∥GY )]le C

(k log

(1+ 1

t

)α(X)

ε+ k

34 (β1(X))

12 + k

14

(1+ (2+ε)

t

) k4β2(X)

)

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying

α(X) le Cradicn β2(X) le Cradic

n

the quadratic Wasserstein distance satisfies

[W 2

2 (PZ|Θ GZ)]le C

(nminus

14 + k nminus

2k+4

)

I Rate-distortion lower bond If H(X) le C log n then

[W 2

2 (PZ|Θ GZ)]ge k nminus

2k

I Recovers same scaling condition as Meckes 2012 with strongermetric and weaker assumptions

17 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1 Talagrandrsquos transportation inequality

W 22 (PY |Θ GY ) le 2k(1 + t)D

(PY |Θ

∥∥GY )2 Decomposition in terms of CLT and mutual information

[D(PY |Θ

∥∥GY )] = D(PY GY ) + I(Y Θ)

3 Two-moment inequality for mutual information R 2017

I(XY ) le Cλ

radicω(S)V λ

np(Y |X)V 1minusλnq (Y |X)

(q minus p) q lt 1 lt p

Vs(Y |X) =

intys Var(p(y |X)) dy

19 27

Moments of variance

Vs(Y |X) =

intys Var(p(y |X)) dy

= E

[(1

SaminusR

) k2(S2gminusR2

Sa minusR

)s2

minus(

1

Sa

)k2(S2g

Sa

)s2

]

with

Sa = t+1

2nX12 +

1

2nX22

Sg =

radic(t+

1

nX12

)(t+

1

nX22

)R =

1

n〈X1 X2〉

where X1 and X2 are independent copies of X

20 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

21 27

Conclusion

I Conditional CLT has many applicationsI Projection pursuit Measure-theoretic Dvoretzky TheoremI Random coding arguments in information theoryI Phase transitions in compressed sensing R amp Pfister 2016I Approximate inference methods based on message passing

I Main results are bounds on quadratic Wasserstein distanceand relative entropy in terms of moments of distribution

I Proof usesI Talagrandrsquos transportation inequalityI Decomposition into CLT and mutual informationI Two-moment inequality for mutual information R 2017

22 27

References I

G Reeves ldquoConditional central limit theorems for Gaussian projectionsrdquo Dec2016 [Online] Available httpsarxivorgabs161209252

mdashmdash ldquoConditional central limit theorems for Gaussian projectionsrdquo in ProcIEEE Int Symp Inform Theory Aachen Germany Jun 2017

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo2017 [Online] Available httpsarxivorgabs170207302

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo inProc IEEE Int Symp Inform Theory Aachen Germany Jun 2017

P J Huber ldquoProjection pursuitrdquo The Annals of Statistics vol 13 no 2 pp435ndash475 1985

P Diaconis and D Freedman ldquoAsymptotics of graphical projection pursuitrdquoThe Annals of Statistics vol 12 no 3 pp 793ndash815 1984

R Vershynin ldquoEstimation in high dimensions A geometric perspectiverdquoDecember 2 2014 [Online] Availablehttpwww-personalumichedusimromanvpapersestimation-tutorialpdf

V N Sudakov ldquoTypical distributions of linear functionals in finite-dimensionalspaces of high dimensionrdquo Soviet Math Doklady vol 16 no 6 pp 1578ndash15821978

23 27

References II

P Hall and K-C Li ldquoOn almost linearity of low dimensional projections fromhigh dimensional datardquo The Annals of Statistics vol 21 no 2 pp 867ndash8891993

H von Weizsacker ldquoSudakovrsquos typical marginals random linear functionals anda conditional central limit theoremrdquo Probability Theory and Related Fields vol107 no 3 pp 313ndash324 1997

M Anttila K Ball and I Perissinaki ldquoThe central limit problem for convexbodiesrdquo Transactions of the American Mathematical Society vol 355 no 12pp 4723ndash4735 2003

S G Bobkov ldquoOn concentration of distributions of random weighted sumsrdquoThe Annals of Probability vol 31 no 1 pp 195ndash215 2003

A Naor and D Romik ldquoProjecting the surface measure of the sphere of `np rdquoAnnales de lrsquoInstitut Henri Poincare (B) Probability and Statistics vol 39 no 2pp 241ndash246 2003

B Klartag ldquoA central limit theorem for convex setsrdquo Inventiones mathematicaevol 168 no 1 pp 91ndash131 Apr 2007

E Meckes ldquoApproximation of projections of random vectorsrdquo Journal ofTheoretical Probability vol 25 no 2 pp 333ndash352 2010

24 27

References III

mdashmdash ldquoProjections of probability distributions A measure-theoretic Dvoretzkytheoremrdquo in Geometric Aspects of Functional Analysis ser Lecture Notes inMathematics Springer 2012 vol 2050 pp 317ndash326

L Dumbgen and P D Conte-Zerial ldquoOn low-dimensional projections ofhigh-dimensional distributionsrdquo in From Probability to Statistics and BackHigh-Dimensional Models and Processes ndash A Festschrift in Honor of Jon AWellner Institute of Mathematical Statistics Collections 2013 vol 9 pp91ndash104

H Leeb ldquoOn the conditional distributions of low-dimensional projections fromhigh-dimensional datardquo The Annals of Statistics vol 41 no 2 pp 464ndash4832013

S Verdu and S Shamai ldquoSpectral efficiency of cdma with random spreadingrdquoIEEE Trans Inform Theory vol 45 pp 622ndash640 Mar 1999

D N C Tse and S Hanly ldquoLinear multiuser receivers Effective interferenceeffective bandwith and user capacityrdquo IEEE Trans Inform Theory vol 45 pp641ndash657 Mar 1999

A Montanari and D Tse ldquoAnalysis of belief propagation for non-linear problemsThe example of CDMA (or How to prove Tanakarsquos formula)rdquo in Proc IEEEInform Theory Workshop Punta del Este Uruguay 2006 pp 160ndash164

25 27

References IV

S B Korada and N Macris ldquoTight bounds on the capicty of binary inputrandom CDMA systemsrdquo IEEE Trans Inform Theory vol 56 no 11 pp5590ndash5613 Nov 2010

M Bayati and A Montanari ldquoThe dynamics of message passing on densegraphs with applications to compressed sensingrdquo IEEE Trans Inform Theoryvol 57 no 2 pp 764ndash785 Feb 2011

G Reeves and M Gastpar ldquoThe sampling rate-distortion tradeoff for sparsitypattern recovery in compressed sensingrdquo IEEE Trans Inform Theory vol 58no 5 pp 3065ndash3092 May 2012

Y Wu and S Verdu ldquoOptimal phase transitions in compressed sensingrdquo IEEETrans Inform Theory vol 58 no 10 pp 6241 ndash 6263 Oct 2012

F Krzakala M Mezard F Sausset Y F Sun and L ZdeborovaldquoStatistical-physics-based reconstruction in compressed sensingrdquo PhysicalReview X vol 2 no 2 May 2012

D L Donoho A Javanmard and A Montanari ldquoInformation-theoreticallyoptimal compressed sensing via spatial coupling and approximate messagepassingrdquo IEEE Trans Inform Theory vol 59 no 11 pp 7434ndash7464 Jul 2013

W Huleihel and N Merhav ldquoAsymptotic MMSE analysis under sparserepresentation modelingrdquo Signal Processing vol 131 pp 320ndash332 2017

26 27

References V

G Reeves and H D Pfister ldquoThe replica-symmetric prediction for compressedsensing with Gaussian matrices is exactrdquo Jul 2016 [Online] Availablehttpsarxivorgabs160702524

mdashmdash ldquoThe replica-symmetric prediction for compressed sensing with Gaussianmatrices is exactrdquo in Proc IEEE Int Symp Inform Theory Barcelona SpainJul 2016 pp 665 ndash 669

J Barbier M Dia N Macris and F Krzakala ldquoThe mutual information inrandom linear estimationrdquo in Proc Annual Allerton Conf on Commun Controland Comp Monticello IL 2016 [Online] Availablehttparxivorgabs160702335

27 27

  • Motivation
    • Statistics
    • Information Theory
    • Random linear estimation
      • Results
      • Proof outline
      • Conclusion
Page 9: Conditional Central Limit Theorems for Gaussian Projections

Source coding with random codebook

I Optimal distortion for Gaussian source N (0 σ2) given bydistortion rate function

D(R) = σ2 exp(minus2R)

I Quadratic Wasserstein distance between Z | Θ andW sim N (0 σ2Ik) corresponds to distortion of codebook

W 22

(PZ|Θ PW

)ge E

[miniisin[n]Θi minusW2 | Θ

]optimal coupling

ge D(R) optimality of DRF

= k σ2 nminus2k R = 1

k log n

I Inequalities are tight up to o(k) for large (k n)

9 27

Channel coding with random codebook

I Gaussian channel

Yi = Zi +radictNi 1 le i le k

I Expected relative entropy between Z | Θ and GY correspondsto gap between channel capacity C = 1

2 log(1 + snr) andmutual information

[D(PY |Θ

∥∥GY )] = k C minus I(XY |Θ)

= k(C minusR

)+

+ o(k)

10 27

Random linear estimation compressed sensing CDMA

I X is unknown signal with second moment γ = E[

1nX

2]

I Θ is measurement matrix with sampling rate δ = kn

I Noisy measurements

Y = Z +radictN Z = ΘX

I Mutual information satisfies

I(XY |Θ) =k

2log(

1 +γ

t

)minus EΘ

[D(PY |Θ

∥∥GY )]Hence compressed sensing is optimal in the sense of mutualinformation if and only if CCLT holds

11 27

Fundamental limits of random linear estimation

I Guo amp Verdu 2005 derive formulas for mutual information andMMSE using heuristic replica method from statistical physics

I Rigorous results for special cases Verdu amp Shamai 1999 Tse amp

Hanly 1999 Montanari amp Tse 2006 Korada amp Macris 2010 Bayati

amp Montanari 20011 R amp Gastpar 2012 Wu amp Verdu 2012

Krzakala et al 2012 Donoho et al 2013 Huleihel amp Merhav 2017

I R amp Pfister 2016 prove that replica symmetric formulas arecorrect Proof focuses on new measurements

Ym+1 = ΘmX +radictNm+1 Θm isin R1timesn X sim PX|YmΘm

I Related results obtain independently by Barbier et al 2016

12 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions

I X has finite second moment E[

1nX

2]

= 1

I Θ has IID Gaussian entries N (0 1n)

Define

αr(X) =(E[∣∣ 1nX

2 minus γ∣∣r]) 1

r

deviation of second moment

βr(X) =(E[

1n

∣∣X middotX prime∣∣r]) 1r

measure of correlation

β2(X) =1

n

radicradicradicradic nsumi=1

λ2i (E[XXT ]) le 1radic

nλmax(E

[XXT

])

14 27

Main results

Z = ΘX Θ isin Rktimesn

Theorem (CCLT for Quadratic Wasserstein distance)

The quadratic Wasserstein distance satisfies between theconditional distribution of Z given Θ and the Gaussian distributionwith the same mean and covariance as Z satisfies

[W 2

2 (PZ|Θ GZ)]

le C(k α(X) + k

34 (β1(X))

12 + k(β2(X))

4k+4

)

15 27

Main results

Y = ΘX +radictN Θ isin Rktimesn

Theorem (Conditional CLT for relative entropy)

For all ε isin (0 1) the relative entropy between the conditionaldistribution of Y given Θ and the Gaussian distribution with thesame mean and covariance as Y satisfies

[D(PY |Θ

∥∥GY )]le C

(k log

(1+ 1

t

)α(X)

ε+ k

34 (β1(X))

12 + k

14

(1+ (2+ε)

t

) k4β2(X)

)

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying

α(X) le Cradicn β2(X) le Cradic

n

the quadratic Wasserstein distance satisfies

[W 2

2 (PZ|Θ GZ)]le C

(nminus

14 + k nminus

2k+4

)

I Rate-distortion lower bond If H(X) le C log n then

[W 2

2 (PZ|Θ GZ)]ge k nminus

2k

I Recovers same scaling condition as Meckes 2012 with strongermetric and weaker assumptions

17 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1 Talagrandrsquos transportation inequality

W 22 (PY |Θ GY ) le 2k(1 + t)D

(PY |Θ

∥∥GY )2 Decomposition in terms of CLT and mutual information

[D(PY |Θ

∥∥GY )] = D(PY GY ) + I(Y Θ)

3 Two-moment inequality for mutual information R 2017

I(XY ) le Cλ

radicω(S)V λ

np(Y |X)V 1minusλnq (Y |X)

(q minus p) q lt 1 lt p

Vs(Y |X) =

intys Var(p(y |X)) dy

19 27

Moments of variance

Vs(Y |X) =

intys Var(p(y |X)) dy

= E

[(1

SaminusR

) k2(S2gminusR2

Sa minusR

)s2

minus(

1

Sa

)k2(S2g

Sa

)s2

]

with

Sa = t+1

2nX12 +

1

2nX22

Sg =

radic(t+

1

nX12

)(t+

1

nX22

)R =

1

n〈X1 X2〉

where X1 and X2 are independent copies of X

20 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

21 27

Conclusion

I Conditional CLT has many applicationsI Projection pursuit Measure-theoretic Dvoretzky TheoremI Random coding arguments in information theoryI Phase transitions in compressed sensing R amp Pfister 2016I Approximate inference methods based on message passing

I Main results are bounds on quadratic Wasserstein distanceand relative entropy in terms of moments of distribution

I Proof usesI Talagrandrsquos transportation inequalityI Decomposition into CLT and mutual informationI Two-moment inequality for mutual information R 2017

22 27

References I

G Reeves ldquoConditional central limit theorems for Gaussian projectionsrdquo Dec2016 [Online] Available httpsarxivorgabs161209252

mdashmdash ldquoConditional central limit theorems for Gaussian projectionsrdquo in ProcIEEE Int Symp Inform Theory Aachen Germany Jun 2017

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo2017 [Online] Available httpsarxivorgabs170207302

mdashmdash ldquoTwo-moment inequailties for Renyi entropy and mutual informationrdquo inProc IEEE Int Symp Inform Theory Aachen Germany Jun 2017

P J Huber ldquoProjection pursuitrdquo The Annals of Statistics vol 13 no 2 pp435ndash475 1985

P Diaconis and D Freedman ldquoAsymptotics of graphical projection pursuitrdquoThe Annals of Statistics vol 12 no 3 pp 793ndash815 1984

R Vershynin ldquoEstimation in high dimensions A geometric perspectiverdquoDecember 2 2014 [Online] Availablehttpwww-personalumichedusimromanvpapersestimation-tutorialpdf

V N Sudakov ldquoTypical distributions of linear functionals in finite-dimensionalspaces of high dimensionrdquo Soviet Math Doklady vol 16 no 6 pp 1578ndash15821978

23 27

References II

P Hall and K-C Li ldquoOn almost linearity of low dimensional projections fromhigh dimensional datardquo The Annals of Statistics vol 21 no 2 pp 867ndash8891993

H von Weizsacker ldquoSudakovrsquos typical marginals random linear functionals anda conditional central limit theoremrdquo Probability Theory and Related Fields vol107 no 3 pp 313ndash324 1997

M Anttila K Ball and I Perissinaki ldquoThe central limit problem for convexbodiesrdquo Transactions of the American Mathematical Society vol 355 no 12pp 4723ndash4735 2003

S G Bobkov ldquoOn concentration of distributions of random weighted sumsrdquoThe Annals of Probability vol 31 no 1 pp 195ndash215 2003

A Naor and D Romik ldquoProjecting the surface measure of the sphere of `np rdquoAnnales de lrsquoInstitut Henri Poincare (B) Probability and Statistics vol 39 no 2pp 241ndash246 2003

B Klartag ldquoA central limit theorem for convex setsrdquo Inventiones mathematicaevol 168 no 1 pp 91ndash131 Apr 2007

E Meckes ldquoApproximation of projections of random vectorsrdquo Journal ofTheoretical Probability vol 25 no 2 pp 333ndash352 2010

24 27

References III

mdashmdash ldquoProjections of probability distributions A measure-theoretic Dvoretzkytheoremrdquo in Geometric Aspects of Functional Analysis ser Lecture Notes inMathematics Springer 2012 vol 2050 pp 317ndash326

L Dumbgen and P D Conte-Zerial ldquoOn low-dimensional projections ofhigh-dimensional distributionsrdquo in From Probability to Statistics and BackHigh-Dimensional Models and Processes ndash A Festschrift in Honor of Jon AWellner Institute of Mathematical Statistics Collections 2013 vol 9 pp91ndash104

H Leeb ldquoOn the conditional distributions of low-dimensional projections fromhigh-dimensional datardquo The Annals of Statistics vol 41 no 2 pp 464ndash4832013

S Verdu and S Shamai ldquoSpectral efficiency of cdma with random spreadingrdquoIEEE Trans Inform Theory vol 45 pp 622ndash640 Mar 1999

D N C Tse and S Hanly ldquoLinear multiuser receivers Effective interferenceeffective bandwith and user capacityrdquo IEEE Trans Inform Theory vol 45 pp641ndash657 Mar 1999

A Montanari and D Tse ldquoAnalysis of belief propagation for non-linear problemsThe example of CDMA (or How to prove Tanakarsquos formula)rdquo in Proc IEEEInform Theory Workshop Punta del Este Uruguay 2006 pp 160ndash164

25 27

References IV

S B Korada and N Macris ldquoTight bounds on the capicty of binary inputrandom CDMA systemsrdquo IEEE Trans Inform Theory vol 56 no 11 pp5590ndash5613 Nov 2010

M Bayati and A Montanari ldquoThe dynamics of message passing on densegraphs with applications to compressed sensingrdquo IEEE Trans Inform Theoryvol 57 no 2 pp 764ndash785 Feb 2011

G Reeves and M Gastpar ldquoThe sampling rate-distortion tradeoff for sparsitypattern recovery in compressed sensingrdquo IEEE Trans Inform Theory vol 58no 5 pp 3065ndash3092 May 2012

Y Wu and S Verdu ldquoOptimal phase transitions in compressed sensingrdquo IEEETrans Inform Theory vol 58 no 10 pp 6241 ndash 6263 Oct 2012

F Krzakala M Mezard F Sausset Y F Sun and L ZdeborovaldquoStatistical-physics-based reconstruction in compressed sensingrdquo PhysicalReview X vol 2 no 2 May 2012

D L Donoho A Javanmard and A Montanari ldquoInformation-theoreticallyoptimal compressed sensing via spatial coupling and approximate messagepassingrdquo IEEE Trans Inform Theory vol 59 no 11 pp 7434ndash7464 Jul 2013

W Huleihel and N Merhav ldquoAsymptotic MMSE analysis under sparserepresentation modelingrdquo Signal Processing vol 131 pp 320ndash332 2017

26 27

References V

G Reeves and H D Pfister ldquoThe replica-symmetric prediction for compressedsensing with Gaussian matrices is exactrdquo Jul 2016 [Online] Availablehttpsarxivorgabs160702524

mdashmdash ldquoThe replica-symmetric prediction for compressed sensing with Gaussianmatrices is exactrdquo in Proc IEEE Int Symp Inform Theory Barcelona SpainJul 2016 pp 665 ndash 669

J Barbier M Dia N Macris and F Krzakala ldquoThe mutual information inrandom linear estimationrdquo in Proc Annual Allerton Conf on Commun Controland Comp Monticello IL 2016 [Online] Availablehttparxivorgabs160702335

27 27

  • Motivation
    • Statistics
    • Information Theory
    • Random linear estimation
      • Results
      • Proof outline
      • Conclusion
Page 10: Conditional Central Limit Theorems for Gaussian Projections

Channel coding with random codebook

I Gaussian channel

Yi = Zi +radictNi 1 le i le k

I Expected relative entropy between Z | Θ and GY correspondsto gap between channel capacity C = 1

2 log(1 + snr) andmutual information

[D(PY |Θ

∥∥GY )] = k C minus I(XY |Θ)

= k(C minusR

)+

+ o(k)

10 27

Random linear estimation compressed sensing CDMA

I X is unknown signal with second moment γ = E[

1nX

2]

I Θ is measurement matrix with sampling rate δ = kn

I Noisy measurements

Y = Z +radictN Z = ΘX

I Mutual information satisfies

I(XY |Θ) =k

2log(

1 +γ

t

)minus EΘ

[D(PY |Θ

∥∥GY )]Hence compressed sensing is optimal in the sense of mutualinformation if and only if CCLT holds

11 27

Fundamental limits of random linear estimation

I Guo amp Verdu 2005 derive formulas for mutual information andMMSE using heuristic replica method from statistical physics

I Rigorous results for special cases Verdu amp Shamai 1999 Tse amp

Hanly 1999 Montanari amp Tse 2006 Korada amp Macris 2010 Bayati

amp Montanari 20011 R amp Gastpar 2012 Wu amp Verdu 2012

Krzakala et al 2012 Donoho et al 2013 Huleihel amp Merhav 2017

I R amp Pfister 2016 prove that replica symmetric formulas arecorrect Proof focuses on new measurements

Ym+1 = ΘmX +radictNm+1 Θm isin R1timesn X sim PX|YmΘm

I Related results obtain independently by Barbier et al 2016

12 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions

I X has finite second moment E[

1nX

2]

= 1

I Θ has IID Gaussian entries N (0 1n)

Define

αr(X) =(E[∣∣ 1nX

2 minus γ∣∣r]) 1

r

deviation of second moment

βr(X) =(E[

1n

∣∣X middotX prime∣∣r]) 1r

measure of correlation

β2(X) =1

n

radicradicradicradic nsumi=1

λ2i (E[XXT ]) le 1radic

nλmax(E

[XXT

])

14 27

Main results

Z = ΘX Θ isin Rktimesn

Theorem (CCLT for Quadratic Wasserstein distance)

The quadratic Wasserstein distance satisfies between theconditional distribution of Z given Θ and the Gaussian distributionwith the same mean and covariance as Z satisfies

[W 2

2 (PZ|Θ GZ)]

le C(k α(X) + k

34 (β1(X))

12 + k(β2(X))

4k+4

)

15 27

Main results

Y = ΘX +radictN Θ isin Rktimesn

Theorem (Conditional CLT for relative entropy)

For all ε isin (0 1) the relative entropy between the conditionaldistribution of Y given Θ and the Gaussian distribution with thesame mean and covariance as Y satisfies

[D(PY |Θ

∥∥GY )]le C

(k log

(1+ 1

t

)α(X)

ε+ k

34 (β1(X))

12 + k

14

(1+ (2+ε)

t

) k4β2(X)

)

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying

α(X) le Cradicn β2(X) le Cradic

n

the quadratic Wasserstein distance satisfies

[W 2

2 (PZ|Θ GZ)]le C

(nminus

14 + k nminus

2k+4

)

I Rate-distortion lower bond If H(X) le C log n then

[W 2

2 (PZ|Θ GZ)]ge k nminus

2k

I Recovers same scaling condition as Meckes 2012 with strongermetric and weaker assumptions

17 27

Table of Contents

MotivationStatisticsInformation TheoryRandom linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1 Talagrandrsquos transportation inequality

W 22 (PY |Θ GY ) le 2k(1 + t)D

(PY |Θ

∥∥GY )2 Decomposition in terms of CLT and mutual information

[D(PY |Θ

∥∥GY )] = D(PY GY ) + I(Y Θ)

3 Two-moment inequality for mutual information R 2017

I(XY ) le Cλ

radicω(S)V λ

np(Y |X)V 1minusλnq (Y |X)

(q minus p) q lt 1 lt p

Vs(Y |X) =

intys Var(p(y |X)) dy

19 27

Page 11: Conditional Central Limit Theorems for Gaussian Projections

Random linear estimation: compressed sensing, CDMA

I X is unknown signal with second moment \( \gamma = \mathbb{E}\bigl[ \tfrac{1}{n}\|X\|^2 \bigr] \)

I Θ is measurement matrix with sampling rate \( \delta = k/n \)

I Noisy measurements
\[ Y = Z + \sqrt{t}\, N, \qquad Z = \Theta X \]

I Mutual information satisfies
\[ I(X; Y \mid \Theta) = \frac{k}{2} \log\Bigl( 1 + \frac{\gamma}{t} \Bigr) - \mathbb{E}_\Theta\bigl[ D(P_{Y\mid\Theta} \,\|\, G_Y) \bigr] \]
Hence compressed sensing is optimal in the sense of mutual information if and only if the CCLT holds.

11 27
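The identity follows from a short entropy calculation; a sketch (not from the slides), using only the channel model above and the fact that the marginal covariance of Y is (γ + t) I_k:

```latex
% Given (X, Theta), Y is Gaussian noise around Theta X, so
% h(Y | X, \Theta) = (k/2) \log(2\pi e t). Since G_Y matches the mean and
% covariance of the marginal of Y, E[-\log g_Y(Y)] = h(G_Y), which gives
% h(Y | \Theta) = h(G_Y) - E_\Theta[ D(P_{Y|\Theta} || G_Y) ].
\begin{align*}
I(X; Y \mid \Theta)
  &= h(Y \mid \Theta) - h(Y \mid X, \Theta) \\
  &= \Bigl[\, h(G_Y) - \mathbb{E}_\Theta\bigl[ D(P_{Y\mid\Theta} \,\|\, G_Y) \bigr] \,\Bigr]
     - \tfrac{k}{2}\log(2\pi e\, t) \\
  &= \tfrac{k}{2}\log\bigl(2\pi e\,(\gamma + t)\bigr) - \tfrac{k}{2}\log(2\pi e\, t)
     - \mathbb{E}_\Theta\bigl[ D(P_{Y\mid\Theta} \,\|\, G_Y) \bigr] \\
  &= \tfrac{k}{2}\log\Bigl(1 + \tfrac{\gamma}{t}\Bigr)
     - \mathbb{E}_\Theta\bigl[ D(P_{Y\mid\Theta} \,\|\, G_Y) \bigr].
\end{align*}
```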

Fundamental limits of random linear estimation

I Guo & Verdú 2005 derive formulas for mutual information and MMSE using the heuristic replica method from statistical physics

I Rigorous results for special cases: Verdú & Shamai 1999, Tse & Hanly 1999, Montanari & Tse 2006, Korada & Macris 2010, Bayati & Montanari 2011, R & Gastpar 2012, Wu & Verdú 2012, Krzakala et al. 2012, Donoho et al. 2013, Huleihel & Merhav 2017

I R & Pfister 2016 prove that the replica-symmetric formulas are correct. The proof focuses on new measurements
\[ Y_{m+1} = \Theta_m X + \sqrt{t}\, N_{m+1}, \qquad \Theta_m \in \mathbb{R}^{1 \times n}, \quad X \sim P_{X \mid Y^m, \Theta^m} \]

I Related results obtained independently by Barbier et al. 2016

12 27

Table of Contents

Motivation
    Statistics
    Information Theory
    Random linear estimation

Results

Proof outline

Conclusion

13 27

Problem formulation

Assumptions:

I X has finite second moment \( \gamma = \mathbb{E}\bigl[ \tfrac{1}{n}\|X\|^2 \bigr] = 1 \)

I Θ has IID Gaussian entries \( \mathcal{N}(0, \tfrac{1}{n}) \)

Define

\[ \alpha_r(X) = \Bigl( \mathbb{E}\Bigl[ \bigl| \tfrac{1}{n}\|X\|^2 - \gamma \bigr|^r \Bigr] \Bigr)^{1/r} \qquad \text{(deviation of second moment)} \]

\[ \beta_r(X) = \Bigl( \mathbb{E}\Bigl[ \bigl| \tfrac{1}{n}\langle X, X' \rangle \bigr|^r \Bigr] \Bigr)^{1/r} \qquad \text{(measure of correlation, with } X' \text{ an independent copy of } X\text{)} \]

For r = 2 the correlation measure can be expressed via the spectrum:

\[ \beta_2(X) = \frac{1}{n} \sqrt{ \sum_{i=1}^{n} \lambda_i^2\bigl( \mathbb{E}[X X^T] \bigr) } \;\le\; \frac{1}{\sqrt{n}}\, \lambda_{\max}\bigl( \mathbb{E}[X X^T] \bigr) \]

14 27
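To make the two functionals concrete, here is a small Monte Carlo sketch (not from the talk; the helper names and the choice X ~ N(0, I_n) are illustrative) checking the O(1/√n) scaling that the "Consequences" slide below assumes:

```python
# Monte Carlo estimate of alpha_2(X) and beta_2(X) for X ~ N(0, I_n),
# for which gamma = E[||X||^2]/n = 1 and both functionals decay like 1/sqrt(n).
import numpy as np

rng = np.random.default_rng(0)

def alpha2(X, gamma=1.0):
    """alpha_2(X) = ( E[ | ||X||^2/n - gamma |^2 ] )^{1/2}."""
    dev = (X**2).sum(axis=1) / X.shape[1] - gamma
    return np.sqrt(np.mean(dev**2))

def beta2(X, Xp):
    """beta_2(X) = ( E[ | <X, X'>/n |^2 ] )^{1/2}, with X' an independent copy."""
    corr = (X * Xp).sum(axis=1) / X.shape[1]
    return np.sqrt(np.mean(corr**2))

for n in (100, 400, 1600):
    m = 2000                               # Monte Carlo repetitions
    X = rng.standard_normal((m, n))
    Xp = rng.standard_normal((m, n))
    # sqrt(n) * alpha_2 -> sqrt(2) and sqrt(n) * beta_2 -> 1 as n grows
    print(n, np.sqrt(n) * alpha2(X), np.sqrt(n) * beta2(X, Xp))
```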

Main results

\[ Z = \Theta X, \qquad \Theta \in \mathbb{R}^{k \times n} \]

Theorem (CCLT for quadratic Wasserstein distance)

The quadratic Wasserstein distance between the conditional distribution of Z given Θ and the Gaussian distribution with the same mean and covariance as Z satisfies

\[ \mathbb{E}_\Theta\bigl[ W_2^2(P_{Z\mid\Theta}, G_Z) \bigr] \le C \Bigl( k\, \alpha(X) + k^{3/4} \bigl( \beta_1(X) \bigr)^{1/2} + k \bigl( \beta_2(X) \bigr)^{\frac{4}{k+4}} \Bigr) \]

15 27

Main results

\[ Y = \Theta X + \sqrt{t}\, N, \qquad \Theta \in \mathbb{R}^{k \times n} \]

Theorem (Conditional CLT for relative entropy)

For all ε ∈ (0, 1), the relative entropy between the conditional distribution of Y given Θ and the Gaussian distribution with the same mean and covariance as Y satisfies

\[ \mathbb{E}_\Theta\bigl[ D(P_{Y\mid\Theta} \,\|\, G_Y) \bigr] \le C \biggl( \frac{k \log\bigl(1 + \tfrac{1}{t}\bigr)\, \alpha(X)}{\varepsilon} + k^{3/4} \bigl( \beta_1(X) \bigr)^{1/2} + k^{1/4} \Bigl( 1 + \frac{2 + \varepsilon}{t} \Bigr)^{\frac{k}{4}} \beta_2(X) \biggr) \]

16 27

Consequences of main results

I For any n-dimensional random vector X satisfying
\[ \alpha(X) \le \frac{C}{\sqrt{n}}, \qquad \beta_2(X) \le \frac{C}{\sqrt{n}}, \]
the quadratic Wasserstein distance satisfies
\[ \mathbb{E}_\Theta\bigl[ W_2^2(P_{Z\mid\Theta}, G_Z) \bigr] \le C \Bigl( n^{-1/4} + k\, n^{-\frac{2}{k+4}} \Bigr) \]

I Rate-distortion lower bound: if H(X) ≤ C log n, then
\[ \mathbb{E}_\Theta\bigl[ W_2^2(P_{Z\mid\Theta}, G_Z) \bigr] \ge k\, n^{-\frac{2}{k}} \]

I Recovers the same scaling condition as Meckes 2012 with a stronger metric and weaker assumptions

17 27
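A quick way to see the conditional CLT numerically (a sketch under illustrative assumptions: k = 1 so the one-dimensional quantile formula for W_2 applies, and X uniform on the hypercube so that α(X) = 0; all sizes are arbitrary):

```python
# Draw ONE projection row theta ~ N(0, I_n / n), project many samples of X,
# and estimate W_2^2 between the empirical law of Z = <theta, X> and N(0, 1).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, m = 500, 20_000                         # ambient dimension, samples of X

theta = rng.standard_normal(n) / np.sqrt(n)    # one fixed realization of Theta
X = rng.choice([-1.0, 1.0], size=(m, n))       # ||X||^2 / n = 1 exactly
Z = X @ theta

# In one dimension the optimal W_2 coupling is monotone, so sorted samples
# matched against Gaussian quantiles give a consistent estimate of W_2^2.
z = np.sort(Z)
q = norm.ppf((np.arange(1, m + 1) - 0.5) / m)
print("estimated W2^2:", np.mean((z - q)**2))  # small for large n
```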

Table of Contents

Motivation
    Statistics
    Information Theory
    Random linear estimation

Results

Proof outline

Conclusion

18 27

Key steps in proof

1. Talagrand's transportation inequality:
\[ W_2^2(P_{Y\mid\Theta}, G_Y) \le 2k(1 + t)\, D\bigl( P_{Y\mid\Theta} \,\|\, G_Y \bigr) \]

2. Decomposition in terms of CLT and mutual information:
\[ \mathbb{E}_\Theta\bigl[ D(P_{Y\mid\Theta} \,\|\, G_Y) \bigr] = D(P_Y \,\|\, G_Y) + I(Y; \Theta) \]

3. Two-moment inequality for mutual information (R 2017):
\[ I(X; Y) \le C_\lambda \sqrt{\omega(S)}\; V_p^{\lambda}(Y \mid X)\, V_q^{1-\lambda}(Y \mid X), \qquad \lambda = \frac{1-q}{p-q}, \quad q < 1 < p, \]
where
\[ V_s(Y \mid X) = \int \|y\|^s \, \mathrm{Var}\bigl( p(y \mid X) \bigr)\, dy \]

19 27
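Step 2 is an exact identity; a short verification (not on the slide), writing p and g for densities:

```latex
% Add and subtract the marginal density p_Y inside the logarithm; the first
% term averages to the mutual information I(Y; \Theta), the second to the
% marginal divergence D(P_Y || G_Y).
\begin{align*}
\mathbb{E}_\Theta\bigl[ D(P_{Y\mid\Theta} \,\|\, G_Y) \bigr]
  &= \mathbb{E}\left[ \log \frac{p_{Y\mid\Theta}(Y \mid \Theta)}{g_Y(Y)} \right] \\
  &= \mathbb{E}\left[ \log \frac{p_{Y\mid\Theta}(Y \mid \Theta)}{p_Y(Y)} \right]
   + \mathbb{E}\left[ \log \frac{p_Y(Y)}{g_Y(Y)} \right] \\
  &= I(Y; \Theta) + D(P_Y \,\|\, G_Y).
\end{align*}
```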

Moments of variance

\begin{align*}
V_s(Y \mid X) &= \int \|y\|^s \, \mathrm{Var}\bigl( p(y \mid X) \bigr)\, dy \\
&= \mathbb{E}\left[ \Bigl( \frac{1}{S_a - R} \Bigr)^{\frac{k}{2}} \Bigl( \frac{S_g^2 - R^2}{S_a - R} \Bigr)^{\frac{s}{2}} - \Bigl( \frac{1}{S_a} \Bigr)^{\frac{k}{2}} \Bigl( \frac{S_g^2}{S_a} \Bigr)^{\frac{s}{2}} \right]
\end{align*}

with

\[ S_a = t + \frac{1}{2n}\|X_1\|^2 + \frac{1}{2n}\|X_2\|^2, \qquad S_g = \sqrt{\Bigl( t + \frac{1}{n}\|X_1\|^2 \Bigr)\Bigl( t + \frac{1}{n}\|X_2\|^2 \Bigr)}, \qquad R = \frac{1}{n}\langle X_1, X_2 \rangle, \]

where X_1 and X_2 are independent copies of X.

20 27
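To see where these Gaussian forms come from, note that step 2 leaves the mutual information I(Y; Θ), so the two-moment inequality is applied with Θ in the role of the conditioning variable (the slide keeps the theorem's generic notation V_s(Y|X)). A sketch of the computation, with constants depending only on k and s suppressed: squaring the mixture density p(y|Θ) = E_X[N(y; ΘX, t I_k)] brings in the two independent copies X_1, X_2.

```latex
% Conditioned on (X_1, X_2), each coordinate pair of
% (\Theta X_1 + \sqrt{t} N_1, \Theta X_2 + \sqrt{t} N_2) is bivariate Gaussian
% with covariance M = [ t + \|X_1\|^2/n , R ; R , t + \|X_2\|^2/n ], so that
% \det M = S_g^2 - R^2 and (1,1) M^{-1} (1,1)^T = 2 (S_a - R)/(S_g^2 - R^2).
\begin{align*}
\mathbb{E}_\Theta\bigl[ p(y \mid \Theta)^2 \bigr]
  &= \mathbb{E}_{X_1, X_2}\Bigl[ (2\pi)^{-k} \bigl( S_g^2 - R^2 \bigr)^{-k/2}
     \exp\Bigl( - \frac{(S_a - R)\,\|y\|^2}{S_g^2 - R^2} \Bigr) \Bigr], \\
\int \|y\|^s\, \mathbb{E}_\Theta\bigl[ p(y \mid \Theta)^2 \bigr]\, dy
  &\propto \mathbb{E}\biggl[ \Bigl( \frac{1}{S_a - R} \Bigr)^{k/2}
     \Bigl( \frac{S_g^2 - R^2}{S_a - R} \Bigr)^{s/2} \biggr].
\end{align*}
```

The subtracted term is the same computation applied to p_Y(y)^2 = (E_Θ[p(y|Θ)])^2, where squaring introduces two independent copies of Θ, so the cross-covariance R is replaced by 0.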

Table of Contents

Motivation
    Statistics
    Information Theory
    Random linear estimation

Results

Proof outline

Conclusion

21 27

Conclusion

I Conditional CLT has many applications:
    I Projection pursuit
    I Measure-theoretic Dvoretzky theorem
    I Random coding arguments in information theory
    I Phase transitions in compressed sensing (R & Pfister 2016)
    I Approximate inference methods based on message passing

I Main results are bounds on quadratic Wasserstein distance and relative entropy in terms of moments of the distribution

I Proof uses:
    I Talagrand's transportation inequality
    I Decomposition into CLT and mutual information
    I Two-moment inequality for mutual information (R 2017)

22 27

References I

G. Reeves, "Conditional central limit theorems for Gaussian projections," Dec. 2016. [Online]. Available: https://arxiv.org/abs/1612.09252

G. Reeves, "Conditional central limit theorems for Gaussian projections," in Proc. IEEE Int. Symp. Inform. Theory, Aachen, Germany, Jun. 2017.

G. Reeves, "Two-moment inequalities for Rényi entropy and mutual information," 2017. [Online]. Available: https://arxiv.org/abs/1702.07302

G. Reeves, "Two-moment inequalities for Rényi entropy and mutual information," in Proc. IEEE Int. Symp. Inform. Theory, Aachen, Germany, Jun. 2017.

P. J. Huber, "Projection pursuit," The Annals of Statistics, vol. 13, no. 2, pp. 435–475, 1985.

P. Diaconis and D. Freedman, "Asymptotics of graphical projection pursuit," The Annals of Statistics, vol. 12, no. 3, pp. 793–815, 1984.

R. Vershynin, "Estimation in high dimensions: A geometric perspective," December 2, 2014. [Online]. Available: http://www-personal.umich.edu/~romanv/papers/estimation-tutorial.pdf

V. N. Sudakov, "Typical distributions of linear functionals in finite-dimensional spaces of high dimension," Soviet Math. Doklady, vol. 16, no. 6, pp. 1578–1582, 1978.

23 27

References II

P. Hall and K.-C. Li, "On almost linearity of low dimensional projections from high dimensional data," The Annals of Statistics, vol. 21, no. 2, pp. 867–889, 1993.

H. von Weizsäcker, "Sudakov's typical marginals, random linear functionals and a conditional central limit theorem," Probability Theory and Related Fields, vol. 107, no. 3, pp. 313–324, 1997.

M. Anttila, K. Ball, and I. Perissinaki, "The central limit problem for convex bodies," Transactions of the American Mathematical Society, vol. 355, no. 12, pp. 4723–4735, 2003.

S. G. Bobkov, "On concentration of distributions of random weighted sums," The Annals of Probability, vol. 31, no. 1, pp. 195–215, 2003.

A. Naor and D. Romik, "Projecting the surface measure of the sphere of ℓ_p^n," Annales de l'Institut Henri Poincaré (B) Probability and Statistics, vol. 39, no. 2, pp. 241–246, 2003.

B. Klartag, "A central limit theorem for convex sets," Inventiones mathematicae, vol. 168, no. 1, pp. 91–131, Apr. 2007.

E. Meckes, "Approximation of projections of random vectors," Journal of Theoretical Probability, vol. 25, no. 2, pp. 333–352, 2010.

24 27

References III

E. Meckes, "Projections of probability distributions: A measure-theoretic Dvoretzky theorem," in Geometric Aspects of Functional Analysis, ser. Lecture Notes in Mathematics. Springer, 2012, vol. 2050, pp. 317–326.

L. Dümbgen and P. D. Conte-Zerial, "On low-dimensional projections of high-dimensional distributions," in From Probability to Statistics and Back: High-Dimensional Models and Processes, A Festschrift in Honor of Jon A. Wellner. Institute of Mathematical Statistics Collections, 2013, vol. 9, pp. 91–104.

H. Leeb, "On the conditional distributions of low-dimensional projections from high-dimensional data," The Annals of Statistics, vol. 41, no. 2, pp. 464–483, 2013.

S. Verdú and S. Shamai, "Spectral efficiency of CDMA with random spreading," IEEE Trans. Inform. Theory, vol. 45, pp. 622–640, Mar. 1999.

D. N. C. Tse and S. Hanly, "Linear multiuser receivers: Effective interference, effective bandwidth and user capacity," IEEE Trans. Inform. Theory, vol. 45, pp. 641–657, Mar. 1999.

A. Montanari and D. Tse, "Analysis of belief propagation for non-linear problems: The example of CDMA (or: How to prove Tanaka's formula)," in Proc. IEEE Inform. Theory Workshop, Punta del Este, Uruguay, 2006, pp. 160–164.

25 27

References IV

S. B. Korada and N. Macris, "Tight bounds on the capacity of binary input random CDMA systems," IEEE Trans. Inform. Theory, vol. 56, no. 11, pp. 5590–5613, Nov. 2010.

M. Bayati and A. Montanari, "The dynamics of message passing on dense graphs, with applications to compressed sensing," IEEE Trans. Inform. Theory, vol. 57, no. 2, pp. 764–785, Feb. 2011.

G. Reeves and M. Gastpar, "The sampling rate-distortion tradeoff for sparsity pattern recovery in compressed sensing," IEEE Trans. Inform. Theory, vol. 58, no. 5, pp. 3065–3092, May 2012.

Y. Wu and S. Verdú, "Optimal phase transitions in compressed sensing," IEEE Trans. Inform. Theory, vol. 58, no. 10, pp. 6241–6263, Oct. 2012.

F. Krzakala, M. Mézard, F. Sausset, Y. F. Sun, and L. Zdeborová, "Statistical-physics-based reconstruction in compressed sensing," Physical Review X, vol. 2, no. 2, May 2012.

D. L. Donoho, A. Javanmard, and A. Montanari, "Information-theoretically optimal compressed sensing via spatial coupling and approximate message passing," IEEE Trans. Inform. Theory, vol. 59, no. 11, pp. 7434–7464, Jul. 2013.

W. Huleihel and N. Merhav, "Asymptotic MMSE analysis under sparse representation modeling," Signal Processing, vol. 131, pp. 320–332, 2017.

26 27

References V

G. Reeves and H. D. Pfister, "The replica-symmetric prediction for compressed sensing with Gaussian matrices is exact," Jul. 2016. [Online]. Available: https://arxiv.org/abs/1607.02524

G. Reeves and H. D. Pfister, "The replica-symmetric prediction for compressed sensing with Gaussian matrices is exact," in Proc. IEEE Int. Symp. Inform. Theory, Barcelona, Spain, Jul. 2016, pp. 665–669.

J. Barbier, M. Dia, N. Macris, and F. Krzakala, "The mutual information in random linear estimation," in Proc. Annual Allerton Conf. on Commun., Control, and Comp., Monticello, IL, 2016. [Online]. Available: http://arxiv.org/abs/1607.02335

27 27

A Montanari and D Tse ldquoAnalysis of belief propagation for non-linear problemsThe example of CDMA (or How to prove Tanakarsquos formula)rdquo in Proc IEEEInform Theory Workshop Punta del Este Uruguay 2006 pp 160ndash164

25 27

References IV

S B Korada and N Macris ldquoTight bounds on the capicty of binary inputrandom CDMA systemsrdquo IEEE Trans Inform Theory vol 56 no 11 pp5590ndash5613 Nov 2010

M Bayati and A Montanari ldquoThe dynamics of message passing on densegraphs with applications to compressed sensingrdquo IEEE Trans Inform Theoryvol 57 no 2 pp 764ndash785 Feb 2011

G Reeves and M Gastpar ldquoThe sampling rate-distortion tradeoff for sparsitypattern recovery in compressed sensingrdquo IEEE Trans Inform Theory vol 58no 5 pp 3065ndash3092 May 2012

Y Wu and S Verdu ldquoOptimal phase transitions in compressed sensingrdquo IEEETrans Inform Theory vol 58 no 10 pp 6241 ndash 6263 Oct 2012

F Krzakala M Mezard F Sausset Y F Sun and L ZdeborovaldquoStatistical-physics-based reconstruction in compressed sensingrdquo PhysicalReview X vol 2 no 2 May 2012

D L Donoho A Javanmard and A Montanari ldquoInformation-theoreticallyoptimal compressed sensing via spatial coupling and approximate messagepassingrdquo IEEE Trans Inform Theory vol 59 no 11 pp 7434ndash7464 Jul 2013

W Huleihel and N Merhav ldquoAsymptotic MMSE analysis under sparserepresentation modelingrdquo Signal Processing vol 131 pp 320ndash332 2017

26 27

References V

G Reeves and H D Pfister ldquoThe replica-symmetric prediction for compressedsensing with Gaussian matrices is exactrdquo Jul 2016 [Online] Availablehttpsarxivorgabs160702524

mdashmdash ldquoThe replica-symmetric prediction for compressed sensing with Gaussianmatrices is exactrdquo in Proc IEEE Int Symp Inform Theory Barcelona SpainJul 2016 pp 665 ndash 669

J Barbier M Dia N Macris and F Krzakala ldquoThe mutual information inrandom linear estimationrdquo in Proc Annual Allerton Conf on Commun Controland Comp Monticello IL 2016 [Online] Availablehttparxivorgabs160702335

27 27

  • Motivation
    • Statistics
    • Information Theory
    • Random linear estimation
      • Results
      • Proof outline
      • Conclusion
Page 18: Conditional Central Limit Theorems for Gaussian Projections

Table of Contents

Motivation
  Statistics
  Information Theory
  Random linear estimation

Results

Proof outline

Conclusion

18 27

Page 19: Conditional Central Limit Theorems for Gaussian Projections

Key steps in proof

1. Talagrand's transportation inequality:

      W_2^2(P_{Y|Θ}, G_Y) ≤ 2k(1+t) D(P_{Y|Θ} ‖ G_Y)

2. Decomposition in terms of CLT and mutual information:

      E_Θ[D(P_{Y|Θ} ‖ G_Y)] = D(P_Y ‖ G_Y) + I(Y; Θ)

3. Two-moment inequality for mutual information [R 2017]: for q < 1 < p and a suitable λ ∈ (0, 1) depending on (p, q),

      I(X; Y) ≤ C_λ √(ω(S)) V_p^λ(Y|X) V_q^{1-λ}(Y|X)

   where

      V_s(Y|X) = ∫ ‖y‖^s Var(p(y|X)) dy

19 27
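To make the chaining explicit (my reading of how the slides combine these steps, with Y = Z + √t N the smoothed projection from earlier): average step 1 over Θ and substitute step 2, so that only an unconditional CLT term and a mutual-information term remain. In LaTeX:

\mathbb{E}_\Theta\!\left[ W_2^2\bigl(P_{Y\mid\Theta},\, G_Y\bigr) \right]
    \le 2k(1+t)\, \mathbb{E}_\Theta\!\left[ D\bigl(P_{Y\mid\Theta} \,\big\|\, G_Y\bigr) \right]   % step 1 (Talagrand)
    =   2k(1+t)\, \Bigl( D(P_Y \,\|\, G_Y) + I(Y;\Theta) \Bigr)                                   % step 2 (decomposition)

The term D(P_Y ‖ G_Y) vanishes by the classical (unconditional) CLT for low-dimensional projections, I(Y; Θ) is bounded by the two-moment inequality of step 3, and letting the smoothing parameter t ↓ 0 transfers the bound from Y back to Z.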

Page 20: Conditional Central Limit Theorems for Gaussian Projections

Moments of variance

   V_s(Y|X) = ∫ ‖y‖^s Var(p(y|X)) dy

            = E[ (1/(S_a - R))^{k/2} ( (S_g^2 - R^2)/(S_a - R) )^{s/2} - (1/S_a)^{k/2} ( S_g^2 / S_a )^{s/2} ]

with

   S_a = t + ‖X_1‖^2/(2n) + ‖X_2‖^2/(2n)

   S_g = √( (t + ‖X_1‖^2/n)(t + ‖X_2‖^2/n) )

   R = (1/n) ⟨X_1, X_2⟩

where X_1 and X_2 are independent copies of X.

20 27
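Because the closed form is an expectation over two independent copies of X, it is easy to evaluate by Monte Carlo. A minimal sketch (not from the slides; sample_x is a hypothetical user-supplied sampler returning an (m, n) array, and the expression is the display above as parsed, with any constant prefactors omitted there omitted here too):

import numpy as np

def moments_of_variance(sample_x, s, k, t, m=100_000, seed=0):
    """Monte Carlo estimate of V_s(Y|X) from the two-copy representation."""
    rng = np.random.default_rng(seed)
    x1, x2 = sample_x(m, rng), sample_x(m, rng)        # independent copies of X
    n = x1.shape[1]
    n1 = np.sum(x1 ** 2, axis=1) / n                   # ||X1||^2 / n
    n2 = np.sum(x2 ** 2, axis=1) / n                   # ||X2||^2 / n
    r = np.sum(x1 * x2, axis=1) / n                    # <X1, X2> / n
    sa = t + 0.5 * (n1 + n2)                           # S_a  (>= |r| + t, so S_a - R > 0)
    sg2 = (t + n1) * (t + n2)                          # S_g^2  (>= R^2 by Cauchy-Schwarz)
    term1 = (sa - r) ** (-k / 2) * ((sg2 - r ** 2) / (sa - r)) ** (s / 2)
    term2 = sa ** (-k / 2) * (sg2 / sa) ** (s / 2)
    return float(np.mean(term1 - term2))

# Example: X uniform on {-1, +1}^n, so ||X_i||^2 / n = 1 exactly and only
# the correlation term R fluctuates (at scale 1/sqrt(n)).
v = moments_of_variance(lambda m, rng: rng.choice([-1.0, 1.0], size=(m, 128)),
                        s=2, k=4, t=1.0)
print(f"V_s(Y|X) estimate: {v:.3e}")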

Page 21: Conditional Central Limit Theorems for Gaussian Projections

Table of Contents

Motivation
  Statistics
  Information Theory
  Random linear estimation

Results

Proof outline

Conclusion

21 27

Page 22: Conditional Central Limit Theorems for Gaussian Projections

Conclusion

I Conditional CLT has many applications:
  I Projection pursuit / measure-theoretic Dvoretzky theorem
  I Random coding arguments in information theory
  I Phase transitions in compressed sensing [R & Pfister 2016]
  I Approximate inference methods based on message passing

I Main results are bounds on quadratic Wasserstein distance and relative entropy in terms of moments of the distribution

I Proof uses:
  I Talagrand's transportation inequality
  I Decomposition into CLT and mutual information
  I Two-moment inequality for mutual information [R 2017]

22 27

Page 23: Conditional Central Limit Theorems for Gaussian Projections

References I

G. Reeves, "Conditional central limit theorems for Gaussian projections," Dec. 2016. [Online]. Available: https://arxiv.org/abs/1612.09252

——, "Conditional central limit theorems for Gaussian projections," in Proc. IEEE Int. Symp. Inform. Theory, Aachen, Germany, Jun. 2017.

——, "Two-moment inequalities for Rényi entropy and mutual information," 2017. [Online]. Available: https://arxiv.org/abs/1702.07302

——, "Two-moment inequalities for Rényi entropy and mutual information," in Proc. IEEE Int. Symp. Inform. Theory, Aachen, Germany, Jun. 2017.

P. J. Huber, "Projection pursuit," The Annals of Statistics, vol. 13, no. 2, pp. 435–475, 1985.

P. Diaconis and D. Freedman, "Asymptotics of graphical projection pursuit," The Annals of Statistics, vol. 12, no. 3, pp. 793–815, 1984.

R. Vershynin, "Estimation in high dimensions: A geometric perspective," December 2, 2014. [Online]. Available: http://www-personal.umich.edu/~romanv/papers/estimation-tutorial.pdf

V. N. Sudakov, "Typical distributions of linear functionals in finite-dimensional spaces of high dimension," Soviet Math. Doklady, vol. 16, no. 6, pp. 1578–1582, 1978.

23 27

Page 24: Conditional Central Limit Theorems for Gaussian Projections

References II

P. Hall and K.-C. Li, "On almost linearity of low dimensional projections from high dimensional data," The Annals of Statistics, vol. 21, no. 2, pp. 867–889, 1993.

H. von Weizsäcker, "Sudakov's typical marginals, random linear functionals and a conditional central limit theorem," Probability Theory and Related Fields, vol. 107, no. 3, pp. 313–324, 1997.

M. Anttila, K. Ball, and I. Perissinaki, "The central limit problem for convex bodies," Transactions of the American Mathematical Society, vol. 355, no. 12, pp. 4723–4735, 2003.

S. G. Bobkov, "On concentration of distributions of random weighted sums," The Annals of Probability, vol. 31, no. 1, pp. 195–215, 2003.

A. Naor and D. Romik, "Projecting the surface measure of the sphere of ℓ_p^n," Annales de l'Institut Henri Poincaré (B) Probability and Statistics, vol. 39, no. 2, pp. 241–246, 2003.

B. Klartag, "A central limit theorem for convex sets," Inventiones mathematicae, vol. 168, no. 1, pp. 91–131, Apr. 2007.

E. Meckes, "Approximation of projections of random vectors," Journal of Theoretical Probability, vol. 25, no. 2, pp. 333–352, 2010.

24 27

Page 25: Conditional Central Limit Theorems for Gaussian Projections

References III

——, "Projections of probability distributions: A measure-theoretic Dvoretzky theorem," in Geometric Aspects of Functional Analysis, ser. Lecture Notes in Mathematics. Springer, 2012, vol. 2050, pp. 317–326.

L. Dümbgen and P. Del Conte-Zerial, "On low-dimensional projections of high-dimensional distributions," in From Probability to Statistics and Back: High-Dimensional Models and Processes – A Festschrift in Honor of Jon A. Wellner. Institute of Mathematical Statistics Collections, 2013, vol. 9, pp. 91–104.

H. Leeb, "On the conditional distributions of low-dimensional projections from high-dimensional data," The Annals of Statistics, vol. 41, no. 2, pp. 464–483, 2013.

S. Verdú and S. Shamai, "Spectral efficiency of CDMA with random spreading," IEEE Trans. Inform. Theory, vol. 45, pp. 622–640, Mar. 1999.

D. N. C. Tse and S. Hanly, "Linear multiuser receivers: Effective interference, effective bandwidth and user capacity," IEEE Trans. Inform. Theory, vol. 45, pp. 641–657, Mar. 1999.

A. Montanari and D. Tse, "Analysis of belief propagation for non-linear problems: The example of CDMA (or: How to prove Tanaka's formula)," in Proc. IEEE Inform. Theory Workshop, Punta del Este, Uruguay, 2006, pp. 160–164.

25 27

Page 26: Conditional Central Limit Theorems for Gaussian Projections

References IV

S. B. Korada and N. Macris, "Tight bounds on the capacity of binary input random CDMA systems," IEEE Trans. Inform. Theory, vol. 56, no. 11, pp. 5590–5613, Nov. 2010.

M. Bayati and A. Montanari, "The dynamics of message passing on dense graphs, with applications to compressed sensing," IEEE Trans. Inform. Theory, vol. 57, no. 2, pp. 764–785, Feb. 2011.

G. Reeves and M. Gastpar, "The sampling rate-distortion tradeoff for sparsity pattern recovery in compressed sensing," IEEE Trans. Inform. Theory, vol. 58, no. 5, pp. 3065–3092, May 2012.

Y. Wu and S. Verdú, "Optimal phase transitions in compressed sensing," IEEE Trans. Inform. Theory, vol. 58, no. 10, pp. 6241–6263, Oct. 2012.

F. Krzakala, M. Mézard, F. Sausset, Y. F. Sun, and L. Zdeborová, "Statistical-physics-based reconstruction in compressed sensing," Physical Review X, vol. 2, no. 2, May 2012.

D. L. Donoho, A. Javanmard, and A. Montanari, "Information-theoretically optimal compressed sensing via spatial coupling and approximate message passing," IEEE Trans. Inform. Theory, vol. 59, no. 11, pp. 7434–7464, Jul. 2013.

W. Huleihel and N. Merhav, "Asymptotic MMSE analysis under sparse representation modeling," Signal Processing, vol. 131, pp. 320–332, 2017.

26 27

Page 27: Conditional Central Limit Theorems for Gaussian Projections

References V

G. Reeves and H. D. Pfister, "The replica-symmetric prediction for compressed sensing with Gaussian matrices is exact," Jul. 2016. [Online]. Available: https://arxiv.org/abs/1607.02524

——, "The replica-symmetric prediction for compressed sensing with Gaussian matrices is exact," in Proc. IEEE Int. Symp. Inform. Theory, Barcelona, Spain, Jul. 2016, pp. 665–669.

J. Barbier, M. Dia, N. Macris, and F. Krzakala, "The mutual information in random linear estimation," in Proc. Annual Allerton Conf. on Commun., Control, and Comp., Monticello, IL, 2016. [Online]. Available: http://arxiv.org/abs/1607.02335

27 27
