
Stein’s Method Applied to Some Statistical Problems

Jay Bartroff

Borchard Colloquium 2017


Outline of this talk

1. Stein’s method
2. Bounds to the normal for group sequential statistics with covariates
   - Produce explicit distributional bounds to the limiting normal distribution for the repeatedly-computed MLE (θ̂_1, θ̂_2, ..., θ̂_K) of a parameter vector θ ∈ R^p in group sequential trials
3. Concentration inequalities for occupancy models with log-concave marginals
   - How to get bounded, size biased couplings for certain multivariate occupancy models, then use these to get concentration inequalities
   - Joint work with Larry Goldstein (USC) and Ümit Islak (Bogaziçi University, Istanbul)


Charles Stein (1920-2016)

Stein’s paradox: X is inadmissible for θ in N_p(θ, I) for p ≥ 3
- James-Stein shrinkage estimator
- Stein’s unbiased risk estimator for MSE
- Stein’s Lemma (1): covariance estimation
- Stein’s Lemma (2): sequential sample size tails
- Stein’s method for distributional approximation
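As a quick numerical illustration of the paradox (a minimal Monte Carlo sketch, not from the talk; the dimension, true mean, and replication count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
p, reps = 10, 20000                 # illustrative dimension and replication count
theta = np.ones(p)                  # arbitrary true mean

X = rng.normal(loc=theta, size=(reps, p))         # X ~ N_p(theta, I)
norm2 = (X ** 2).sum(axis=1)                      # ||X||^2, replicate by replicate
js = (1 - (p - 2) / norm2)[:, None] * X           # James-Stein shrinkage toward 0

risk_mle = ((X - theta) ** 2).sum(axis=1).mean()  # ~ p, the risk of the MLE X
risk_js = ((js - theta) ** 2).sum(axis=1).mean()  # strictly smaller for p >= 3
print(f"MLE risk ~ {risk_mle:.3f}, James-Stein risk ~ {risk_js:.3f}")
```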

Application 1: Group sequential analysis


But there are other books on this subject...


Setup

Response Y_i ∈ R of the i-th patient depends on
- known covariate vector x_i
- unknown parameter vector θ ∈ R^p

Primary goal: test a null hypothesis about θ, e.g.,

  H_0: θ = 0
  H′_0: θ_j ≤ 0
  H″_0: a^t θ = b, for some vector a and scalar b

Secondary goals: compute p-values or confidence regions for θ at the end of the study


Setup: Group sequential analysis

For efficiency, ethical, practical, and financial reasons, group sequential analysis has become the standard in clinical trials.

A group sequential trial with at most K groups:

  Group 1: Y_1, ..., Y_{n_1}
  Group 2: Y_{n_1+1}, ..., Y_{n_2}
  ...
  Group K: Y_{n_{K−1}+1}, ..., Y_{n_K}

Group sequential has been the dominant format for clinical trials since...

Beta-Blocker Heart Attack Trial (“BHAT”, JAMA 82)
- Randomized trial of propranolol for heart attack survivors
- 3837 patients randomized
- Started June 1978, planned as a ≤ 4-year study, terminated 8 months early due to the observed benefit of propranolol


Setup: Group sequential analysis

Stopping rule related to H_0:
- likelihood ratio, t-, F-, χ²-tests common
- Of the form: stop and reject H_0 at stage min{k ≤ K : T(Y_1, ..., Y_{n_k}) ≥ C_k} for some statistic T(Y_1, ..., Y_{n_k}), often a function of the MLE θ̂_k = θ̂_k(Y_1, ..., Y_{n_k})

The joint distribution of θ̂_1, θ̂_2, ..., θ̂_K is needed to
- choose the critical values C_k
- compute a p-value at the end of the study
- give a confidence region for θ at the end of the study


Background: Group sequential analysis — Jennison & Turnbull (JASA 97)

Asymptotic multivariate normal distribution of (θ̂_1, θ̂_2, ..., θ̂_K) in a regression setup with Y_i ~ind f_i(y_i; x_i, θ), f_i nice

- Asymptotics: n_k − n_{k−1} → ∞ for all k, with K fixed
- E_∞(θ̂_k) = θ
- “Independent increments”: Cov_∞(θ̂_{k_1}, θ̂_{k_2}) = Var_∞(θ̂_{k_2}) for any k_1 ≤ k_2

“Folk Theorem”: the normal limit was widely (over-)used (software packages, etc.) before the Jennison & Turnbull paper. Commonly heard: “Once n is 5 or so the normal limit kicks in!”


Background: Group sequential analysis — Jennison & Turnbull (JASA 97)

Independent increments: Cov_∞(θ̂_{k_1}, θ̂_{k_2}) = Var_∞(θ̂_{k_2}) for any k_1 ≤ k_2

Suppose

  H_0: a^t θ = 0,   T_k = (a^t θ̂_k) I_k,   I_k = [Var_∞(a^t θ̂_k)]^{−1}.

Then

  Cov_∞(T_{k_1}, T_{k_2}) = I_{k_1} I_{k_2} a^t Cov_∞(θ̂_{k_1}, θ̂_{k_2}) a
                          = I_{k_1} I_{k_2} a^t Var_∞(θ̂_{k_2}) a
                          = I_{k_1} I_{k_2} Var_∞(a^t θ̂_{k_2})
                          = I_{k_1} I_{k_2} I_{k_2}^{−1}
                          = I_{k_1} = Var_∞(T_{k_1})

  ⇒ Cov_∞(T_{k_1}, T_{k_2} − T_{k_1}) = 0
  ⇒ T_1, T_2 − T_1, ..., T_K − T_{K−1} are asymptotically independent normals
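The independent-increments calculation is easy to check by simulation in the simplest case, a scalar normal mean with θ̂_k the mean of the first n_k observations, so that T_k is the sum of the first n_k observations (a minimal sketch, not from the talk; the sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n = [50, 100, 200]                    # cumulative group sample sizes n_1 < n_2 < n_3
reps = 40000
Y = rng.normal(size=(reps, n[-1]))    # theta = 0, Var(Y_i) = 1

# Here a = 1 and I_k = n_k, so T_k = n_k * theta_hat_k = sum of first n_k obs.
T = np.stack([Y[:, :nk].sum(axis=1) for nk in n], axis=1)

print(np.cov(T[:, 0], T[:, 1])[0, 1])            # ~ Var(T_1) = n_1 = 50
print(np.cov(T[:, 0], T[:, 1] - T[:, 0])[0, 1])  # ~ 0: independent increments
```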


New Contributions

Extension to the group sequential setting of a Berry-Esseen bound for the multivariate normal limit for smooth functions

- Anastasiou & Reinert 17: Bounds with explicit constants in bounded Wasserstein distance for the scalar MLE (K = 1 analysis), Bernoulli.
- Anastasiou & Ley 17: Bounds for the asymptotic normality of the maximum likelihood estimator using the delta method, ALEA.
- Anastasiou 17: Bounds for the normal approximation of the maximum likelihood estimator from m-dependent random variables, Statistics & Probability Letters.
- Anastasiou & Gaunt 16+: Multivariate normal approximation of the maximum likelihood estimator via the delta method, arXiv:1609.03970.
- Anastasiou 15+: Assessing the multivariate normal approximation of the maximum likelihood estimator from high-dimensional, heterogeneous data, arXiv:1510.03679.


New Contributions, Cont’d

Relaxing the independence assumption: assume the log-likelihood of Y_k := (Y_{n_{k−1}+1}, ..., Y_{n_k}) is of the form

  Σ_{i∈G_k} log f_i(Y_i, θ) + g_k(Y_k, θ)

for well-behaved functions f_i, g_k.

- g_k = 0 gives Jennison & Turnbull’s independent setting
- Some generalized linear mixed models (GLMMs) with random stage effect U_k take this form
  - U_k = effect due to lab, monitoring board, cohort, etc.
- Penalized quasi-likelihood (Breslow & Clayton, JASA 93)


GLMM Example: Poisson regression

Letting f_μ denote the Po(μ) density:
- For Y_i in the k-th stage, Y_i | U_k ~ind f_{μ_i}, where μ_i = exp(β^t x_i + U_k)
- {U_k} ~iid h_λ
- θ = (β, λ)

Then the log-likelihood is

  log Π_{k=1}^K ∫ Π_{i∈G_k} f_{μ_i}(Y_i) h_λ(U_k) dU_k = Σ_{k=1}^K [ Σ_{i∈G_k} log f_{μ_i}(Y_i) + g_k(Y_k, θ) ],

where, in the last expression, μ_i = exp(β^t x_i).
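For concreteness, the k-th stage’s marginal likelihood ∫ Π_{i∈G_k} f_{μ_i}(Y_i) h_λ(U_k) dU_k can be evaluated numerically; below is a minimal sketch (not from the talk) assuming h_λ is the N(0, λ²) density and using Gauss-Hermite quadrature, with illustrative data and parameter values:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss  # probabilists' Hermite rule
from scipy.stats import poisson
from scipy.special import logsumexp

def stage_loglik(y, x, beta, lam, nodes=40):
    """Marginal log-likelihood of one stage: counts y, covariates x,
    integrating out U_k ~ N(0, lam^2) by quadrature."""
    u, w = hermegauss(nodes)              # int g(z) e^{-z^2/2} dz ~ sum w_j g(u_j)
    w = w / np.sqrt(2 * np.pi)            # weights for E g(Z), Z ~ N(0, 1)
    mu = np.exp((x @ beta)[:, None] + lam * u[None, :])   # mu_i at each node
    log_integrand = poisson.logpmf(y[:, None], mu).sum(axis=0)
    return logsumexp(log_integrand, b=w)  # log of the U_k integral

rng = np.random.default_rng(2)
x = rng.normal(size=(30, 2))
beta, lam = np.array([0.3, -0.2]), 0.5
y = rng.poisson(np.exp(x @ beta + lam * rng.normal()))
print(stage_loglik(y, x, beta, lam))
```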


Stein’s Method for MVN Approximation

- Generator approach: Barbour 90, Goetze 91
- Size biasing: Goldstein & Rinott 96, Rinott & Rotar 96
- Zero biasing: Goldstein & Reinert 05
- Exchangeable pair: Chatterjee & Meckes 08, Reinert & Röllin 09
- Stein couplings: Fang & Röllin 15

Theorem (Reinert & Röllin 09). If W, W′ ∈ R^q is an exchangeable pair with EW = 0, EWW^t = Σ positive definite, and E(W′ − W | W) = ΛW + R with Λ invertible, then for any 3-times differentiable h: R^q → R,

  |Eh(W) − Eh(Σ^{1/2} Z)| ≤ (a/4)|h|_2 + (b/12)|h|_3 + c(|h|_1 + (q/2)||Σ||^{1/2}|h|_2)

for certain explicit constants a, b, c.


Bounds to the normal for θ̂_K := (θ̂_1, θ̂_2, ..., θ̂_K)

Approach: apply the Reinert & Röllin 09 result with W = score function increments to get smooth function bounds to the normal.

Result. In the group sequential setup above, if the Y_i are independent or follow GLMMs with the log-likelihood of the k-th group data Y_k = (Y_{n_{k−1}+1}, ..., Y_{n_k}) of the form

  Σ_{i∈G_k} log f_i(Y_i, θ) + g_k(Y_k, θ),

then under regularity conditions on the f_i and g_k there are a, b, c, d such that

  |Eh(J^{−1/2}(θ̂_K − θ_K)) − Eh(Z)| ≤ (a/4) K² ||J^{−1/2}||² |h|_2 + (b/12) K³ ||J^{−1/2}||³ |h|_3
      + cK ||J^{−1/2}|| (|h|_1 + (pK²/2) ||Σ||^{1/2} ||J^{−1/2}|| |h|_2) + d,

where θ_K stacks K copies of the true parameter θ.


Comments on result

The bound:

  |Eh(J^{−1/2}(θ̂_K − θ_K)) − Eh(Z)| ≤ (a/4) K² ||J^{−1/2}||² |h|_2 + (b/12) K³ ||J^{−1/2}||³ |h|_3
      + cK ||J^{−1/2}|| (|h|_1 + (pK²/2) ||Σ||^{1/2} ||J^{−1/2}|| |h|_2) + d.

- The a, b, c terms come directly from the Reinert & Röllin 09 bound
- The c term is ∝ Var(R) in E(W′ − W | W) = ΛW + R and vanishes in the independent case
- The d term comes from Taylor series remainders
- Rate O(1/√n_K) under the usual asymptotic (n_k − n_{k−1})/n_K → γ_k ∈ (0, 1)


Sketch of Proof (Independent Case)

Score statistic:

  S_i(θ) = (∂/∂θ) log f_i(Y_i, θ) ∈ R^p,   W = (Σ_{i∈G_1} S_i(θ), ..., Σ_{i∈G_K} S_i(θ)) ∈ R^q,

where q = pK.

Fisher information:

  J_i(θ) = −E[(∂/∂θ) S_i(θ)^t] ∈ R^{p×p}

  J(θ_1, ..., θ_K) = diag(Σ_{i=1}^{n_1} J_i(θ_1), ..., Σ_{i=1}^{n_K} J_i(θ_K)) ∈ R^{q×q}

  Σ := Var(W) = diag(Σ_{i∈G_1} J_i(θ), ..., Σ_{i∈G_K} J_i(θ)) ∈ R^{q×q}


Sketch of Proof: Exchangeable pair (Independent Case)

1. Choose i* ∈ {1, ..., n_K} uniformly, independent of all else
2. Replace Y_{i*} by an independent copy Y′_{i*} (keeping x_{i*}); call the result W′

⇒ W, W′ exchangeable
⇒ W, W′ satisfy the linearity condition

  E(W′ − W | W) = −n_K^{−1} W,

which is easy to check entry-wise.
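The linearity condition can also be checked numerically: with W the vector of per-group score sums and the data held fixed, averaging W′ − W over many resamples of a single uniformly chosen observation recovers −W/n_K (a minimal sketch, not from the talk; the group sizes and score distribution are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n = [5, 12, 20]                                           # cumulative group sizes; n_K = 20
grp = np.searchsorted(n, np.arange(n[-1]), side="right")  # group index of each obs

S = rng.normal(size=n[-1])                                # mean-zero "scores", held fixed
W = np.array([S[grp == k].sum() for k in range(len(n))])

diffs, reps = np.zeros(len(n)), 200000
for _ in range(reps):
    i = rng.integers(n[-1])                    # i* uniform on {1, ..., n_K}
    diffs[grp[i]] += rng.normal() - S[i]       # change in the affected group sum
print(diffs / reps)                            # ~ E(W' - W | W)
print(-W / n[-1])                              # linearity condition: -W / n_K
```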


Sketch of Proof: Relating θ̂_K to W (Independent Case)

By a standard Taylor expansion,

  θ̂_K − θ_K = J(θ*_K)^{−1} S, where S = (Σ_{i=1}^{n_1} S_i(θ_1), ..., Σ_{i=1}^{n_K} S_i(θ_K)) ∈ R^q

and θ*_K ∈ R^q lies on the line segment connecting θ_K and θ̂_K. Then

  |Eh(J^{1/2}(θ̂_K − θ_K)) − Eh(Z)| ≤ |Eh(J^{−1/2}S) − Eh(Z)| + |Eh(J^{1/2}J(θ*_K)^{−1}S) − Eh(J^{−1/2}S)|


Sketch of Proof: Relating θ̂_K to W (Independent Case)

Using S = AW, where

  A = [ 1_p 0_p ··· 0_p ]
      [ 1_p 1_p ··· 0_p ]
      [  ⋮   ⋮   ⋱   ⋮  ]
      [ 1_p 1_p ··· 1_p ]

and 1_p, 0_p ∈ R^{p×p} are the identity and zero matrices, the 1st term is

  |Eh(J^{−1/2}S) − Eh(Z)| = |Eh̃(W) − Eh̃(Σ^{1/2}Z)|,

where h̃(w) = h(J^{−1/2}Aw); then apply Reinert-Röllin and simplify.

The 2nd term is bounded by Taylor series arguments.
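The block matrix A just turns per-group sums into cumulative sums; a one-line construction and sanity check (a sketch, not from the talk; p and K are illustrative):

```python
import numpy as np

p, K = 2, 3
A = np.kron(np.tril(np.ones((K, K))), np.eye(p))   # block row k sums blocks 1..k

rng = np.random.default_rng(4)
W = rng.normal(size=p * K)                         # stacked per-group score sums
S = A @ W                                          # stacked cumulative score sums

S_direct = np.concatenate([W.reshape(K, p)[:k + 1].sum(axis=0) for k in range(K)])
assert np.allclose(S, S_direct)                    # S = AW as claimed
```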


Sketch of Proof: Exchangeable pair (GLMM Case)

1. Choose i* ∈ {1, ..., n_K} uniformly, independent of Y_1, ..., Y_{n_K}
2. If i* is in the k-th group, replace Y_{i*} by an independent copy Y′_{i*} with mean φ(β^t x_{i*} + U_k), where φ^{−1} is the link function (same covariates x_{i*} and group effect U_k); call the result W′

⇒ W, W′ exchangeable
⇒ W, W′ satisfy the linearity condition

  E(W′ − W | W) = −n_K^{−1} W + R,

where R = R(g_1, ..., g_K).


Concentration and Coupling


Application 2: Concentration inequalities for occupancy models with log-concave marginals

Main idea: obtain bounded, size biased couplings for certain multivariate occupancy models, then use methods pioneered by Ghosh and Goldstein 11 to get concentration inequalities.

Concentration inequalities, e.g.,

  P(Y − μ ≥ t) ≤ exp(−t² / (2cμ + ct)),

are widely used in
- high dimensional statistics
- machine learning
- random matrix theory
- applications: wireless communications, physics, ...

(See Raginsky & Sason 15)


Setup

Occupancy model M = (M_α)

M_α may be
- the degree count of vertex α in an Erdös-Rényi random graph
- the # of balls in box α in a multinomial model
- the # of balls of color α in a sample from an urn of colored balls

We consider statistics like

  Y_ge = Σ_{α=1}^m 1{M_α ≥ d},   Y_eq = Σ_{α=1}^m 1{M_α = d}

and their weighted versions

  Y_ge = Σ_{α=1}^m w_α 1{M_α ≥ d_α},   Y_eq = Σ_{α=1}^m w_α 1{M_α = d_α}.
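In the multinomial case, for example, M and the statistics above are a few lines of code (a sketch with illustrative parameters, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, d = 100, 20, 2                       # balls, boxes, threshold (illustrative)
M = rng.multinomial(n, np.full(m, 1 / m))  # M_alpha = # balls in box alpha (uniform)
w = np.ones(m)                             # weights w_alpha

Y_ge = (w * (M >= d)).sum()                # weighted # of boxes with at least d balls
Y_eq = (w * (M == d)).sum()                # weighted # of boxes with exactly d balls
print(M, Y_ge, Y_eq)
```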


Some Methods for Concentration Inequalities

- McDiarmid’s bounded difference inequality
  - Y a function with bounded differences of independent inputs
- Negative association
  - e.g., Dubashi & Ranjan 98
- Certifiable functions
  - Y a certifiable function of independent inputs
  - Controlling a large enough subset of inputs “certifies” the value of the function
  - e.g., McDiarmid & Reed 06
- Bounded size bias couplings


Bounded Size Bias Couplings

If there is a coupling Y^s of Y with the Y-size bias distribution, i.e.,

  E[Y f(Y)] = μ E[f(Y^s)] for all f,

and Y^s ≤ Y + c for some c > 0 with probability one, then

  max{P(Y − μ ≥ t), P(Y − μ ≤ −t)} ≤ b_{μ,c}(t).

Ghosh & Goldstein 11: for all t > 0,

  P(Y − μ ≤ −t) ≤ exp(−t² / (2cμ))
  P(Y − μ ≥ t) ≤ exp(−t² / (2cμ + ct)),

a bound b that is exponential as t → ∞.

Arratia & Baxendale 13:

  b_{μ,c}(t) = exp(−(μ/c) h(t/μ)), where h(x) = (1 + x) log(1 + x) − x,

a bound b that is Poisson-like as t → ∞.
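Evaluating the two right-tail bounds side by side (a sketch, not from the talk, using the reconstruction of b_{μ,c} above; μ, c, t are illustrative):

```python
import numpy as np

def gg_right(mu, c, t):
    """Ghosh & Goldstein 11 right tail: exp(-t^2 / (2c*mu + c*t))."""
    return np.exp(-t**2 / (2 * c * mu + c * t))

def ab_right(mu, c, t):
    """Arratia & Baxendale 13: exp(-(mu/c) h(t/mu)), h(x) = (1+x)log(1+x) - x."""
    x = t / mu
    return np.exp(-(mu / c) * ((1 + x) * np.log1p(x) - x))

mu, c = 50.0, 3.0
for t in [5.0, 20.0, 100.0, 300.0]:
    print(t, gg_right(mu, c, t), ab_right(mu, c, t))  # AB decays faster for large t
```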


Main Result

M = (M_α)_{α∈[m]}, each M_α lattice log-concave,

  Y_ge = Σ_{α∈[m]} w_α 1{M_α ≥ d_α},   Y_ne = Σ_{α∈[m]} w_α 1{M_α ≠ d_α}.

Main Result (in words)
1. If M is bounded from below and can be closely coupled to a version M′ having the same distribution conditional on M′_α = M_α + 1, then there is a bounded size biased coupling Y^s_ge ≤ Y_ge + C and the above concentration inequalities hold.
2. If M is non-degenerate at (d_α) and can be closely coupled to a version M′ having the same distribution conditional on M′_α ≠ d_α, then there is a bounded size biased coupling Y^s_ne ≤ Y_ne + C′ and the above concentration inequalities hold.


Main Result: a few more details on Part 1

M = f(U), where
- U is some collection of random variables
- f is measurable

Closely coupled means: given U_k ~ L(V_k) := L(U | M_α ≥ k), there is a coupling U^+_k and a constant B such that

  L(U^+_k | U_k) = L(V_k | M^+_{k,α} = M_{k,α} + 1) and Y^+_{k,ge,≠α} ≤ Y_{k,ge,≠α} + B,

where Y_{k,ge,≠α} = Σ_{β≠α} 1(M_{k,β} ≥ d_β).

The constant is

  C = |w|(B|d| + 1),

where |w| = max w_α and |d| = max d_α.

Part 2 is similar.


Main Result: main ingredient in proof

Incrementing Lemma. If M is lattice log-concave then there is π(x, d) ∈ [0, 1] such that if

  M′ ~ L(M | M ≥ d) and B | M′ ~ Bern(π(M′, d)),

then

  M′ + B ~ L(M | M ≥ d + 1).

- Extension of Goldstein & Penrose 10 for M binomial, d = 0
- Analogous versions hold for

  L(M | M ≤ d) ↪ L(M | M ≤ d − 1)
  L(M) ↪ L(M | M ≠ d),

where ↪ means “coupled to”.


Example 1: Erdös-Rényi random graph

- m vertices
- Independent edges with probability p_{α,β} = p_{β,α} ∈ [0, 1)

Constructing U^+_k from U_k:
1. Select a non-neighbor β of α with probability ∝ p_{α,β}/(1 − p_{α,β})
2. Add an edge connecting β to α

This affects at most 1 other vertex, so B = 1 and

  Y^s_ge ≤ Y_ge + |w|(|d| + 1).
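A minimal simulation sketch of this construction (not from the talk; m, the edge probability, and d are illustrative, and with a constant edge probability the selection weights p/(1 − p) are uniform):

```python
import numpy as np

rng = np.random.default_rng(6)
m, p, d = 50, 0.1, 3                        # vertices, edge probability, threshold

A = np.triu(rng.random((m, m)) < p, 1)      # independent edges above the diagonal
A = A | A.T                                 # symmetric adjacency, no self-loops
Y_ge = (A.sum(axis=1) >= d).sum()           # # of vertices with degree >= d

alpha = 0                                   # increment step at vertex alpha:
non_nbrs = np.flatnonzero(~A[alpha])
non_nbrs = non_nbrs[non_nbrs != alpha]
if non_nbrs.size:                           # weights ∝ p/(1-p) are all equal here
    beta = rng.choice(non_nbrs)
    A[alpha, beta] = A[beta, alpha] = True  # add edge; only beta's degree also rises
print(Y_ge, (A.sum(axis=1) >= d).sum())
```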


Example 1: Erdös-Rényi random graph

Applying this with d_α = d, w_α = 1,

  P(Y_ge − μ ≤ −t) ≤ exp(−t² / (2(d + 1)μ)) ≤ exp(−t² / (2(d + 1)m)).

Compare with McDiarmid’s bounded difference inequality: writing

  Y_ge = f(X_1, ..., X_{(m choose 2)}), X_i = 1{edge between vertex pair i},

we have

  sup_{X_i, X′_i} |f(X_1, ..., X_i, ..., X_{(m choose 2)}) − f(X_1, ..., X′_i, ..., X_{(m choose 2)})| ≤ 2,

so

  P(Y_ge − μ ≤ −t) ≤ exp(−t² / (4m(m − 1))).

The new bound is an improvement for m > 2d + 3.
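Plugging in numbers makes the comparison concrete (a sketch, not from the talk; m, d, t are illustrative, with m > 2d + 3):

```python
import numpy as np

def new_left_tail(t, d, m):                 # exp(-t^2 / (2(d+1)m))
    return np.exp(-t**2 / (2 * (d + 1) * m))

def mcdiarmid_left_tail(t, m):              # exp(-t^2 / (4m(m-1)))
    return np.exp(-t**2 / (4 * m * (m - 1)))

m, d, t = 100, 3, 20.0
print(new_left_tail(t, d, m))               # ~ 0.607: the stronger bound here
print(mcdiarmid_left_tail(t, m))            # ~ 0.990
```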


Example 2: Multinomial Counts

- n balls dropped independently into m boxes
- Applications in species trapping, linguistics, ...
- # of empty boxes proved asymptotically normal by Weiss 58 and Rényi 62 in the uniform case
- Englund 81: L∞ bound for the # of empty cells, uniform case
- Dubashi & Ranjan 98: concentration inequality via NA
- Penrose 09: L∞ bound for the # of isolated balls, uniform and nonuniform cases
- Bartroff & Goldstein 13: L∞ bound for all d ≥ 2, uniform case


Example 2: Multinomial Counts

  p_{α,j} = prob. that ball j ∈ [n] falls in box α ∈ [m]

  M_α = # of balls in box α = Σ_{j∈[n]} 1{ball j falls in box α}

Constructing U^+_k from U_k: choose a ball j not in box α with probability ∝ p_{α,j}/(1 − p_{α,j}) and add it to box α.

Y^s_{ge,≠α} ≤ Y_{ge,≠α}, so B = 0 and thus Y^s_ge ≤ Y_ge + |w|.
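A sketch of this increment step (not from the talk; the ball probabilities are illustrative): pick a ball outside box α with probability ∝ p_{α,j}/(1 − p_{α,j}) and move it into box α, so no box other than α ever gains a ball:

```python
import numpy as np

rng = np.random.default_rng(7)
n, m, alpha = 100, 20, 0
p = rng.dirichlet(np.ones(m), size=n)              # p[j] = box probs of ball j
box = np.array([rng.choice(m, p=pj) for pj in p])  # realized box of each ball

outside = np.flatnonzero(box != alpha)             # balls not currently in box alpha
wgt = p[outside, alpha] / (1 - p[outside, alpha])
j = rng.choice(outside, p=wgt / wgt.sum())         # choose with prob ∝ p/(1-p)
box[j] = alpha                                     # move it into box alpha

M = np.bincount(box, minlength=m)                  # M_alpha incremented; others only lose
print(M[alpha])
```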


Example 3: Multivariate Hypergeometric Sampling

- Urn with n = Σ_{α∈[m]} n_α colored balls, n_α balls of color α
- Sample of size s drawn without replacement
- M_α = # of balls of color α in the sample
- Applications in sampling (and subsampling) theory, gambling, coupon-collector problems

Constructing U^+_k from U_k: select a non-α-colored ball in the sample with probability

  ∝ (n_{α(j)}/n) / (1 − n_{α(j)}/n), where α(j) = color of ball j,

and replace it with an α-colored ball.

Y^s_{ge,≠α} ≤ Y_{ge,≠α}, so B = 0 and thus Y^s_ge ≤ Y_ge + |w|.


Summary

Stein’s method applied to produce
- explicit bounds to the limiting normal distribution for the repeatedly-computed MLE (θ̂_1, θ̂_2, ..., θ̂_K) of a parameter vector in group sequential trials
- concentration inequalities for a class of occupancy models with log-concave marginals

Many unanswered questions in statistics are possibly susceptible to Stein’s method:
- concentration inequalities for heavy-tailed distributions
- convergence of empirical measures and dimension reduction methods
  - projections of empirical measures onto subspaces
  - high-dimensional PCA
- other problems in sequential analysis
  - sequentially stopped test statistics
  - stopping rules for high-dimensional MCMC

Thank You!


Back Up Slides


Background: McDiarmid’s Inequality

If
- X_1, ..., X_n are independent,
- Y = f(X_1, ..., X_n) with f measurable, and
- there are c_i such that

  sup_{x_i, x′_i} |f(x_1, ..., x_i, ..., x_n) − f(x_1, ..., x′_i, ..., x_n)| ≤ c_i,

then

  P(Y − μ ≥ t) ≤ exp(−t² / (2 Σ_{i=1}^n c_i²)) for all t > 0,

and a similar left tail bound holds.
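As a quick reference (a sketch, not from the talk, using the convention of this slide):

```python
import numpy as np

def mcdiarmid_right_tail(t, c):
    """P(Y - mu >= t) <= exp(-t^2 / (2 * sum_i c_i^2)), as stated above."""
    c = np.asarray(c, dtype=float)
    return np.exp(-t**2 / (2 * np.sum(c**2)))

print(mcdiarmid_right_tail(5.0, np.ones(100)))  # 100 inputs, each with c_i = 1
```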

Comparison 2: Negative Association

X_1, X_2, ..., X_m are NA if

  E(f(X_i; i ∈ A_1) g(X_j; j ∈ A_2)) ≤ E(f(X_i; i ∈ A_1)) E(g(X_j; j ∈ A_2))

for any
- disjoint A_1, A_2 ⊂ [m],
- coordinate-wise nondecreasing f, g.

Dubashi & Ranjan 98: If X_1, X_2, ..., X_m are NA indicators, then Y = Σ_{i=1}^m X_i satisfies

  P(Y − μ ≥ t) ≤ (μ/(μ + t))^{t+μ} e^t for all t > 0,

which is O(exp(−t log t)) as t → ∞.

Comparison 2: Negative Association

Both NA and our method yield bounds of the same order for Y_ge in
- multinomial counts
- multivariate hypergeometric sampling

but NA cannot be applied to:
- Y_ne in multinomial counts
- Y_ne in multivariate hypergeometric sampling
- Y_ge or Y_ne in the Erdös-Rényi random graph
- Y_ge or Y_ne in germ-grain models


Comparison 3: Certifiable Functions

McDiarmid & Reed 06: If X_1, X_2, ..., X_n are independent and Y = f(X_1, X_2, ..., X_n) where f is certifiable:
- there is c such that changing any coordinate x_j changes the value of f(x) by at most c,
- if f(x) = s then there is C ⊂ [n] with |C| ≤ as + b such that y_i = x_i for all i ∈ C implies f(y) ≥ s,

then for all t > 0,

  P(Y − μ ≤ −t) ≤ exp(−t² / (2c²(aμ + b + t/3c))) = O(exp(−t)) as t → ∞.

A similar right tail bound holds.

Comparison 3: Certifiable Functions

- Asymptotically O(e^{−t}).
- Best possible rate via log Sobolev inequalities(?)

Multinomial occupancy: we showed C = |w|, so if w_α = 1,

  P(Y_ge − μ_ge ≤ −t) ≤ exp(−t² / (2μ_ge)).

Similar bounds hold for the right tail and for Y_ne.


Another Application: Germ-Grain Models

- Used in forestry, wireless sensor networks, material science, ...
- Germs U_α ~ f_α, strictly positive on [0, r)^p
- Grains B_α = closed ball of radius ρ_α centered at U_α
- d: [0, r)^p → {0, 1, ..., m} gives the # of intersections of interest at x ∈ [0, r)^p
- The choice of r relative to p, ρ_α guarantees a nontrivial distribution of

  M(x) = # of grains containing the point x ∈ [0, r)^p = Σ_{α∈[m]} 1{x ∈ B_α}

  Y_ge = ∫_{[0,r)^p} w(x) 1{M(x) ≥ d(x)} dx = the (weighted) volume of d-way intersections of grains
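A rough grid-based sketch of M(x) and Y_ge in dimension p = 2 (not from the talk; the window, radius, grid resolution, and threshold are illustrative, with w ≡ 1 and constant d):

```python
import numpy as np

rng = np.random.default_rng(8)
r, m, rho, d, res = 1.0, 30, 0.1, 2, 200   # window, germs, radius, threshold, grid

U = rng.random((m, 2)) * r                 # germs uniform on [0, r)^2
g = np.linspace(0, r, res, endpoint=False)
xx, yy = np.meshgrid(g, g)
pts = np.stack([xx.ravel(), yy.ravel()], axis=1)

dist2 = ((pts[:, None, :] - U[None, :, :]) ** 2).sum(axis=2)
M = (dist2 <= rho**2).sum(axis=1)          # M(x) = # grains containing x

Y_ge = (M >= d).sum() * (r / res) ** 2     # ~ volume where M(x) >= d
print(Y_ge)
```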


Another Application: Germ-Grain Models — main ideas in proof

A different approach:
1. Generate U_0 independent of U_1, ..., U_m
2. Compute U_0, ..., U_{d(U_0)} and set Y^s_ge = Y_ge(M_{d(U_0)})
3. Y^s_ge has the size bias distribution by the Conditional Lemma with A = {M(U_0) ≥ d(U_0)}:

Conditional Lemma (Goldstein & Penrose 10). If P(A) ∈ (0, 1) and Y = P(A|F), then Y^s has the Y-size bias distribution if L(Y^s) = L(Y|A).


Another Application: Germ-Grain Models — main ideas in proof

Argument: generate U_0 ~ w(x)/∫w. Given U_k ~ L(U_0 | M(U_0) ≥ k), with probability π(M_k(U_0), k) choose germ β, from among the germs whose grains do not contain U_0, with probability

  ∝ p_β(U_0) / (1 − p_β(U_0)), where p_β(x) = P(x ∈ B_β),

and replace it with U′_β ~ P_{U_0} to get U_{k+1}, where

  P_{U_0}(V) = P(U_β ∈ V | D(U_β, U_0) ≤ ρ_β).

Otherwise U_{k+1} = U_k.

- The volume increase from replacing U_β by U′_β is at most ν_p|ρ|^p (ν_p = volume of the unit ball)
- The volume increase between U_0 and U_{d(U_0)} is at most ν_p|ρ|^p|d|
- Y^s_ge increases Y_ge by at most ν_p|ρ|^p|d||w|
