Locally Private Mean Estimation: Z-test and Tight Confidence Intervals

Marco Gaboardi, Ryan Rogers, Or Sheffet

Mean estimation (ME)

• Setting: We have $n$ samples drawn from a Gaussian,

$$X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} \mathcal{N}(\mu, \sigma^2),$$

such that $\mu \in [-R, R]$ for some known bound $R$, and $\sigma$ is either provided as an input (known variance case) or left unspecified (unknown variance case).

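• Goal: Determine an estimate of $\mu$ useful for a Z-test and for releasing confidence intervals:

$$\Pr_{X \overset{\text{i.i.d.}}{\sim} \mathcal{N}(\mu, \sigma^2),\, \mathcal{M}(X)}\left[\mu \in \mathcal{M}(X)\right] \ge 1 - \beta$$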
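As a point of comparison for the goal above, here is a minimal non-private sketch of the textbook two-sided interval for known $\sigma$, namely $\bar{X} \pm z_{1-\beta/2}\,\sigma/\sqrt{n}$. It is not part of the poster; the function name `z_confidence_interval` and the parameter values are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, fmean

def z_confidence_interval(samples, sigma, beta):
    """Two-sided (1 - beta) confidence interval for the mean, known sigma.

    Non-private baseline (an assumption for illustration, not the poster's
    algorithm): sample mean +/- z_{1 - beta/2} * sigma / sqrt(n).
    """
    n = len(samples)
    z = NormalDist().inv_cdf(1 - beta / 2)  # standard normal quantile
    half_width = z * sigma / math.sqrt(n)
    center = fmean(samples)
    return center - half_width, center + half_width

rng = random.Random(0)
mu, sigma, n, beta = 1.0, 2.0, 10_000, 0.05
data = [rng.gauss(mu, sigma) for _ in range(n)]
lo, hi = z_confidence_interval(data, sigma, beta)
print(lo, hi)
```

The private algorithms on this poster aim for the same coverage guarantee while paying for privacy with a wider interval.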

The Need for Privacy

• Data may contain sensitive information.

• Releasing the result may leak information.

Modified Goal: Determine an estimate of $\mu$ that preserves the privacy of those in the study and is useful for a Z-test and for releasing confidence intervals.

Local Differential Privacy [4] (LDP)

• Central Model: Data is submitted in the clear to a trusted curator, and the output of a statistic on the data is privatized.

• Local Model: No trusted curator; data is privatized and then collected.

• An algorithm $\mathcal{M} : \mathcal{X} \to \mathcal{O}$ is $\varepsilon$-differentially private if for all inputs $x, x'$ and outcome sets $S \subseteq \mathcal{O}$:

$$\Pr[\mathcal{M}(x) \in S] \le e^{\varepsilon} \Pr[\mathcal{M}(x') \in S].$$

• The local model of differential privacy is used in practice.
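To make the definition concrete, the following sketch computes the smallest $\varepsilon$ for which a mechanism with finite input and output spaces satisfies the inequality. The table representation `mechanism[x][o]` and the name `max_privacy_loss` are assumptions made for illustration. The example mechanism keeps an input bit with probability $e^{\varepsilon}/(1+e^{\varepsilon})$ and flips it otherwise.

```python
import math

def max_privacy_loss(mechanism):
    """Smallest eps such that a finite mechanism is eps-DP.

    mechanism[x][o] is P[M(x) = o]. For a finite output space it suffices
    to check single outcomes, because any set S is a disjoint union of them.
    """
    inputs = list(mechanism)
    outputs = list(mechanism[inputs[0]])
    worst = 0.0
    for x in inputs:
        for x2 in inputs:
            for o in outputs:
                p, q = mechanism[x][o], mechanism[x2][o]
                if p > 0 and q == 0:
                    return math.inf  # no finite eps can satisfy the bound
                if p > 0:
                    worst = max(worst, math.log(p / q))
    return worst

eps = 1.0
keep = math.exp(eps) / (1 + math.exp(eps))
binary_rr = {0: {0: keep, 1: 1 - keep}, 1: {0: 1 - keep, 1: keep}}
print(max_privacy_loss(binary_rr))
```

A mechanism that outputs its input unchanged yields an infinite privacy loss under this check, which matches the intuition that it offers no privacy.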

LDP randomizers properties

We use the following mechanisms:

• Gaussian Noise [2]: Suppose each datum is sampled from an interval $I$ of length $\ell$. Then we add independent noise

$$\mathcal{N}\left(0,\; 2\ell^2 \ln(2/\delta)/\varepsilon^2\right)$$

to each datum, guaranteeing $(\varepsilon, \delta)$-differential privacy.

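• Randomized Response [5]: Suppose each datum is a bit in $\{0, 1\}$ and we operate on each datum independently, applying $RR_{\varepsilon} : \{0, 1\} \to \{0, 1\}$ where

$$RR_{\varepsilon}(b) = \begin{cases} b & \text{w.p. } \dfrac{e^{\varepsilon}}{1 + e^{\varepsilon}} \\ 1 - b & \text{else} \end{cases}$$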

• Bit Flipping algorithm [1]: Suppose each datum $x_i$ is a $d$-dimensional vector indicating its type using a standard basis vector. The Bit Flipping mechanism runs $d$ independent randomized response mechanisms, one for each coordinate separately, with parameter $\varepsilon/2$:

$$BF(x_i^1, \dots, x_i^d) = \left(RR_{\varepsilon/2}(x_i^1), \dots, RR_{\varepsilon/2}(x_i^d)\right)$$
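The three randomizers listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and `debiased_counts` is an added helper (not described on the poster) that inverts the known flip probability to recover unbiased histogram counts.

```python
import math
import random

def gaussian_noise(x, length, eps, delta, rng):
    """Add N(0, 2 l^2 ln(2/delta) / eps^2) noise to a datum from an interval of length l."""
    std = math.sqrt(2 * length**2 * math.log(2 / delta)) / eps
    return x + rng.gauss(0.0, std)

def randomized_response(bit, eps, rng):
    """Keep the bit w.p. e^eps / (1 + e^eps), flip it otherwise."""
    keep = math.exp(eps) / (1 + math.exp(eps))
    return bit if rng.random() < keep else 1 - bit

def bit_flipping(one_hot, eps, rng):
    """Run randomized response with parameter eps/2 on each coordinate."""
    return [randomized_response(b, eps / 2, rng) for b in one_hot]

def debiased_counts(reports, eps):
    """Unbiased per-coordinate counts from bit-flipping reports (assumed helper)."""
    n = len(reports)
    d = len(reports[0])
    keep = math.exp(eps / 2) / (1 + math.exp(eps / 2))
    sums = [sum(r[k] for r in reports) for k in range(d)]
    # E[sum_k] = keep * c_k + (1 - keep) * (n - c_k); solve for c_k.
    return [(s - (1 - keep) * n) / (2 * keep - 1) for s in sums]

rng = random.Random(0)
types = [rng.randrange(3) for _ in range(5000)]
reports = [bit_flipping([int(k == t) for k in range(3)], 1.0, rng) for t in types]
print(debiased_counts(reports, 1.0))
```

Each user runs `bit_flipping` locally, so the collector only ever sees the randomized vectors.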

LDP ME - Known Variance

Our approach is inspired by the work of Karwa and Vadhan [3]. We adapt it to the local model.

KnownVar(X; σ, β, ε, n, R) (sketch)

1. Find a bin of length $\sigma$ most likely to hold $\mu$.

2. Construct an interval of length $4\sigma + 2\sigma\sqrt{2\log(8n/\beta)}$ centered at this bin.

3. Project all remaining points onto this interval and add independent Gaussian noise.
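Steps 2 and 3 of the sketch can be illustrated as follows. This is a rough sketch under several assumptions, not the authors' code: step 1 is not implemented and is replaced by a hypothetical `bin_center` argument, `log` is taken to be the natural logarithm, the noise is calibrated with the Gaussian Noise formula using the interval length as $\ell$, and the final confidence interval construction is omitted.

```python
import math
import random
from statistics import fmean

def known_var_estimate(xs, bin_center, sigma, beta, eps, delta, rng):
    """Noisy mean estimate from steps 2-3 of the KnownVar sketch.

    bin_center stands in for the output of step 1 (assumed given).
    xs are the points of the remaining users.
    """
    n = len(xs)
    # Step 2: interval of length 4*sigma + 2*sigma*sqrt(2 log(8n/beta)).
    length = 4 * sigma + 2 * sigma * math.sqrt(2 * math.log(8 * n / beta))
    lo, hi = bin_center - length / 2, bin_center + length / 2
    # Step 3: each user projects onto [lo, hi] and adds Gaussian noise locally.
    noise_std = math.sqrt(2 * length**2 * math.log(2 / delta)) / eps
    reports = [min(max(x, lo), hi) + rng.gauss(0.0, noise_std) for x in xs]
    return fmean(reports), (lo, hi)

rng = random.Random(0)
xs = [rng.gauss(0.3, 1.0) for _ in range(50_000)]
estimate, interval = known_var_estimate(xs, 0.0, 1.0, 0.05, 1.0, 1e-6, rng)
print(estimate, interval)
```

Projecting onto a bounded interval is what makes the Gaussian noise scale finite: without it, a single datum could be arbitrarily far from the rest.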

KnownVar properties

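• Privacy: KnownVar is $(\varepsilon, \delta)$-LDP.

• Confidence Interval: If

$$n \ge 1600 \left(\frac{e^{\varepsilon/2} + 1}{e^{\varepsilon/2} - 1}\right)^2 \log(\ldots)$$

then KnownVar returns an interval $I$ such that $\Pr_{X,\,\text{KnownVar}}[\mu \in I] \ge 1 - \beta$, whose size is

$$|I| = O\left(\sigma \cdot \sqrt{\log(n/\beta) \cdot \log(1/\beta) \cdot \log(1/\delta)}\,\ldots\right)$$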

Locally Private Z-test

• For any interval $I$ on the reals we can associate a likelihood $p_I \overset{\text{def}}{=} \Pr_{X \sim P}[X \in I]$, and we know that with probability $p_I \pm \beta$ it indeed holds that $\mu \in I$.

• This mimics the power of a Z-test. In particular, we can now compare two intervals as to which one is more likely to hold $\mu$, compare populations, etc.

• Below are results showing the empirical p-values and power averaged over 100 trials for various privacy parameters.

[Figure: p-values for the Z-test. Panels plot the p-value against alternative hypotheses for the mean (axis ticks 0.1 to 0.6); legend: epsilon = 0.5, 0.8, 1, 1.5.]

Lower Bounds

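• Main Lemma: Let $\mathcal{M}$ be a one-shot (each individual is presented with a single query) local $\varepsilon$-differentially private mechanism. Let $P$ and $Q$ be two distributions, with $\Delta \overset{\text{def}}{=} d_{TV}(P, Q)$. Fix any $0 < \delta < e^{-1}$ and set

$$\varepsilon^* = 8\varepsilon\Delta\surd\ldots\ln(2/\delta) + 16\varepsilon\Delta\surd\ldots$$

Then, for any set $S$ of outputs,

$$\Pr_{X \overset{\text{i.i.d.}}{\sim} P;\, \mathcal{M}}[\mathcal{M}(X) \in S] \le e^{\varepsilon^*} \Pr_{X \overset{\text{i.i.d.}}{\sim} Q;\, \mathcal{M}}[\mathcal{M}(X) \in S] + \delta$$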

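• Lower bound: Any one-shot locally differentially private algorithm must return an interval of length

$$\Omega\left(\sigma\sqrt{\log(1/\beta)}\,\ldots\right)$$

• Lower bound: Let $\mathcal{M}$ be an $\varepsilon$-LDP mechanism which is $(\alpha_{dist}, \alpha_{quant}, \beta)$-useful for the $p$-quantile problem over $P$, given that the true $p$-quantile lies in the interval $[-R, R]$. Then, for any $\beta < \frac{1}{6}$ it must hold that

$$n \ge \Omega\left(\frac{1}{\alpha_{quant}^2 \varepsilon^2} \cdot \ln\left(\frac{R}{\alpha_{dist}\,\beta}\right)\right).$$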

LDP ME - Unknown Variance

Our approach mimics Algorithm KnownVar, but without knowledge of the variance.

• Goal: Find a suitably large yet sufficiently tight interval $[s_1, s_2]$.

• Problem: This cannot be done using the off-the-shelf Bit Flipping mechanism, as that requires that we know the granularity of each bin in advance.

• Solution: We abandon the idea of finding a histogram on the data. Instead, we propose finding a good approximation for $\sigma$ using quantile estimation based on a binary search, with the following algorithm.

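Algorithm BinQuant
Require: Data $x_1, \dots, x_N$; target quantile $p^*$; $\varepsilon$; $[Q_{\min}, Q_{\max}]$; $\lambda$; $T$.
Initialize $n = N/T$, $s_1 = Q_{\min}$, $s_2 = Q_{\max}$.
for $j = 1, \dots, T$ do
    Select users $U^{(j)} = \{(j-1) \cdot n + 1, (j-1) \cdot n + 2, \dots, j \cdot n\}$
    Set $t^{(j)} \leftarrow \frac{s_1 + s_2}{2}$
    Denote $\phi^{(j)}(x) = \mathbb{1}\{x < t^{(j)}\}$.
    Run randomized response on $U^{(j)}$ and obtain $Z^{(j)} = \frac{1}{n}\theta_{RR}(n, \phi^{(j)})$.
    if $Z^{(j)} > p^* + \frac{\lambda}{2}$ then
        $s_2 \leftarrow t^{(j)}$
    else if $Z^{(j)} < p^* - \frac{\lambda}{2}$ then
        $s_1 \leftarrow t^{(j)}$
    else
        break
Ensure: $t^{(j)}$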
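The binary search in BinQuant can be sketched in Python as below. This is an illustrative sketch, not the authors' code. It assumes that $\theta_{RR}$ denotes the standard debiased count from randomized response (implemented here as the hypothetical helper `rr_fraction`), and it uses 0-based user indices.

```python
import math
import random

def rr_fraction(bits, eps, rng):
    """Debiased fraction of ones after each user applies randomized response."""
    keep = math.exp(eps) / (1 + math.exp(eps))
    noisy = [b if rng.random() < keep else 1 - b for b in bits]
    # E[noisy mean] = (2*keep - 1) * true_fraction + (1 - keep); invert it.
    return (sum(noisy) / len(noisy) - (1 - keep)) / (2 * keep - 1)

def bin_quant(xs, p_star, eps, q_min, q_max, lam, T, rng):
    """Binary search for the p_star-quantile using T disjoint user groups."""
    n = len(xs) // T
    s1, s2 = q_min, q_max
    t = (s1 + s2) / 2
    for j in range(T):
        group = xs[j * n:(j + 1) * n]  # fresh users each round
        t = (s1 + s2) / 2
        z = rr_fraction([int(x < t) for x in group], eps, rng)
        if z > p_star + lam / 2:
            s2 = t  # too much mass below t: search the lower half
        elif z < p_star - lam / 2:
            s1 = t  # too little mass below t: search the upper half
        else:
            break   # estimated quantile of t is within lam/2 of the target
    return t

rng = random.Random(0)
data = [rng.gauss(1.0, 2.0) for _ in range(100_000)]
print(bin_quant(data, 0.5, 1.0, -8.0, 8.0, 0.05, 10, rng))
```

Using a disjoint group of users in each round means every individual answers a single randomized-response query, so the privacy budget is not split across rounds.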

Our algorithm UnkVar uses the quantile estimation twice: once for $p^* = \frac{1}{2}$, where $t^* = \mu$, and once for $p^* = \Phi(1) \approx 0.8413$, for which the corresponding threshold is $t^* = \mu + \sigma$. Using these two values we obtain estimates for $\mu$ and $\sigma$, and we apply a similar approach to Algorithm KnownVar.
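The relationship between the two thresholds and the parameters can be checked at the population level. The sketch below uses exact quantiles from Python's `statistics.NormalDist` in place of the private BinQuant estimates, so it only illustrates the identity $\sigma = t^*_{\Phi(1)} - t^*_{1/2}$. The function name and the example parameters are assumptions.

```python
from statistics import NormalDist

def mean_and_std_from_thresholds(t_median, t_phi1):
    """t_median approximates mu and t_phi1 approximates mu + sigma."""
    return t_median, t_phi1 - t_median

p_phi1 = NormalDist().cdf(1.0)        # Phi(1), about 0.8413
dist = NormalDist(mu=3.0, sigma=2.0)  # hypothetical true distribution
t_median = dist.inv_cdf(0.5)          # exact stand-in for the first search
t_phi1 = dist.inv_cdf(p_phi1)         # exact stand-in for the second search
mu_hat, sigma_hat = mean_and_std_from_thresholds(t_median, t_phi1)
print(mu_hat, sigma_hat)
```

In the private setting both thresholds carry binary-search and randomized-response error, so the resulting estimates are only approximate.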

UnkVar properties

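• Privacy: UnkVar is $(\varepsilon, \delta)$-LDP.

• Confidence Interval: Let $X \sim \mathcal{N}(\mu, \sigma^2)$ i.i.d. Fix parameters $\varepsilon, \beta \in (0, 1/2)$. Given that $\mu \in [-R, R]$ and that $\sigma_{\min} \le \sigma \le \sigma_{\max} \le 2R$, if

$$n \ge 1500 \log_2\left(\frac{16R}{\sigma_{\min}}\right) \left(\frac{e^{\varepsilon} + 1}{e^{\varepsilon} - 1}\right)^2 \cdot \ln\left(16 \log_2(16R/\sigma_{\min}) \ldots\right)$$

then the interval $I$ returned by Algorithm UnkVar satisfies $\Pr_{X,\,\text{UnkVar}}[I \ni \mu] \ge 1 - \beta$, and moreover

$$|I| = O\left(\sigma \cdot \sqrt{\log(n/\beta)\,\log(1/\beta)\,\log(1/\delta)}\,\ldots\right)$$

• Very large variance case: If $\sigma > R$ we give a different algorithm, based on matching quantiles. We estimate $p_- = \Pr[X < -R]$ and $p_+ = \Pr[X < R]$, then fit the Gaussian using the quantiles of $\mathcal{N}(0, 1)$ that attain $p_-$ and $p_+$.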
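The quantile-matching step for the very large variance case can be written out explicitly. If $z_\pm = \Phi^{-1}(p_\pm)$, then $-R = \mu + \sigma z_-$ and $R = \mu + \sigma z_+$, which gives $\sigma = 2R/(z_+ - z_-)$ and $\mu = R - \sigma z_+$. The sketch below applies this to exact probabilities; in the private setting $p_-$ and $p_+$ would be noisy estimates. The function name is hypothetical.

```python
from statistics import NormalDist

def gaussian_from_two_quantiles(p_minus, p_plus, R):
    """Solve for (mu, sigma) given P[X < -R] = p_minus and P[X < R] = p_plus.

    Requires 0 < p_minus < p_plus < 1.
    """
    std_normal = NormalDist()
    z_minus = std_normal.inv_cdf(p_minus)  # equals (-R - mu) / sigma
    z_plus = std_normal.inv_cdf(p_plus)    # equals ( R - mu) / sigma
    sigma = 2 * R / (z_plus - z_minus)
    mu = R - sigma * z_plus
    return mu, sigma

true_dist = NormalDist(mu=0.5, sigma=3.0)  # hypothetical example with sigma > R
R = 1.0
mu_fit, sigma_fit = gaussian_from_two_quantiles(
    true_dist.cdf(-R), true_dist.cdf(R), R
)
print(mu_fit, sigma_fit)
```

When $\sigma$ is much larger than $R$, the two probabilities are close together, so small estimation errors in $p_-$ and $p_+$ translate into large errors in the fitted $\sigma$.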

References

[1] R. Bassily and A. Smith. Local, private, efficient protocols for succinct histograms. In STOC, 2015.

[2] C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, 2006.

[3] V. Karwa and S. P. Vadhan. Finite sample differentially private confidence intervals. In ITCS, 2018.

[4] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. D. Smith. What can we learn privately? In FOCS, 2008.

[5] S. L. Warner. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60:63–69, 1965.

AISTATS 2019, Naha, Japan
