Secure Distributed Framework for Achieving ϵ-Differential Privacy

Dima Alhadidi, Noman Mohammed, Benjamin C. M. Fung, and Mourad Debbabi
Concordia Institute for Information Systems Engineering
Concordia University, Montreal, Quebec, Canada
{dm_alhad,no_moham,fung,debbabi}@encs.concordia.ca

Page 2

6/24/2012

Outline
• Motivation
• Problem Statement
• Related Work
• Background
• Two-Party Differentially Private Data Release
• Performance Analysis
• Conclusion

Page 4

Motivation

[Figure: Individuals → Data Publisher (Anonymization Algorithm) → Data Recipients; settings: Centralized, Distributed]

Page 5

Motivation
• Distributed: Vertically-Partitioned

ID  Job
1   Writer
2   Dancer
3   Writer
4   Dancer
5   Engineer
6   Engineer
7   Engineer
8   Dancer
9   Lawyer
10  Lawyer

ID  Sex  Salary
1   M    30K
2   M    25K
3   M    35K
4   F    37K
5   F    65K
6   F    35K
7   M    30K
8   F    44K
9   M    44K
10  F    44K

Page 6

Motivation
• Distributed: Vertically-Partitioned

ID  Job       Sex  Salary
1   Writer    M    30K
2   Dancer    M    25K
3   Writer    M    35K
4   Dancer    F    37K
5   Engineer  F    65K
6   Engineer  F    35K
7   Engineer  M    30K
8   Dancer    F    44K
9   Lawyer    M    44K
10  Lawyer    F    44K

Page 7

Motivation
• Distributed: Horizontally-Partitioned

ID  Job      Sex  Age  Surgery
1   Janitor  M    34   Transgender
2   Lawyer   F    58   Plastic
3   Mover    M    58   Urology
4   Lawyer   M    24   Vascular
5   Mover    M    34   Transgender
6   Janitor  M    44   Plastic
7   Doctor   F    44   Vascular

ID  Job      Sex  Age  Surgery
8   Doctor   M    58   Plastic
9   Doctor   M    24   Urology
10  Janitor  F    63   Vascular
11  Mover    F    63   Plastic

Page 8

Motivation
• Distributed: Horizontally-Partitioned

ID  Job      Sex  Age  Surgery
1   Janitor  M    34   Transgender
2   Lawyer   F    58   Plastic
3   Mover    M    58   Urology
4   Lawyer   M    24   Vascular
5   Mover    M    34   Transgender
6   Janitor  M    44   Plastic
7   Doctor   F    44   Vascular
8   Doctor   M    58   Plastic
9   Doctor   M    24   Urology
10  Janitor  F    63   Vascular
11  Mover    F    63   Plastic

Page 13

Problem Statement
• Goal: develop a two-party data publishing algorithm for horizontally-partitioned data that:
  – achieves differential privacy, and
  – satisfies the security definition of secure multiparty computation (SMC).

Page 15

Related Work

Algorithms, by data owner and privacy model:

Privacy Model             Centralized                          Distributed: Horizontally            Distributed: Vertically
Partition-based Privacy   LeFevre et al., Fung et al., etc.    Jurczyk and Xiong, Mohammed et al.   Jiang and Clifton, Mohammed et al.
Differential Privacy      Xiao et al., Mohammed et al., etc.   Our proposal

Page 17

k-Anonymity

Raw patient table

Job       Sex     Age  Disease
Engineer  Male    35   Fever
Engineer  Male    38   Fever
Lawyer    Male    38   Hepatitis
Musician  Female  30   Flu
Musician  Female  30   Hepatitis
Dancer    Female  30   Hepatitis
Dancer    Female  30   Hepatitis

Page 18

k-Anonymity

Raw patient table

Quasi-identifier (QID): Job, Sex, Age

Job       Sex     Age  Disease
Engineer  Male    35   Fever
Engineer  Male    38   Fever
Lawyer    Male    38   Hepatitis
Musician  Female  30   Flu
Musician  Female  30   Hepatitis
Dancer    Female  30   Hepatitis
Dancer    Female  30   Hepatitis

Page 19

k-Anonymity

3-anonymous patient table

Job           Sex     Age      Disease
Professional  Male    [36-40]  Fever
Professional  Male    [36-40]  Fever
Professional  Male    [36-40]  Hepatitis
Artist        Female  [30-35]  Flu
Artist        Female  [30-35]  Hepatitis
Artist        Female  [30-35]  Hepatitis
Artist        Female  [30-35]  Hepatitis

Raw patient table

Job       Sex     Age  Disease
Engineer  Male    35   Fever
Engineer  Male    38   Fever
Lawyer    Male    38   Hepatitis
Musician  Female  30   Flu
Musician  Female  30   Hepatitis
Dancer    Female  30   Hepatitis
Dancer    Female  30   Hepatitis

Page 20

Differential Privacy

[Figure: two data sets, D and D′]
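For reference, the standard definition of ϵ-differential privacy can be written as below. The slide's own formula is not preserved in this transcript, so the notation is an assumption: Ag stands for a randomized algorithm, S for any set of its possible outputs, and D and D′ for data sets that differ in at most one record.

```latex
\Pr[\mathit{Ag}(D) \in S] \;\le\; e^{\epsilon} \cdot \Pr[\mathit{Ag}(D') \in S]
\quad \text{for all } S \subseteq \mathrm{Range}(\mathit{Ag})
```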

Page 21

Laplace Mechanism

[Figure: data set D]
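A minimal sketch of the Laplace mechanism in its textbook form, since the slide's formula is not preserved here. The function names are hypothetical, and the noise scale sensitivity/ϵ is the standard choice rather than something confirmed by the slide. The noise is drawn as the difference of two exponential samples, which follows a Laplace distribution.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent Exp(1/scale) samples is
    Laplace-distributed with the given scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Return the true answer plus Laplace noise of scale sensitivity/epsilon."""
    if rng is None:
        rng = random.Random()
    return true_answer + laplace_noise(sensitivity / epsilon, rng)


# Example: a count query has sensitivity 1 (one record changes it by at most 1).
noisy = laplace_mechanism(true_answer=100, sensitivity=1.0, epsilon=0.5)
```

A smaller ϵ gives a larger noise scale, so stronger privacy costs accuracy.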

Page 22

Exponential Mechanism
• McSherry and Talwar proposed the exponential mechanism, which chooses an output that is close to the optimum with respect to a utility function while preserving differential privacy.
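A minimal sketch of the exponential mechanism in its commonly stated form, where a candidate with score u is chosen with probability proportional to exp(ϵ·u / (2·Δu)). The function names and the example scores are hypothetical; this is a centralized illustration, not the two-party protocol of the talk.

```python
import math
import random


def selection_probabilities(scores, epsilon, sensitivity):
    """Probability of each candidate under the exponential mechanism."""
    best = max(scores)
    # Subtracting the best score leaves the probabilities unchanged
    # and avoids overflow in exp().
    weights = [math.exp(epsilon * (s - best) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(scores, epsilon, sensitivity, rng=None):
    """Return the index of one candidate drawn from those probabilities."""
    if rng is None:
        rng = random.Random()
    probs = selection_probabilities(scores, epsilon, sensitivity)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


# Two hypothetical candidates with utility scores 8 and 3.
winner = exponential_mechanism([8, 3], epsilon=1.0, sensitivity=1.0)
```

Higher-scoring candidates are more likely to win, but every candidate keeps a non-zero chance, which is what makes the choice differentially private.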

Page 24

Two-Party Differentially Private Data Release
• Generalizing the raw data
• Adding noisy counts

Page 25

Generalizing the raw data

Distributed Exponential Mechanism (DEM)

Page 26

Generalization

Distributed Exponential Mechanism (DEM)

Page 27

Adding Noisy Count
• Each party adds Laplace noise to its count.
• Each party sends the result to the other party.
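The two steps above can be sketched as follows. This is an illustration only: the function names, the noise scale 1/ϵ, and the example counts are assumptions, and the real protocol runs between two separate machines rather than inside one function.

```python
import random


def noisy_count(count, epsilon, rng):
    """Add Laplace(0, 1/epsilon) noise to a local count (assumed sensitivity 1)."""
    rate = epsilon  # an Exp(epsilon) sample has mean 1/epsilon
    return count + rng.expovariate(rate) - rng.expovariate(rate)


def exchange_noisy_counts(count_p1, count_p2, epsilon, rng):
    """Each party perturbs its own count, then both learn both noisy values."""
    noisy_p1 = noisy_count(count_p1, epsilon, rng)  # computed by the first party
    noisy_p2 = noisy_count(count_p2, epsilon, rng)  # computed by the second party
    return noisy_p1, noisy_p2, noisy_p1 + noisy_p2


# Hypothetical local counts for one group of records: 4 at one party, 2 at the other.
rng = random.Random()
n1, n2, released = exchange_noisy_counts(4, 2, epsilon=1.0, rng=rng)
```

Neither party ever sees the other's exact count, only the perturbed value.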

Page 28

Two-Party Protocol for Exponential Mechanism
• Input:
  1. Two raw data sets held by the two parties
  2. Set of candidates
  3. Privacy budget
• Output: Winner candidate

Page 29

Max Utility Function

D1

ID  Class  Job      Sex  Age  Surgery
1   N      Janitor  M    34   Transgender
2   Y      Lawyer   F    58   Plastic
3   Y      Mover    M    58   Urology
4   N      Lawyer   M    24   Vascular
5   Y      Mover    M    34   Transgender
6   Y      Janitor  M    44   Plastic
7   Y      Doctor   F    44   Vascular

Data Set              Job           Class Y  Class N  Max
D1                    Blue-collar   3        1        5
                      White-collar  2        1
D2                    Blue-collar   2        0        3
                      White-collar  1        1
Integrated D1 and D2  Blue-collar   5        1        8
                      White-collar  3        2

Page 30

Max Utility Function

D2

ID  Class  Job      Sex  Age  Surgery
8   N      Doctor   M    58   Plastic
9   Y      Doctor   M    24   Urology
10  Y      Janitor  F    63   Vascular
11  Y      Mover    F    63   Plastic

Data Set              Job           Class Y  Class N  Max
D1                    Blue-collar   3        1        5
                      White-collar  2        1
D2                    Blue-collar   2        0        3
                      White-collar  1        1
Integrated D1 and D2  Blue-collar   5        1        8
                      White-collar  3        2

Page 31

Max Utility Function

D1 & D2

ID  Class  Job      Sex  Age  Surgery
1   N      Janitor  M    34   Transgender
2   Y      Lawyer   F    58   Plastic
3   Y      Mover    M    58   Urology
4   N      Lawyer   M    24   Vascular
5   Y      Mover    M    34   Transgender
6   Y      Janitor  M    44   Plastic
7   Y      Doctor   F    44   Vascular
8   N      Doctor   M    58   Plastic
9   Y      Doctor   M    24   Urology
10  Y      Janitor  F    63   Vascular
11  Y      Mover    F    63   Plastic

Data Set              Job           Class Y  Class N  Max
D1                    Blue-collar   3        1        5
                      White-collar  2        1
D2                    Blue-collar   2        0        3
                      White-collar  1        1
Integrated D1 and D2  Blue-collar   5        1        8
                      White-collar  3        2
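The Max values in the table (5 for D1, 3 for D2, 8 for the integrated data) can be reproduced with a short sketch. The function name is hypothetical, and the mapping of Janitor and Mover to Blue-collar and of Lawyer and Doctor to White-collar is an assumption inferred from the counts shown. For each generalized job group the score takes the larger of the two class counts, then sums over groups.

```python
from collections import Counter

# (Job, Class) pairs transcribed from the D1 and D2 record tables.
D1 = [("Janitor", "N"), ("Lawyer", "Y"), ("Mover", "Y"), ("Lawyer", "N"),
      ("Mover", "Y"), ("Janitor", "Y"), ("Doctor", "Y")]
D2 = [("Doctor", "N"), ("Doctor", "Y"), ("Janitor", "Y"), ("Mover", "Y")]

# Assumed generalization of Job, inferred from the counts in the Max table.
GENERALIZE = {"Janitor": "Blue-collar", "Mover": "Blue-collar",
              "Lawyer": "White-collar", "Doctor": "White-collar"}


def max_utility(records):
    """Sum, over job groups, of the count of the majority class in that group."""
    counts = Counter((GENERALIZE[job], cls) for job, cls in records)
    groups = {group for group, _ in counts}
    return sum(max(counts[(g, "Y")], counts[(g, "N")]) for g in groups)


print(max_utility(D1), max_utility(D2), max_utility(D1 + D2))  # 5 3 8
```

Note that 5 + 3 = 8 happens to hold here, but in general the integrated score is not the sum of the local scores, which is why the parties need a joint protocol.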

Pages 32-39

Computing Max Utility Function

Data Set              Job           Class Y  Class N  Max
D1                    Blue-collar   3        1        5
                      White-collar  2        1
D2                    Blue-collar   2        0        3
                      White-collar  1        1
Integrated D1 and D2  Blue-collar   5        1        8
                      White-collar  3        2

Steps shown across these pages:
• Blue-collar: max=1
• Blue-collar: max=5, sum=5
• White-collar: sum=5
• White-collar: max=2, sum=5
• White-collar: max=3, sum=8

Result: Shares 1 and 2
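A sketch of what "Shares 1 and 2" could look like, assuming additive secret sharing modulo a fixed number. The modulus, the function names, and the in-the-clear computation of the integrated score are all assumptions for illustration; a secure protocol would compare and add the counts without revealing them to either party.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical modulus, not taken from the slides

# Class counts per job group held by each party, from the Max table.
COUNTS_P1 = {"Blue-collar": {"Y": 3, "N": 1}, "White-collar": {"Y": 2, "N": 1}}
COUNTS_P2 = {"Blue-collar": {"Y": 2, "N": 0}, "White-collar": {"Y": 1, "N": 1}}


def integrated_max(counts_1, counts_2):
    """Max score of the integrated data (computed in the clear for illustration)."""
    total = 0
    for group in counts_1:
        total += max(counts_1[group][cls] + counts_2[group][cls] for cls in ("Y", "N"))
    return total


def split_into_shares(value, modulus=MODULUS):
    """Split value into two random-looking shares that add up to it."""
    share_1 = secrets.randbelow(modulus)
    share_2 = (value - share_1) % modulus
    return share_1, share_2


def reconstruct(share_1, share_2, modulus=MODULUS):
    return (share_1 + share_2) % modulus


score = integrated_max(COUNTS_P1, COUNTS_P2)  # 8, matching sum=8 on the slide
share_1, share_2 = split_into_shares(score)
```

Each share on its own is uniformly random, so neither party learns the score from the share it holds.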

Page 40

Computing the Exponential Equation
• Given the scores of all the candidates, the exponential mechanism selects the candidate having score u with the following probability:

[Equation]

• Shares 1 and 2

Page 41

Computing the Exponential Equation

[Equation: expansion using a Taylor series]

Page 42

Computing the Exponential Equation
• Multiply by the least common multiple of {2!, …, w!} so that no fractions remain
• Approximate up to a predetermined number s of digits after the decimal point

Page 43

Computing the Exponential Equation

No fraction
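One way to see how a truncated Taylor series can be evaluated with integers only is sketched below. It is an illustration under stated assumptions, not the authors' exact formula: x is assumed to be rounded to s decimal digits and stored as the integer x_scaled = x · 10^s, the series is cut after the term of degree w, and the function name is hypothetical. Because every i! with i ≤ w divides w!, the least common multiple of {2!, …, w!} is w!.

```python
import math


def scaled_exp_integer(x_scaled, s, w):
    """Integer form of the degree-w Taylor polynomial of exp(x), x = x_scaled / 10**s.

    Returns (numerator, denominator) with
        numerator / denominator == sum_{i=0..w} x**i / i!
    where denominator = w! * 10**(s*w) and numerator is an exact integer.
    """
    lcm = math.factorial(w)  # lcm{2!, ..., w!} equals w!
    numerator = 0
    for i in range(w + 1):
        # (w!/i!) is an integer; 10**(s*(w-i)) clears the decimal scaling.
        numerator += (lcm // math.factorial(i)) * x_scaled**i * 10**(s * (w - i))
    return numerator, lcm * 10**(s * w)


# Example: x = 0.25 kept to s = 2 decimal digits, series cut at w = 10.
num, den = scaled_exp_integer(25, s=2, w=10)
approx = num / den  # close to math.exp(0.25)
```

Working with integers matters because the secure building blocks used later operate on integers, not floating-point numbers.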

Page 44

Computing the Exponential Equation

Oblivious Polynomial Evaluation

[Figure: inputs from the First Party and the Second Party; the Result goes to the First Party and the Second Party]
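A possible reading of this step, offered as an assumption rather than the authors' construction: if the score u is held as two shares α and β, then exp(ϵ′(α+β)) = exp(ϵ′α) · exp(ϵ′β), so each party can compute one integer factor locally and the two only need additive shares of the product. Oblivious polynomial evaluation can provide that: one party hides a random value r in the polynomial Q(z) = a·z + r, and the other party learns only Q(b). In the sketch below the oblivious evaluation is simulated in the clear, so it is not secure; all names and the modulus are hypothetical.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical prime modulus


def simulated_ope(coefficients, point, modulus=MODULUS):
    """Insecure stand-in for oblivious polynomial evaluation.

    Evaluates the polynomial (coefficients listed lowest degree first) at
    `point` using Horner's rule. A real protocol would do this without the
    sender learning `point` or the receiver learning the coefficients.
    """
    result = 0
    for coefficient in reversed(coefficients):
        result = (result * point + coefficient) % modulus
    return result


def product_to_additive_shares(a, b, modulus=MODULUS):
    """Turn factors a (first party) and b (second party) into shares of a*b."""
    r = secrets.randbelow(modulus)               # first party's random mask
    share_2 = simulated_ope([r, a], b, modulus)  # second party gets a*b + r
    share_1 = (-r) % modulus                     # first party keeps -r
    return share_1, share_2


share_1, share_2 = product_to_additive_shares(12, 7)
assert (share_1 + share_2) % MODULUS == 84
```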

Page 45

Computing the Exponential Equation

[Figure: Second Party and First Party]

Page 46

Computing the Exponential Equation

[Figure: the interval from 0 to 1 with marks at 0, 0.2, 0.3, 0.5, 0.7, and 1]

Picking a random number in [0, 1]

Page 47

Computing the Exponential Equation

[Figure: an interval starting at 0]

Picking a random number in [0, ]
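Selecting a candidate from a random number can be sketched as a walk over cumulative intervals. The upper bound of the interval is missing from this transcript; the sketch assumes it is the sum of all candidate weights, which lets integer weights be used without normalizing them to probabilities. The function name and the example weights are hypothetical.

```python
import bisect
import itertools


def pick_winner(weights, r):
    """Return the index of the candidate whose cumulative interval contains r.

    Candidate i owns the half-open interval
    [sum(weights[:i]), sum(weights[:i + 1])).
    """
    cumulative = list(itertools.accumulate(weights))
    if not 0 <= r < cumulative[-1]:
        raise ValueError("r must lie in [0, sum(weights))")
    return bisect.bisect_right(cumulative, r)


# Hypothetical integer weights 2, 3, 5: intervals [0,2), [2,5), [5,10).
print(pick_winner([2, 3, 5], 1))  # 0
print(pick_winner([2, 3, 5], 2))  # 1
print(pick_winner([2, 3, 5], 7))  # 2
```

With weights 0.2, 0.3, 0.5 and r drawn from [0, 1) the same function gives the normalized version of the selection.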

Page 48

Picking a Random Number

Random Value Protocol [Bunn and Ostrovsky 2007]

[Figure: exchange between the First Party and the Second Party]

Page 49

Picking a Winner

Page 51

Performance Analysis
– Data set: Adult (census data)
  • 6 numerical attributes
  • 8 categorical attributes
  • 45,222 census records
– Cost estimates
  • 37.5 minutes of computation
  • 37.3 minutes of communication using a T1 line with 1.544 Mbits/second bandwidth
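As a back-of-the-envelope check, the communication estimate can be converted into an implied traffic volume. This assumes the 37.3 minutes are pure transfer time at the full T1 rate and that "Mbits" means 10^6 bits; neither assumption is stated on the slide, and the function name is hypothetical.

```python
T1_MBIT_PER_SECOND = 1.544   # from the slide
COMMUNICATION_MINUTES = 37.3  # from the slide


def implied_traffic_megabits(minutes, mbit_per_second):
    """Data volume moved in `minutes` at a constant rate, in megabits."""
    return minutes * 60.0 * mbit_per_second


megabits = implied_traffic_megabits(COMMUNICATION_MINUTES, T1_MBIT_PER_SECOND)
megabytes = megabits / 8.0
print(round(megabits, 1), round(megabytes, 1))  # 3455.5 431.9
```

Under those assumptions the protocol exchanges roughly 430 megabytes for the Adult data set.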

Page 52

Scaling Impact

Page 54

Conclusion
• Data release algorithm
  – Two-party
  – Differentially-private
  – Secure
  – Horizontally-partitioned
  – Non-interactive setting

Page 55

Future Work
• Consider different scenarios
  – Two parties vs. multiple parties
  – Semi-honest vs. malicious adversary model
  – Horizontally vs. vertically partitioned data
• For all these scenarios, we need efficient algorithms