Local differential privacy

Local differential privacy
Adam Smith, Penn State
Bar-Ilan Winter School, February 14, 2017

Outline
Model
Ø Implementations
Question: what computations can we carry out in this model?
Example: randomized response (again!)
Ø SQ computations
Simulating local algorithms via SQ
Ø An exponential separation
Averaging vectors
Heavy hitters: succinct averaging
Lower bounds: information
Ø Example: selection
Compression
Learning and adaptivity

Local Model for Privacy

(Diagram: each person i applies a randomizer Q_i to their own data using local random coins and sends the result to an untrusted aggregator, which computes the answer A.)

Person i randomizes their own data, say on their own device.

Requirement: each Q_i is (ε, δ)-differentially private.
Ø We will ignore δ.
Ø The aggregator may talk to each person multiple times.
Ø For every pair of values x, y of person i's data, and for all events T:
Pr[R(x) ∈ T] ≤ e^ε · Pr[R(y) ∈ T].
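The privacy requirement can be checked directly for one-bit randomized response. Below is a minimal Python sketch (not from the slides; the function names are illustrative) of an ε-DP bit randomizer and its worst-case likelihood ratio:

```python
import math
import random

def randomized_response(bit, eps):
    """One-bit local randomizer: report the true bit with probability
    e^eps / (e^eps + 1), and the flipped bit otherwise."""
    p_true = math.exp(eps) / (math.exp(eps) + 1.0)
    return bit if random.random() < p_true else 1 - bit

def privacy_ratio(eps):
    """Worst-case ratio Pr[R(x) = t] / Pr[R(y) = t] over all inputs
    x, y and outputs t for the randomizer above."""
    p_true = math.exp(eps) / (math.exp(eps) + 1.0)
    return max(p_true / (1 - p_true), (1 - p_true) / p_true)
```

The ratio works out to exactly e^ε, so this mechanism meets the definition Pr[R(x) ∈ T] ≤ e^ε · Pr[R(y) ∈ T] with equality.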

Local Model for Privacy

Pros
Ø No trusted curator
Ø No single point of failure
Ø Highly distributed

Cons
Ø Lower accuracy

Local differential privacy in practice
https://developer.apple.com/videos/play/wwdc2016/709/
https://github.com/google/rappor

Local Model for Privacy

Open questions
Ø Efficient, network-friendly MPC protocols for simulating the exponential mechanism in the local model
Ø Interaction in optimization (tomorrow)
Ø Other tasks?

Local Model for Privacy

What can and can't we do in the local model?

Example: Randomized response

Each person has data x_i ∈ X.
Ø The analyst wants to know the average of f: X → {−1, 1} over the x_i.

The randomization operator takes y ∈ {−1, 1}:
Q(y) = +y·C_ε w.p. e^ε/(e^ε + 1), and −y·C_ε w.p. 1/(e^ε + 1), where C_ε = (e^ε + 1)/(e^ε − 1).

Observe:
Ø E[Q(1)] = 1 and E[Q(−1)] = −1.
Ø Q takes values in {−C_ε, +C_ε}.

How can we estimate a proportion?
Ø A(x_1, …, x_n) = (1/n) Σ_i Q(f(x_i))

Proposition: |A(x_1, …, x_n) − (1/n) Σ_i f(x_i)| = O(1/(ε√n)) (à la [Duchi Jordan Wainwright 2013]).

Centralized DP: O(1/(nε)) via the Laplace mechanism (optimal).
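The estimator above can be sketched in Python (illustrative code, not from the slides, assuming the unbiased ±C_ε randomizer just described):

```python
import math
import random

def rr_pm1(y, eps):
    """Randomize y in {-1, +1}: output +y*C w.p. e^eps/(e^eps + 1),
    else -y*C, where C = (e^eps + 1)/(e^eps - 1) makes the output
    unbiased: E[Q(y)] = y."""
    c = (math.exp(eps) + 1) / (math.exp(eps) - 1)
    keep = random.random() < math.exp(eps) / (math.exp(eps) + 1)
    return y * c if keep else -y * c

def estimate_mean(values, eps):
    """A(x_1..x_n) = (1/n) * sum_i Q(f(x_i)); `values` holds the f(x_i)."""
    return sum(rr_pm1(v, eps) for v in values) / len(values)
```

Each report has magnitude C_ε = Θ(1/ε) for small ε, so averaging n independent reports gives standard error Θ(1/(ε√n)), matching the proposition.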

SQ algorithms

An SQ algorithm interacts with a data set by asking a series of statistical queries.
Ø Statistical query: f: X → [−1, 1]
Ø Response: a ≈ (1/n) Σ_i f(x_i) ± α, where α is the error.

A huge fraction of basic learning/optimization algorithms can be expressed in SQ form [Kearns 93].

Theorem: Every sequence of k SQ queries can be computed with local DP with error α = O(√(k log k)/(ε√n)). (Central model: α = O(k/(nε)).)

Proof:
Ø Randomly divide the n people into k groups of size n/k.
Ø Have each group answer one question.
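The group-splitting proof can be sketched as follows (hypothetical Python, not from the slides; `answer_sq_queries` and `rr_pm1` are illustrative names). Each of the k queries is answered by its own random group of n/k users; a value f(x) ∈ [−1, 1] is first rounded to ±1 without bias, then passed through the unbiased ±C_ε randomizer:

```python
import math
import random

def rr_pm1(y, eps):
    """Unbiased +/-1 randomizer: E[output] = y, values are +/-C_eps."""
    c = (math.exp(eps) + 1) / (math.exp(eps) - 1)
    keep = random.random() < math.exp(eps) / (math.exp(eps) + 1)
    return y * c if keep else -y * c

def answer_sq_queries(data, queries, eps):
    """Answer k statistical queries under local DP: randomly split the
    n users into k groups; group j answers only query j."""
    k, n = len(queries), len(data)
    order = list(range(n))
    random.shuffle(order)  # random partition into groups
    answers = []
    for j, f in enumerate(queries):
        group = [data[i] for i in order[j * n // k:(j + 1) * n // k]]
        reports = []
        for x in group:
            # unbiased rounding of f(x) in [-1,1] to {-1,+1}
            y = 1 if random.random() < (1 + f(x)) / 2 else -1
            reports.append(rr_pm1(y, eps))
        answers.append(sum(reports) / len(reports))
    return answers
```

Each group of n/k users contributes error O(√k/(ε√n)) for its query; a union bound over the k queries accounts for the extra √(log k) factor.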

SQ algorithms and Local Privacy

Every SQ algorithm can be simulated by an LDP protocol. Can every centralized DP algorithm be simulated by LDP?
Ø No!

Theorem: Every LDP algorithm can be simulated by SQ with polynomial blow-up in n.
Theorem: No SQ algorithm can learn parity with polynomially many samples (n = 2^Ω(d) is required).
Theorem: Centralized DP algorithms can learn parity with n = O(d/ε) samples.

Is research on local privacy over?
Ø No! Polynomial factors matter.

LDP = SQ ⊊ Central DP

Outline

Some stuff we can do
Ø Heavy hitters

Some stuff we cannot do
Ø LDP and SQ: 1-bit randomizers suffice!
Ø Information-theoretic lower bounds

Histograms

Every participant has x_i ∈ {1, 2, …, d}. The histogram is h(x) = (n_1, n_2, …, n_d), where n_j = #{i : x_i = j}.

Straightforward protocol: map each x_i to the indicator vector e_{x_i} = (0, 0, …, 0, 1, 0, …, 0).
Ø So h(x) = Σ_i e_{x_i}.
Ø Q'(x_i): apply Q to each entry of e_{x_i}, i.e. Q'(e_{x_i}) = (Q(0), …, Q(1), …, Q(0)).

Proposition: Q'(·) is ε-LDP and E‖Σ_i Q'(e_{x_i}) − h(x)‖_∞ = O(√(n log d)/ε) (optimal).

Central model: O(log(1/δ)/ε).

[Mishra Sandler 2006; Hsu Khanna Roth 2012; Erlingsson Pihur Korolova 2014; Bassily Smith 2015; …]
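A simplified variant of the indicator-vector protocol can be sketched in Python (illustrative code, not the exact randomizer from the cited papers): each user flips every bit of e_{x_i} independently, and the aggregator debiases the noisy sums. Since two indicator vectors differ in at most two coordinates, this particular variant is 2ε-DP.

```python
import math
import random

def private_histogram(data, d, eps):
    """Each user encodes x_i as an indicator vector in {0,1}^d and flips
    each bit independently with probability q = 1/(e^eps + 1). Any two
    indicator vectors differ in at most two coordinates, so each report
    is 2*eps-DP. The aggregator debiases the noisy bit sums."""
    q = 1.0 / (math.exp(eps) + 1.0)
    n = len(data)
    sums = [0] * d
    for x in data:
        for j in range(d):
            bit = 1 if j == x else 0
            if random.random() < q:
                bit = 1 - bit  # flip the reported bit
            sums[j] += bit
    # E[sums[j]] = n_j*(1-q) + (n - n_j)*q, so invert the bias:
    return [(s - n * q) / (1 - 2 * q) for s in sums]
```

The per-count noise has standard deviation Θ(√n/ε), consistent with the √(n log d)/ε bound on the maximum over all d counts.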

Succinctness

Randomized response has optimal error O(√(n log d)/ε).
Ø Problem: communication and server-side storage are O(d).
Ø How much is really needed?

Theorem [Thakurta et al.]: Õ(ε√(n log d)) space suffices.

Lower bound (for large d):
Ø Have to store all the elements with counts at least √(n log d)/ε.
Ø Each one takes log d bits.

Upper bound idea:
Ø [Hsu Khanna Roth 12; Bassily S 15] Connection to heavy-hitters algorithms from streaming.
Ø Adapt the CountMin sketch of [Cormode Muthukrishnan].

Succinct Frequency Oracle

A data structure that allows us to estimate n_j for any j.
Ø Can recover the whole histogram in time O(d).

Select k ≈ log d hash functions g_m: [d] → [t] for a small range t.
Ø Divide the users into k groups.
Ø The m-th group constructs a histogram for g_m(x_i).

The aggregator stores the k histograms.
Ø count(j) = median{ count_m(g_m(j)) : m = 1, …, k }
Ø Corresponds to the CountMin hash [Cormode Muthukrishnan].
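The aggregator-side structure can be sketched as a median-of-hashed-counts oracle (plain, non-private Python to illustrate the CountMin-style idea; the class name and parameter choices are made up):

```python
import random
from collections import defaultdict

class MedianFrequencyOracle:
    """k hash functions into a small range; estimate an item's count as
    the median of the counts of the cells it hashes to (CountMin-style)."""
    def __init__(self, k, width, seed=0):
        rng = random.Random(seed)
        # k random linear hash functions modulo a large prime
        self.params = [(rng.randrange(1, 1 << 31), rng.randrange(1 << 31))
                       for _ in range(k)]
        self.width = width
        self.tables = [defaultdict(int) for _ in range(k)]

    def _hash(self, m, j):
        a, b = self.params[m]
        return (a * j + b) % 2147483647 % self.width

    def add(self, j):
        for m in range(len(self.tables)):
            self.tables[m][self._hash(m, j)] += 1

    def estimate(self, j):
        counts = sorted(self.tables[m][self._hash(m, j)]
                        for m in range(len(self.tables)))
        return counts[len(counts) // 2]
```

Because counts only accumulate, each cell overestimates the true frequency by at most the mass of colliding items; the median over k independent hash functions suppresses unlucky collisions.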

Efficient Histograms

When d is large, we want a list of the large counts.
Ø Explicitly querying all items takes O(d) time.

Time-efficient protocols with (near-)optimal error exist, based on:
Ø Error-correcting codes [Bassily S 15]
Ø Prefix search (à la [Cormode Muthukrishnan 03]): worse error, better space

"All unattributed heuristics are probably due to Frank McSherry." —A. Thakurta

Open question: exactly optimal error with optimal space.

Other things we can do

Estimating averages in other norms [DJW 13]
Ø Useful special cases:
- Histograms with small l_1 error (in small domains)
- l_2-bounded vectors (problem set)

Convex optimization [DJW 13; S Thakurta Upadhyay 17]
Ø Via gradient descent (tomorrow)

Selection problems [other papers]
Ø Find the most-liked Facebook page
Ø Find the most-liked Facebook pages with k likes per user

Outline

Some stuff we can do
Ø Heavy hitters

Some stuff we cannot do
Ø LDP and SQ: 1-bit randomizers suffice!
Ø Information-theoretic lower bounds

SQ algorithms simulate LDP protocols

Roughly: every LDP algorithm with n data points can be simulated by an SQ algorithm with poly(n) data points.
Ø This is actually a distributional statement: assume the data are drawn i.i.d. from some distribution P.

Key piece: transform the randomizer so that each participant sends only 1 bit to the aggregator.

One-bit randomizer
[Nissim Raskhodnikova S 2007; McGregor Mironov Pitassi Reingold Talwar Vadhan 2010; Bassily S 15]

Before: the participant sends z = R(x) to the aggregator.
After: the participant samples z ~ R(0) and a bit b ∈ {0, 1}; the aggregator outputs z iff b = 1.

Theorem: There is an ε-DP R' such that for every x:
Ø Conditioned on B = 1, the output Z is distributed as R(x).
Ø Pr[B = 1] = 1/2.

Replacing R by R':
Ø Lowers the communication from each participant to 1 bit.
Ø Randomly drops a 1/2 fraction of the data points.
Ø No need to send z: use a pseudorandom generator.

Proof

Recall: the participant samples z ~ R(0) and a bit b; the aggregator outputs z iff b = 1.

Algorithm R'(x, z):
Ø Compute p_{x,z} = (1/2) · Pr[R(x) = z] / Pr[R(0) = z]
Ø Return B = 1 with probability p_{x,z}

Notice that p_{x,z} always lies in [e^{−ε}/2, e^{ε}/2], so R' is ε-DP.

Pr[select z and B = 1] = Pr[R(0) = z] · (1/2) · (Pr[R(x) = z] / Pr[R(0) = z]) = (1/2) · Pr[R(x) = z]

So Pr[B = 1] = 1/2, and Z conditioned on B = 1 is distributed as R(x).
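The rejection-sampling construction can be sketched directly (illustrative Python, not from the slides; the randomizer is represented by its finite output distributions, and p_{x,z} ≤ 1 requires the likelihood ratio to be at most 2, e.g. ε ≤ ln 2):

```python
import random

def one_bit_randomizer(dist_x, dist_0):
    """R'(x): sample z from R(0)'s output distribution, then set B = 1
    with probability p_{x,z} = 0.5 * Pr[R(x)=z] / Pr[R(0)=z].
    dist_x, dist_0: dicts mapping outputs z to probabilities.
    Returns (z, b); the aggregator keeps z only when b = 1."""
    zs = list(dist_0)
    z = random.choices(zs, weights=[dist_0[w] for w in zs])[0]
    p = 0.5 * dist_x[z] / dist_0[z]  # assumed <= 1 (small eps)
    b = 1 if random.random() < p else 0
    return z, b
```

Summing over z, Pr[B = 1] = Σ_z Pr[R(0) = z] · p_{x,z} = 1/2, and the accepted z are distributed exactly as R(x).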

Connection to SQ

An SQ query can evaluate the average of p_{x_i,z} over a large set of data points x_i.

When x_1, …, x_n are drawn i.i.d. from P, we can sample Z ~ R(X) where X ~ P:

E_{X~P}[p_{X,z}] = (1/2) · Pr[R(X) = z] / Pr[R(0) = z], where X ~ P.

This allows us to simulate each message to the LDP algorithm.

LDP = SQ ⊊ Central DP

Information-theoretic lower bounds

As with (ε, 0)-DP, lower bounds for (ε, δ)-DP are relatively easy to prove via packing arguments.

For local algorithms, it is easier to use an information-theoretic framework [BNO 10; DJW 13].
Ø Applies to the δ > 0 case.

Idea: Suppose X_1, …, X_n ~ P i.i.d.; show that the protocol leaks little information about P.

Information-theoretic framework

Lemma: If R is ε-DP, then I(X; R(X)) ≤ O(ε²).

Proof: For any two distributions with p(y) ≤ e^{±ε} q(y), we have KL(p ‖ q) = O(ε²).

Stronger Lemma: If R is ε-DP, and W(x) = x w.p. α and 0 w.p. 1 − α, then I(X; R(W(X))) ≤ O(α²ε²).

Proof: Show that R ∘ W is O(αε)-DP.
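The KL step of the lemma can be checked numerically for binary randomized response, whose two output distributions satisfy p(z) ≤ e^ε q(z) (illustrative Python, not from the slides):

```python
import math

def kl(p, q):
    """KL divergence between two finite distributions given as dicts."""
    return sum(p[z] * math.log(p[z] / q[z]) for z in p)

def rr_output_dist(bit, eps):
    """Output distribution of binary randomized response on input `bit`."""
    p = math.exp(eps) / (math.exp(eps) + 1.0)
    return {bit: p, 1 - bit: 1.0 - p}
```

For this mechanism KL(p ‖ q) = ε · tanh(ε/2) ≤ ε²/2, a concrete instance of the O(ε²) bound.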

Bounding the information about the data

Suppose we sample V from some distribution P and consider X_1 = X_2 = … = X_n = V.
Ø Let Z_i = R(X_i) for some ε-DP randomizer R.

Then I(V; Z_1, …, Z_n) ≤ ε²n.

Theorem: I(V; A(Z_1, …, Z_n)) ≤ ε²n.

Lower bound for mode (and histograms)

Every participant has x_i ∈ {1, 2, …, d}. Consider V uniform in {1, …, d}.
Ø X = (V, V, …, V)
Ø A histogram algorithm with relative error α ≤ 1/2 will output V (with high probability).

Fano's inequality: If A = V with constant probability and V is uniform on {1, …, d}, then I(V; A) = Ω(log d).

But I(V; A) ≤ ε²n, so we need n = Ω(log d / ε²) to get nontrivial error.
Ø The matching upper bound α = O(√(log d)/(ε√n)) is tight for constant α.

Subconstant α

Let V be uniform in {1, …, d}, and consider the data set Y_i = W(V) (erase with probability 1 − α).
Ø Each data set has ≈ αn copies of V; the rest are 0.
Ø An algorithm with error α/2 will output V with high probability.

A sees Z_i = R(W(V)).
Ø By the stronger lemma, I(V; A) ≤ O(α²ε²n).
Ø So Ω(log d) ≤ O(α²ε²n), i.e. α = Ω(√(log d)/(ε√n)), as desired.

Outline

Some stuff we can do
Ø SQ learning
Ø Heavy hitters

Some stuff we cannot do
Ø LDP and SQ: 1-bit randomizers suffice!
Ø Information-theoretic lower bounds

Local Model for Privacy

Apple and Google deployments use the local model.

Open questions
Ø Efficient, network-friendly MPC protocols for simulating the exponential mechanism in the local model
Ø Interaction in optimization (tomorrow)
Ø Other tasks?