Talks - Vikram Waradpande: You've got trust issues, we've got solutions: Differential Privacy
Learn how Differential Privacy enables population data analysis while protecting individual privacy through noise injection, epsilon budgets, and the PyDP library.
- Differential Privacy allows analysis of population-level data while protecting individual privacy by adding controlled noise to query results
- Key mechanisms include the Laplace, Gaussian and exponential mechanisms, with the noise scale proportional to the query's sensitivity and inversely proportional to the privacy budget (epsilon)
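
To make the "proportional to sensitivity, inversely proportional to epsilon" relationship concrete, here is a minimal sketch of the Laplace mechanism using only the Python standard library. The function names, the `ages` list and the chosen epsilon are illustrative assumptions, not code from the talk or from PyDP, and plain floating-point sampling like this is not hardened for production use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one zero-mean Laplace sample with the given scale."""
    # The difference of two i.i.d. exponential variables with rate 1/scale
    # follows a Laplace distribution with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float,
                      rng: random.Random) -> float:
    """Return true_value plus Laplace noise with scale = sensitivity / epsilon."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    return true_value + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
ages = [23, 35, 41, 29, 52, 38, 47]  # hypothetical records
# Counting query: adding or removing one person changes the count by at most 1,
# so the sensitivity is 1.
true_count = sum(1 for age in ages if age >= 40)
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5, rng=rng)
print(f"true count = {true_count}, noisy count = {noisy_count:.2f}")
```

Doubling the sensitivity or halving epsilon doubles the noise scale, which is the tradeoff the bullet above describes.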
- The epsilon parameter controls the privacy-utility tradeoff: a smaller epsilon means more privacy but less accuracy. Typical values range from 0.1 to 5
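
The tradeoff can be checked numerically. The sketch below assumes a sensitivity-1 query answered with Laplace noise, for which the expected absolute error equals 1/epsilon, and compares that with a simulation. The helper name `mean_abs_error`, the trial count and the epsilon values are illustrative choices.

```python
import random


def mean_abs_error(epsilon: float, sensitivity: float = 1.0,
                   trials: int = 20000, seed: int = 0) -> float:
    """Simulate the average absolute Laplace noise for a given epsilon."""
    rng = random.Random(seed)
    rate = epsilon / sensitivity  # rate = 1 / scale
    total = 0.0
    for _ in range(trials):
        # Difference of two exponentials gives a Laplace sample.
        noise = rng.expovariate(rate) - rng.expovariate(rate)
        total += abs(noise)
    return total / trials


for eps in (0.1, 0.5, 1.0, 5.0):
    print(f"epsilon={eps:>4}: expected |error| = {1.0 / eps:.2f}, "
          f"simulated = {mean_abs_error(eps):.2f}")
```

Moving epsilon from 5 down to 0.1 multiplies the typical error by 50, which is why the choice of epsilon has to be weighed against how much accuracy the analysis can give up.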
- Simple data anonymization (removing names or SSNs) is insufficient, because linkage attacks can re-identify individuals by joining the remaining fields against auxiliary datasets
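
A toy illustration of a linkage attack follows. Every record here is made up for the example, and the choice of ZIP code, birth date and sex as quasi-identifiers is an assumption about what an "anonymized" release might still contain.

```python
# "Anonymized" records: names removed, quasi-identifiers kept (hypothetical data).
medical = [
    {"zip": "02138", "birth_date": "1961-07-14", "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_date": "1975-03-02", "sex": "M", "diagnosis": "asthma"},
    {"zip": "02141", "birth_date": "1988-11-23", "sex": "F", "diagnosis": "diabetes"},
]

# Auxiliary dataset that includes names (hypothetical data).
public_roll = [
    {"name": "Alice Example", "zip": "02138", "birth_date": "1961-07-14", "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_date": "1975-03-02", "sex": "M"},
    {"name": "Carol Example", "zip": "02140", "birth_date": "1990-01-05", "sex": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def link(anonymized, auxiliary, keys=QUASI_IDENTIFIERS):
    """Join the two datasets on the quasi-identifiers and return unique matches."""
    index = {}
    for row in auxiliary:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])
    matches = []
    for row in anonymized:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # exactly one candidate: the record is re-identified
            matches.append((names[0], row["diagnosis"]))
    return matches


reidentified = link(medical, public_roll)
for name, diagnosis in reidentified:
    print(f"{name}: {diagnosis}")
```

Two of the three "anonymous" records are tied back to a name without any name or SSN ever appearing in the medical data.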
- The PyDP library provides a Python implementation of differential privacy algorithms, including:
  - Bounded mean/sum calculations
  - Support for incremental computation on large datasets
  - Machine learning algorithm integration
  - Multiple noise mechanisms
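
The idea behind a bounded mean with incremental computation can be sketched without the library. The class below is a hypothetical stand-in written with the standard library only. It does not call PyDP and its name, methods and budget split are assumptions, so PyDP's actual interface and internals may differ. It clamps each value to the declared bounds, keeps only a running sum and count, and spends half the budget on each when the result is released.

```python
import random


class IncrementalBoundedMean:
    """Hypothetical helper (not PyDP's API): epsilon-DP mean of clamped values."""

    def __init__(self, epsilon: float, lower: float, upper: float, seed=None):
        if epsilon <= 0 or lower >= upper:
            raise ValueError("need epsilon > 0 and lower < upper")
        self.epsilon = epsilon
        self.lower = lower
        self.upper = upper
        self._rng = random.Random(seed)
        self._sum = 0.0
        self._count = 0

    def add_entries(self, values) -> None:
        """Fold a chunk into the running totals; the chunk can then be discarded."""
        for v in values:
            self._sum += min(max(v, self.lower), self.upper)  # clamp to bounds
            self._count += 1

    def _laplace(self, scale: float) -> float:
        return self._rng.expovariate(1.0 / scale) - self._rng.expovariate(1.0 / scale)

    def result(self) -> float:
        """Release a noisy mean. Each call spends the full epsilon again."""
        half = self.epsilon / 2.0  # split the budget between sum and count
        sum_sensitivity = max(abs(self.lower), abs(self.upper))
        noisy_sum = self._sum + self._laplace(sum_sensitivity / half)
        noisy_count = self._count + self._laplace(1.0 / half)
        mean = noisy_sum / max(noisy_count, 1.0)
        return min(max(mean, self.lower), self.upper)  # post-processing


mean = IncrementalBoundedMean(epsilon=1.0, lower=0.0, upper=100.0, seed=0)
for chunk in ([12.0, 48.5, 250.0], [33.0, 71.2], [5.5, 90.0, 64.3]):
    mean.add_entries(chunk)  # 250.0 is clamped to 100.0
private_mean = mean.result()
print(f"private mean = {private_mean:.2f}")
```

Because only the running sum and count are stored, memory use stays constant no matter how many chunks are streamed through.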
- Important considerations when implementing:
  - Understanding data sensitivity
  - Choosing appropriate epsilon values
  - Selecting the right DP algorithm for the use case
  - Evaluating accuracy requirements
  - Memory constraints with large datasets
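
Sensitivity, epsilon and accuracy requirements are linked, and for the Laplace mechanism the link has a closed form: noise with scale b exceeds a threshold a in absolute value with probability exp(-a / b). The sketch below turns that into two small helpers. The function names and the example numbers are assumptions for illustration, and the formula applies to Laplace noise only.

```python
import math


def epsilon_for_accuracy(sensitivity: float, max_error: float,
                         failure_prob: float) -> float:
    """Smallest epsilon for which |Laplace noise| > max_error
    happens with probability at most failure_prob."""
    if not 0.0 < failure_prob < 1.0 or max_error <= 0 or sensitivity <= 0:
        raise ValueError("invalid arguments")
    return sensitivity * math.log(1.0 / failure_prob) / max_error


def error_bound(sensitivity: float, epsilon: float, failure_prob: float) -> float:
    """Error that Laplace noise stays within with probability 1 - failure_prob."""
    if not 0.0 < failure_prob < 1.0 or epsilon <= 0 or sensitivity <= 0:
        raise ValueError("invalid arguments")
    return sensitivity * math.log(1.0 / failure_prob) / epsilon


# Counting query (sensitivity 1) that should be within +/-10 of the truth
# 95% of the time.
needed = epsilon_for_accuracy(sensitivity=1.0, max_error=10.0, failure_prob=0.05)
print(f"epsilon needed: {needed:.3f}")
print(f"95% error bound at epsilon=0.1: {error_bound(1.0, 0.1, 0.05):.1f}")
```

If the epsilon this yields is larger than the privacy requirements allow, either the accuracy target or the query itself has to change.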
- Not suitable for:
  - Individual-level analysis
  - Fraud detection
  - Cases requiring exact answers
  - Very small datasets where noise would be too large
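
The small-dataset limitation follows from simple arithmetic: for a counting query answered with Laplace noise, the typical error is 1/epsilon regardless of how many records there are, so its size relative to the true count shrinks only as the count grows. A short sketch, with an assumed epsilon of 0.5 and a hypothetical helper name:

```python
def relative_noise(true_count: int, epsilon: float) -> float:
    """Expected absolute Laplace noise (1 / epsilon) as a fraction of the true count."""
    if true_count <= 0 or epsilon <= 0:
        raise ValueError("true_count and epsilon must be positive")
    return (1.0 / epsilon) / true_count


for n in (10, 1_000, 100_000):
    print(f"count={n:>7}: typical error is {relative_noise(n, 0.5):.4%} of the answer")
```

At a count of 10 the typical error is a fifth of the answer itself, while at 100,000 it is negligible.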
- Two main approaches:
  - Local DP: noise is added to each individual's data before it is collected or stored
  - Global DP: a centralized, trusted database adds noise to query results
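
The difference between the two approaches can be sketched for a single yes/no attribute. For the local side the example uses randomized response, a classic local DP technique that the summary does not name, so treat it as an illustrative substitute. The global side adds Laplace noise to the true count. All names, the simulated population and the epsilon value are assumptions.

```python
import math
import random


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Local DP: each person reports the truth with probability e^eps / (1 + e^eps)."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def local_dp_estimate(bits, epsilon: float, rng: random.Random) -> float:
    """Estimate the fraction of 1s from noisy reports; raw bits are never stored."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    reports = [randomized_response(b, epsilon, rng) for b in bits]
    observed = sum(reports) / len(reports)
    # E[observed] = true * (2p - 1) + (1 - p), so invert that to debias.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


def global_dp_estimate(bits, epsilon: float, rng: random.Random) -> float:
    """Global DP: a trusted curator holds the raw bits and perturbs the count."""
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)  # Laplace, scale 1/eps
    return (sum(bits) + noise) / len(bits)


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
population = [1] * 3000 + [0] * 7000  # hypothetical: 30% have the attribute
print(f"local  estimate: {local_dp_estimate(population, 1.0, rng):.4f}")
print(f"global estimate: {global_dp_estimate(population, 1.0, rng):.4f}")
```

At the same epsilon the local estimate is usually noisier, because every record is perturbed instead of one aggregate. That is the cost of not needing a trusted curator.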