Differentially Private Synthetic Data without Training
Generating differentially private (DP) synthetic data that closely resembles original data while preserving user privacy is a scalable solution to address privacy concerns in today’s data-driven world.
In this talk, I will introduce Private Evolution (PE), a new training-free framework for DP synthetic data generation, which contrasts with existing approaches that rely on training DP generative models. PE treats foundation models as black boxes and utilizes only their inference APIs. We demonstrate that across both images and text, PE: (1) matches or even outperforms prior state-of-the-art (SoTA) methods in the fidelity-privacy trade-off without any model training; (2) enables the use of advanced open-source models (e.g., Mixtral) and API-based models (e.g., GPT-3.5), where previous SoTA approaches are inapplicable; and (3) is more computationally efficient than prior SoTA methods.
Additionally, I will discuss recent extensions of PE, both from our own work and from the broader community, including the integration of data simulators, the fusion of knowledge from multiple models for DP data synthesis, and applications in federated learning. We hope that PE unlocks the full potential of foundation models in privacy-preserving machine learning and accelerates the adoption of DP synthetic data across industries.
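To make the training-free idea concrete, the core PE loop can be sketched as follows. This is a minimal conceptual illustration on toy one-dimensional data, not the authors' implementation: `variation_api` is a hypothetical stand-in for a foundation model's inference API, and the DP guarantee comes from noising the nearest-neighbor vote histogram rather than from training any model.

```python
import random

def pe_iteration(private_data, synthetic, variation_api, sigma, num_samples):
    """One Private Evolution step (toy 1-D sketch): private samples vote
    for their nearest synthetic candidate, Gaussian noise makes the vote
    histogram differentially private, and the resampled candidates are
    mutated via the (hypothetical) variation API of a foundation model."""
    # Nearest-neighbor voting: each private point supports one candidate.
    votes = [0.0] * len(synthetic)
    for x in private_data:
        nearest = min(range(len(synthetic)),
                      key=lambda i: abs(x - synthetic[i]))
        votes[nearest] += 1.0
    # Privacy comes from noising the histogram, not from model training.
    noisy = [max(v + random.gauss(0.0, sigma), 0.0) for v in votes]
    total = sum(noisy)
    weights = noisy if total > 0 else None  # fall back to uniform resampling
    # Resample according to the noisy votes, then ask the API for variations.
    chosen = random.choices(synthetic, weights=weights, k=num_samples)
    return [variation_api(s) for s in chosen]
```

Iterating this step evolves the synthetic set toward the private distribution while the only interaction with the foundation model is through its inference API, which is what lets PE work with API-only models such as GPT-3.5.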
Series: Cryptography Talk Series
How to Compress Garbled Circuit Input Labels, Efficiently
Speaker: Hanjun Li

Attestations over TLS 1.3 and ZKP
Speaker: Sofía Celi

A Closer Look at Falcon
Speaker: Jonas Janneck

Quantum Lattice Enumeration in Limited Depth
Speaker: Fernando Virdia

Improving the Security of United States Elections with Robust Optimization
Speaker: Brad Sturt

TrustRate: A Decentralized Platform for Hijack-Resistant Anonymous Reviews
Speaker: Rohit Dwivedula