UCL Discovery

Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data

Ganev, Georgi; Oprisanu, Bristena; De Cristofaro, Emiliano; (2022) Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data. Proceedings of the 39th International Conference on Machine Learning (ICML 2022), 162, pp. 6944-6959. Green open access

Abstract

Generative models trained with Differential Privacy (DP) can be used to generate synthetic data while minimizing privacy risks. We analyze the impact of DP on these models vis-a-vis underrepresented classes/subgroups of data, specifically, studying: 1) the size of classes/subgroups in the synthetic data, and 2) the accuracy of classification tasks run on them. We also evaluate the effect of various levels of imbalance and privacy budgets. Our analysis uses three state-of-the-art DP models (PrivBayes, DP-WGAN, and PATE-GAN) and shows that DP yields opposite size distributions in the generated synthetic data. It affects the gap between the majority and minority classes/subgroups; in some cases by reducing it (a "Robin Hood" effect) and, in others, by increasing it (a "Matthew" effect). Either way, this leads to (similar) disparate impacts on the accuracy of classification tasks on the synthetic data, affecting disproportionately more the underrepresented subparts of the data. Consequently, when training models on synthetic data, one might incur the risk of treating different subpopulations unevenly, leading to unreliable or unfair conclusions.
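
The size comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the use of class shares, and the tolerance `tol` are assumptions made for illustration. It measures the gap between the largest and smallest class share in the real data and in the synthetic data, then labels a narrower synthetic gap a "Robin Hood" effect and a wider one a "Matthew" effect.

```python
from collections import Counter


def majority_minority_gap(labels, classes):
    """Gap between the largest and smallest class share in `labels`.

    `classes` lists every class to consider, so a class that is missing
    from `labels` counts as a share of 0 instead of being ignored.
    """
    counts = Counter(labels)
    total = len(labels)
    shares = [counts[c] / total for c in classes]
    return max(shares) - min(shares)


def size_effect(real_labels, synth_labels, tol=0.01):
    """Label the change in class imbalance from real to synthetic data.

    `tol` is a hypothetical tolerance below which the gap is treated as
    unchanged; it is not a value taken from the paper.
    """
    classes = set(real_labels)
    real_gap = majority_minority_gap(real_labels, classes)
    synth_gap = majority_minority_gap(synth_labels, classes)
    if synth_gap < real_gap - tol:
        return "Robin Hood"  # gap between majority and minority shrinks
    if synth_gap > real_gap + tol:
        return "Matthew"     # gap between majority and minority grows
    return "unchanged"


# Toy labels: a 90/10 imbalance in the real data.
real = ["a"] * 90 + ["b"] * 10
flatter = ["a"] * 70 + ["b"] * 30   # minority over-represented
steeper = ["a"] * 99 + ["b"] * 1    # minority nearly erased

print(size_effect(real, flatter))
print(size_effect(real, steeper))
```

The gap is computed over the classes present in the real data, so a minority class that disappears entirely from the synthetic data registers as the widest possible gap.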

Type: Article
Title: Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data
Open access status: An open access version is available from UCL Discovery
Publisher version: https://proceedings.mlr.press/v162/ganev22a.html
Language: English
Additional information: © The Author 2022. Original content in this paper is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/).
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10155764