UCL Discovery

Evaluating Methods for Privacy-Preserving Data Sharing in Genomics

Oprisanu, Maria-Bristena (2022) Evaluating Methods for Privacy-Preserving Data Sharing in Genomics. Doctoral thesis (Ph.D), UCL (University College London). Green open access

main (3).pdf - Submitted Version


The availability of genomic data is often essential to progress in biomedical research, personalized medicine, drug development, etc. However, its extreme sensitivity makes it problematic, if not outright impossible, to publish or share. In this dissertation, we study and build systems geared towards privacy-preserving genomic data sharing. We first look at the Matchmaker Exchange, a platform that connects multiple distributed databases through an API and allows researchers to query for genetic variants in other databases through the network. However, queries are broadcast to all researchers that made a similar query in any of the connected databases, which can lead to a reluctance to use the platform, due to loss of privacy or competitive advantage. To overcome this reluctance, we propose a framework to support anonymous querying on the platform. Since genomic data’s sensitivity does not degrade over time, we analyze the real-world guarantees provided by the only tool available for long-term genomic data storage. We find that the system offers low security when the adversary has access to side information, and we support our claims with empirical evidence. We also study the viability of synthetic data for privacy-preserving data sharing. Since the utility of the data provided is of the utmost importance for genomic data research, we first perform a utility evaluation of generative models on different types of datasets (i.e., financial data, images, and locations). Then, we propose a privacy evaluation framework for synthetic data. Finally, we perform a measurement study assessing state-of-the-art generative models specifically geared for human genomic data, from both utility and privacy perspectives. Overall, we find that no single approach for generating synthetic data performs well across the board on both utility and privacy.

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Evaluating Methods for Privacy-Preserving Data Sharing in Genomics
Event: UCL
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2021. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
Keywords: Genomics, Privacy-preserving technologies, Machine Learning, Cryptography
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10142367