UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

Towards Scalable, Flexible, and Interpretable Self-Supervised Learning for Multiview Biomedical Data

Chapman, James; (2024) Towards Scalable, Flexible, and Interpretable Self-Supervised Learning for Multiview Biomedical Data. Doctoral thesis (Ph.D), UCL (University College London). Green open access

[thumbnail of main.pdf]
Preview
Text
main.pdf - Accepted Version

Download (5MB) | Preview

Abstract

Imagine a world where images of you, data from your smartwatch, and your electronic health records could seamlessly integrate to paint a comprehensive picture of your health. Now, envision this on a global scale, where vast amounts of diverse biomedical data are harnessed to target drug trials, personalize treatment, and improve the lives of millions. This is the promise of multiview learning in the era of big data, but it comes with a significant challenge: How can we effectively integrate and analyze these complex, heterogeneous data sources? The need for novel methods to tackle this challenge is paramount. Traditional approaches often struggle with the sheer scale and intricacy of modern biomedical datasets, limiting our ability to uncover crucial insights and advance personalized medicine. This thesis addresses this critical need by developing cutting-edge machine learning techniques that leverage the power of self-supervised and multiview learning, focusing on improving Canonical Correlation Analysis (CCA) for enormous datasets with rich structure and complex, possibly non-linear relationships. The primary contributions of this thesis are fourfold. First, a framework for regularised CCA using structured priors is developed, enhancing the interpretability of the results. Second, simulated data generation methods for CCA are unified under a latent variable model perspective, improving our understanding of the relationship between loadings and weights in CCA. Third, a new gradient descent approach for CCA and other generalised eigenvalue problems is formulated, tailored for large datasets. Finally, this gradient descent approach is extended to Deep CCA and Joint Embedding Self-Supervised Learning, enabling the integration of diverse data sources using modern deep learning techniques. Finally, we make all of our code and simulated data publicly available, ensuring that our research is reproducible and accessible to the wider scientific community.

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Towards Scalable, Flexible, and Interpretable Self-Supervised Learning for Multiview Biomedical Data
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2024. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10201103
Downloads since deposit
Loading...
15Downloads
Download activity - last month
Loading...
Download activity - last 12 months
Loading...
Downloads by country - last 12 months
Loading...

Archive Staff Only

View Item View Item