UCL Discovery

Healing failures and improving generalization in deep generative modelling

Zhang, Mingtian; (2023) Healing failures and improving generalization in deep generative modelling. Doctoral thesis (Ph.D), UCL (University College London). Green open access

Full text: PhD_Thesis_MingtianZhang.pdf (2MB)

Abstract

Deep generative modelling is a crucial and rapidly developing area of machine learning, with numerous potential applications including data generation, anomaly detection, and data compression. Despite the significant empirical success of many generative models, some limitations still need to be addressed to improve their performance in certain cases. This thesis focuses on understanding the limitations of generative modelling in common scenarios and proposes techniques to alleviate them and improve performance in practical generative modelling applications. The thesis is divided into two sub-topics: one focusing on the training of generative models and the other on their generalization. A brief introduction to each sub-topic is provided below.

Generative models are typically trained by optimizing their fit to the data distribution, which is achieved by minimizing a statistical divergence between the model and data distributions. However, there are cases where these divergences fail to accurately capture the differences between the model and data distributions, resulting in poor performance of the trained model. In the first part of the thesis, we discuss two situations in which classical divergences are ineffective for training generative models:

1. The KL divergence fails to train implicit models for manifold modelling tasks.
2. The Fisher divergence cannot distinguish mixture proportions when modelling multi-modal target distributions.

For both failure modes, we investigate the theoretical reasons underlying the failures of the KL and Fisher divergences in modelling certain types of data distributions, and we propose techniques that address these limitations, enabling more reliable estimation of the underlying data distributions.

While the generalization of classification and regression models has been extensively studied in machine learning, the generalization of generative models is a relatively under-explored area. In the second part of the thesis, we aim to address this gap by investigating the generalization properties of generative models in two scenarios:

1. In-distribution (ID) generalization, where the test data and the training data come from the same distribution.
2. Out-of-distribution (OOD) generalization, where the test data and the training data can come from different distributions.

For ID generalization our emphasis rests on the Variational Auto-Encoder (VAE), and for OOD generalization we primarily explore autoregressive models. By studying the generalization properties of these models, we demonstrate how to design new models and training criteria that improve performance in practical applications such as lossless compression and OOD detection.

The findings of this thesis shed light on the intricate challenges faced by generative models in both training and generalization. Our investigation of the shortcomings of classical divergences such as the KL and Fisher divergences highlights the importance of tailoring modelling techniques to the specific characteristics of the data distribution, and our study of the generalization of generative models offers insights into the ID and OOD settings, a topic not extensively covered in the current literature. Collectively, the insights and techniques presented in this thesis provide valuable contributions to the community, fostering the development of more robust and reliable generative models. We hope that these take-home messages will serve as a foundation for future research and applications in deep generative modelling.
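To make the two divergence failure modes named in the abstract concrete, the following sketch restates the standard definitions and the well-known properties behind them; it is an illustrative summary of known results, not a reproduction of the thesis's own derivations or proposed remedies.

\[
D_{\mathrm{KL}}(p \,\|\, q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx
\]

If the data distribution \(p\) is concentrated on a low-dimensional manifold and an implicit model \(q\) (for example, a generator pushing a low-dimensional latent variable through a network) is concentrated on a different manifold, the two supports typically overlap only on a set of measure zero. The density ratio \(p(x)/q(x)\) is then undefined or infinite almost everywhere, so the KL divergence gives no useful training signal for the implicit model.

\[
D_{\mathrm{F}}(p \,\|\, q) = \tfrac{1}{2}\, \mathbb{E}_{p(x)}\!\left[ \left\| \nabla_x \log p(x) - \nabla_x \log q(x) \right\|^2 \right]
\]

For a mixture \(p(x) = \pi\, p_1(x) + (1-\pi)\, p_2(x)\) with well-separated components, the score \(\nabla_x \log p(x)\) is approximately \(\nabla_x \log p_i(x)\) inside the region dominated by component \(i\), because the mixture weight enters \(\log p\) only as a near-constant additive term that vanishes under the gradient. The Fisher divergence is therefore almost insensitive to the proportion \(\pi\), and a model with the wrong mixture weights can still attain a near-zero divergence.

On the generalization side, the connection the abstract draws between ID generalization of a VAE and lossless compression rests on the standard relationship between model likelihood and code length, sketched below under the usual bits-back coding assumption; the specific models and training criteria developed in the thesis are not reproduced here.

\[
\mathbb{E}_{p_{\mathrm{data}}(x)}\big[\ell_\theta(x)\big] \approx \mathbb{E}_{p_{\mathrm{data}}(x)}\big[-\log_2 p_\theta(x)\big]
\]

An entropy coder driven by a model \(p_\theta\) assigns roughly \(-\log_2 p_\theta(x)\) bits to a datapoint \(x\), so the expected file size on unseen data is the cross-entropy between the data distribution and the model. For a latent-variable model such as a VAE, where \(\log p_\theta(x)\) is intractable, bits-back coding attains an expected code length close to the negative evidence lower bound (ELBO), so better held-out likelihood, i.e. better ID generalization, translates directly into shorter compressed files.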

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Healing failures and improving generalization in deep generative modelling
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2023. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10178454