eprintid: 10137834
rev_number: 22
eprint_status: archive
userid: 608
dir: disk0/10/13/78/34
datestamp: 2021-12-16 15:37:15
lastmod: 2021-12-16 15:37:15
status_changed: 2021-12-16 15:37:15
type: thesis
metadata_visibility: show
creators_name: Shah, Harshil Bharat
title: Deep Generative Models for Natural Language
ispublished: unpub
divisions: UCL
divisions: B04
divisions: C05
divisions: F48
note: Copyright © The Author 2021. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
abstract: Generative models aim to simulate the process by which a set of data is generated. They are intuitive, interpretable, and naturally suited to learning from unlabelled data. This is particularly appealing in natural language processing, where labels are often costly to obtain and can require significant manual input from trained annotators. However, traditional generative modelling approaches can often be inflexible due to the need to maintain tractable maximum likelihood training. On the other hand, deep learning methods are powerful, flexible, and have achieved significant success on a wide variety of natural language processing tasks. In recent years, algorithms have been developed for training generative models that incorporate neural networks to parametrise their conditional distributions. These approaches aim to combine the intuitiveness and interpretability of generative models with the power and flexibility of deep learning. In this work, we investigate how to leverage such algorithms to develop deep generative models for natural language. Firstly, we present an attention-based latent variable model, trained using unlabelled data, for learning representations of sentences. Experiments such as missing word imputation and sentence similarity matching suggest that the representations capture semantic information about the sentences. We then present an RNN-based latent variable model for performing machine translation. Trained using semi-supervised learning, our approach achieves strong results even with very limited labelled data. Finally, we present a locally-contextual conditional random field for performing sequence labelling tasks. Our method consistently outperforms the linear-chain conditional random field and achieves state-of-the-art performance on two of the four tasks evaluated.
date: 2021-11-28
date_type: published
oa_status: green
full_text_type: other
thesis_class: doctoral_open
thesis_award: Ph.D
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 1897575
lyricists_name: Shah, Harshil
lyricists_id: HSHAH40
actors_name: Shah, Harshil
actors_id: HSHAH40
actors_role: owner
full_text_status: public
pages: 98
event_title: UCL (University College London)
institution: UCL (University College London)
department: Computer Science
thesis_type: Doctoral
citation: Shah, Harshil Bharat; (2021) Deep Generative Models for Natural Language. Doctoral thesis (Ph.D), UCL (University College London). Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10137834/1/PhD_Thesis_Revised.pdf