
Optimising Language Models for Downstream Tasks: A Post-Training Perspective

Shi, Zhengyan; (2025) Optimising Language Models for Downstream Tasks: A Post-Training Perspective. Doctoral thesis (Ph.D), UCL (University College London). Green open access

Text: _Zhengxiang___PhD_Thesis__Adapting_Language_Models_to_downstream_tasks.pdf - Accepted Version
Download (3MB)

Abstract

Language models (LMs) have demonstrated remarkable capabilities in natural language processing (NLP) tasks, yet harnessing their full potential for specific applications remains a significant challenge. As the scale and complexity of these models continue to grow, adapting them efficiently and robustly becomes increasingly difficult. The common paradigm of fine-tuning LMs on labelled data often fails to leverage the vast amount of available unlabelled data, can lead to overfitting on small task-specific datasets, and incurs substantial computational costs. These limitations are particularly problematic in real-world scenarios, which present an open-ended landscape of language tasks and domains. This thesis proposes a series of innovative methods for tailoring LMs to task-specific applications, addressing key challenges in adapting these models to downstream tasks. First, we explore strategies for maximising the use of unlabelled data in scenarios with limited labelled resources. Our goal is to extract task-relevant knowledge from unlabelled data to improve LM performance on specific tasks, allowing for more robust alignment with task demands. This research led to the development of novel continued pre-training techniques that outperform state-of-the-art semi-supervised approaches. Next, we introduce a new parameter-efficient fine-tuning method that substantially reduces the memory and time costs of fine-tuning LMs. This method enables more efficient alignment of models to downstream tasks, making fine-tuning more feasible while maintaining competitive performance. Additionally, we improve supervised fine-tuning approaches to strengthen LMs’ ability to follow instructions, particularly when learning resources are limited. This enables LMs to perform more effectively across a variety of NLP tasks, including open-ended generation, enhancing their flexibility and usefulness in real-world applications. Furthermore, to better understand and assess LM performance on specific downstream tasks, we create new benchmarks and evaluation methods, including tests of complex cognitive abilities such as multi-hop spatial reasoning, providing more comprehensive and nuanced ways to evaluate LM capabilities and adaptations. Through extensive empirical evaluations across a diverse set of NLP tasks, our findings demonstrate that the proposed methods substantially improve the robustness, efficiency, and generalisation of LMs across a broad array of language tasks. The approaches presented in this thesis represent promising steps toward more robust and efficient LMs, bringing us closer to achieving artificial general intelligence.
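For readers unfamiliar with parameter-efficient fine-tuning, the sketch below illustrates one common instance of the idea: freezing the pretrained weights and training only a small low-rank update, in the style of LoRA. This is a generic PyTorch illustration of the technique family, not the specific method contributed by the thesis; the class name LoRALinear and the hyperparameters r and alpha are assumptions made for the example.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B(A x). Only A and B receive gradients.
    Illustrative only; not the method proposed in the thesis."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        # A starts as small noise and B as zeros, so the adapted layer
        # initially computes exactly the same function as the base layer.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)

# Example: adapting a 768-to-768 projection trains 2 * 8 * 768 = 12,288
# parameters instead of the ~590,000 updated by full fine-tuning.
layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 12288

Because only A and B receive gradients, the gradient and optimiser-state memory scale with the adapter rank rather than with the full weight matrices, which is the general mechanism behind the memory and time savings that parameter-efficient methods such as the one in this thesis aim for.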

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Optimising Language Models for Downstream Tasks: A Post-Training Perspective
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2025. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Civil, Environ and Geomatic Eng
URI: https://discovery.ucl.ac.uk/id/eprint/10209508