UCL Discovery

Investigating Sample Efficient Deep Reinforcement Learning

Mavor-Parker, Augustine N. (2025) Investigating Sample Efficient Deep Reinforcement Learning. Doctoral thesis (Ph.D), UCL (University College London). Green open access

Mavor-Parker_Thesis.pdf - Accepted Version

Abstract

Deep reinforcement learning (RL) has achieved superhuman performance on games [Silver et al., 2017], learned to control nuclear fusion reactors [Degrave et al., 2022] and can be used to adapt foundation models to human preferences [Achiam et al., 2023]. However, adoption in the real world is hindered by the interconnected problems of sample efficiency and generalisation. In this thesis we study three methods for improving the sample efficiency of deep RL agents: curiosity-driven exploration, abstractions of environment state-action pairs, and network architectures that learn value functions more quickly. We improve the robustness of curiosity-driven agents [Pathak et al., 2017] to noisy transitions by enabling agents to predict the amount of randomness in their environment and then decrease their curiosity about transitions that are predicted to be stochastic. Next, to reduce the number of states an RL agent is required to explore, we learn forwards and backwards models of the environment to abstract away equivalent state-action pairs, which we formalise with MDP homomorphisms [Ravindran, 2004]. Finally, we examine the use of periodic activation functions to improve sample efficiency [Li and Pathak, 2021; Yang et al., 2022], showing that the frequencies of the learned representations become exceedingly large and that these representations therefore generalise poorly. To improve their generalisation, we suggest penalising the magnitude of the frequency of periodic representations during learning.
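
As a rough illustration of two ideas named in the abstract, the sketch below shows one way a curiosity bonus could be reduced for transitions predicted to be stochastic, and one way large representation frequencies could be penalised. It is not taken from the thesis: the function names, the choice to subtract the predicted variance from the squared prediction error, the clipping at zero, and the squared-magnitude penalty are all assumptions made for this example.

```python
def curiosity_reward(predicted_next, observed_next, predicted_variance):
    """Hypothetical intrinsic reward: squared prediction error minus the
    noise the agent itself predicted, clipped at zero (an assumed form)."""
    squared_error = sum(
        (p - o) ** 2 for p, o in zip(predicted_next, observed_next)
    )
    return max(0.0, squared_error - sum(predicted_variance))


def frequency_penalty(frequency_weights, coefficient):
    """Hypothetical regulariser: coefficient times the sum of squared
    weights that set the frequencies of a periodic layer sin(W x + b)."""
    return coefficient * sum(w * w for row in frequency_weights for w in row)


# Same prediction error, different predicted noise levels.
surprising = curiosity_reward([0.0, 0.0], [1.0, 1.0], [0.0, 0.0])
noisy = curiosity_reward([0.0, 0.0], [1.0, 1.0], [1.0, 1.0])

# Penalty that would be added to the value-learning loss.
penalty = frequency_penalty([[3.0, 4.0]], 0.5)
```

With identical prediction errors, the transition the agent expects to be noisy earns a smaller bonus than the one it expects to be deterministic, and the penalty grows with the magnitude of the frequency weights.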

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Investigating Sample Efficient Deep Reinforcement Learning
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2025. Original content in this thesis is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10211224
Downloads since deposit: 36