Mavor-Parker, Augustine N. (2025) Investigating Sample Efficient Deep Reinforcement Learning. Doctoral thesis (Ph.D), UCL (University College London).
Text: Mavor-Parker_Thesis.pdf - Accepted Version (13MB)
Abstract
Deep reinforcement learning (RL) has achieved superhuman performance on games [Silver et al., 2017], learned to control nuclear fusion reactors [Degrave et al., 2022] and been used to adapt foundation models to human preferences [Achiam et al., 2023]. However, adoption in the real world is hindered by the interconnected problems of sample efficiency and generalisation. In this thesis we study three methods for improving the sample efficiency of deep RL agents: curiosity-driven exploration, abstractions of environment state-action pairs, and network architectures that learn value functions more quickly. First, we improve the robustness of curiosity-driven agents [Pathak et al., 2017] to noisy transitions by enabling agents to predict the amount of randomness in their environment and then decrease their curiosity about transitions that are predicted to be stochastic. Next, to reduce the number of states an RL agent is required to explore, we learn forwards and backwards models of the environment to abstract away equivalent state-action pairs, which we formalise with MDP homomorphisms [Ravindran, 2004]. Finally, we examine the use of periodic activation functions to improve sample efficiency [Li and Pathak, 2021, Yang et al., 2022], showing that the frequencies of the learned representations become exceedingly large and that the representations consequently generalise poorly. To improve their generalisation, we suggest penalising the magnitude of the frequency of periodic representations during learning.
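The abstract does not give the exact form of the noise-aware curiosity bonus, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the agent outputs both a predicted next state and a per-feature predicted variance, and that the bonus is the squared prediction error minus that variance, clipped at zero. The function name `intrinsic_reward` and all of its parameters are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def intrinsic_reward(
    predicted_next: Sequence[float],
    observed_next: Sequence[float],
    predicted_variance: Sequence[float],
) -> float:
    """Curiosity bonus discounted by predicted noise (assumed form, not the thesis's)."""
    n = len(predicted_next)
    if n == 0:
        raise ValueError("inputs must be non-empty")
    if len(observed_next) != n or len(predicted_variance) != n:
        raise ValueError("all inputs must have the same length")

    # Squared prediction error per feature: the usual curiosity signal.
    squared_error = [(p - o) ** 2 for p, o in zip(predicted_next, observed_next)]

    # Subtract the variance the agent itself predicted for each feature and
    # clip at zero, so error that was expected to be noise earns no bonus.
    discounted = [max(0.0, e - v) for e, v in zip(squared_error, predicted_variance)]

    return sum(discounted) / n


# Same prediction error, different predicted noise levels.
surprising = intrinsic_reward([0.0, 0.0], [1.0, 1.0], [0.0, 0.0])
expected_noise = intrinsic_reward([0.0, 0.0], [1.0, 1.0], [1.0, 1.0])
```

In this sketch a transition the agent believed to be deterministic keeps its full bonus, while the same error on a transition the agent predicted to be noisy is discounted to zero, which is the qualitative behaviour the abstract describes.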
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Investigating Sample Efficient Deep Reinforcement Learning |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2025. Original content in this thesis is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science; UCL |
URI: | https://discovery.ucl.ac.uk/id/eprint/10211224 |