Pignatelli, Eduardo (2025) The Credit Assignment Problem in Reinforcement Learning. Doctoral thesis (Ph.D), UCL (University College London).
Abstract
Reinforcement Learning (RL) has made significant progress in a variety of domains, from playing games to controlling robots. However, while RL works reasonably well in problems where rewards are dense and immediate, it becomes significantly harder when they are sparse and delayed. This is the case for most real-world decision problems: they take a long time to complete, and they seldom provide immediate feedback; feedback usually arrives with delay and with little insight into which actions caused it. The problem of learning to associate actions with their long-term outcomes is known as the temporal Credit Assignment Problem (CAP): to distribute the credit for success among the multitude of decisions involved (Minsky, 1961). This dissertation stems from the idea that improving the ability to predict (i.e., to assign credit) is the most effective way to enhance the agents’ ability to make optimal decisions (i.e., to control) in a broad range of tasks. The manuscript is a collection of experiments and theoretical contributions with two aims: to better understand the CAP, and to propose new methods to address it. We provide a comprehensive survey of the field, the first since the CAP was introduced by Minsky (1961). We realign the original CAP to Deep RL, organise the set of methods into a coherent perspective, and define a call for action for scaling RL to real-world problems. Following this call, we then investigate AI-assisted RL, using the prior knowledge and reasoning capabilities of Large Language Models (LLMs) to assist and supervise RL training. Finally, we focus on closing the gap between the infeasible computational demand of RL and the limited resources available in academia, reimplementing MiniGrid, a popular RL benchmark, in a more efficient and scalable way.
| Type: | Thesis (Doctoral) |
|---|---|
| Qualification: | Ph.D |
| Title: | The Credit Assignment Problem in Reinforcement Learning |
| Open access status: | An open access version is available from UCL Discovery |
| Language: | English |
| Additional information: | Copyright © The Author 2025. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) Licence (https://creativecommons.org/licenses/by-nc-nd/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
| Keywords: | Reinforcement Learning, Credit Assignment, Language Models, LLMs, NAVIX, CALM, Survey |
| UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng; UCL |
| URI: | https://discovery.ucl.ac.uk/id/eprint/10211495 |

