Muszynski, Rafal (2020).
The role of reward signal in deep reinforcement learning.
Masters thesis (M.Phil), UCL (University College London).
Abstract
This thesis studies the role of the reward signal in deep reinforcement learning. The reward signal is a scalar quantity received by the agent, and it strongly affects both the training process of a reinforcement learning algorithm and the behaviour that results. Firstly, we study the behaviour of an agent that learns from different reward signals in the same environment with the same learning algorithm. We introduce and measure an agent's happiness, defined as the relation between the actual reward the agent obtains from the environment and the maximum and minimum rewards possible in a given setting. The experiments show that rewards intended to produce a given behaviour during training do not produce the same behaviour when agents interact with each other. Secondly, we use these observations to investigate the role of the reward signal further: we explore the space of all possible reward signals in a given environment with an evolutionary algorithm. Through experiments, we demonstrate that it is possible to learn the complex behaviours of winning, losing, and cooperating through reward signal evolution. Some of the solutions found by the algorithm are surprising, in the sense that a person trying to hand-code a given behaviour through a specific reward signal would probably not have chosen them. The results presented in the thesis indicate that the role of the reward signal in reinforcement learning is likely bigger than its current coverage in the literature suggests, and that it is worth investigating in greater detail. Doing so can lead to programs that overfit less, and it can also improve our understanding of what reinforcement learning algorithms are really learning. This in turn should give us more robust, explainable, and overall safer systems.
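
The abstract describes happiness only in words, as the actual reward set against the possible maximum and minimum. One plausible reading is a min-max normalisation, sketched below. This is an assumption made for illustration and not necessarily the definition used in the thesis; the function name `happiness` and its parameters are hypothetical.

```python
def happiness(actual: float, min_reward: float, max_reward: float) -> float:
    """Scale a reward to [0, 1] relative to the achievable range.

    Assumed form (min-max normalisation); the thesis may define
    happiness differently.
    """
    if max_reward <= min_reward:
        raise ValueError("max_reward must be greater than min_reward")
    # 0.0 means the worst achievable reward, 1.0 the best.
    return (actual - min_reward) / (max_reward - min_reward)


if __name__ == "__main__":
    # An agent that draws (reward 0) in a setting with rewards in [-1, 1].
    print(happiness(0.0, -1.0, 1.0))  # prints 0.5
```

Under this reading, agents trained with differently scaled reward signals can be compared on a common scale, since each score is relative to that agent's own reward range.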
Type: | Thesis (Masters) |
---|---|
Qualification: | M.Phil |
Title: | The role of reward signal in deep reinforcement learning |
Event: | UCL (University College London) |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2020. Original content in this thesis is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10108297 |