
Expected Eligibility Traces

van Hasselt, H; Madjiheurem, S; Hessel, M; Barreto, A; Silver, D; Borsa, D (2021) Expected Eligibility Traces. In: Proceedings of the AAAI Conference on Artificial Intelligence. Association for the Advancement of Artificial Intelligence (AAAI) (In press). Green open access.

Madjiheurem_2007.01839.pdf - Accepted Version


Abstract

The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem and remains a central research question in reinforcement learning and artificial intelligence. Eligibility traces enable efficient credit assignment to the recent sequence of states and actions experienced by the agent, but not to counterfactual sequences that could also have led to the current state. In this work, we introduce expected eligibility traces. Expected traces make it possible, with a single update, to update states and actions that could have preceded the current state, even if they did not do so on this occasion. We discuss when expected traces provide benefits over classic (instantaneous) traces in temporal-difference learning, and show that sometimes substantial improvements can be attained. We provide a way to smoothly interpolate between instantaneous and expected traces by a mechanism similar to bootstrapping, which ensures that the resulting algorithm is a strict generalisation of TD(λ). Finally, we discuss possible extensions and connections to related ideas, such as successor features.
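
The contrast drawn in the abstract can be illustrated with a small tabular sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the step sizes `alpha` and `beta`, the mixing parameter `eta`, and the exact form of the update rules are hypothetical choices made here for a tabular value function. The first function is ordinary TD(λ) with accumulating traces. The second keeps one learned trace vector per state, moves it towards the instantaneous trace whenever that state is visited, and mixes the two with `eta`, so that `eta=1.0` recovers TD(λ).

```python
def td_lambda_episode(values, episode, alpha=0.1, gamma=0.9, lam=0.8):
    """Tabular TD(lambda) with accumulating (instantaneous) traces.

    episode is a list of (state, reward, next_state) tuples;
    next_state is None when the episode terminates.
    """
    n = len(values)
    trace = [0.0] * n
    for state, reward, next_state in episode:
        # Decay every trace, then mark the state just visited.
        trace = [gamma * lam * e for e in trace]
        trace[state] += 1.0
        next_value = 0.0 if next_state is None else values[next_state]
        delta = reward + gamma * next_value - values[state]
        for s in range(n):
            values[s] += alpha * delta * trace[s]
    return values


def expected_trace_episode(values, expected, episode, alpha=0.1, beta=0.1,
                           gamma=0.9, lam=0.8, eta=0.0):
    """Sketch of an expected-trace update (assumed tabular form).

    expected[s] is a learned estimate of the average trace vector
    observed on visits to state s. eta interpolates between the
    expected trace (eta=0.0) and the instantaneous trace (eta=1.0).
    """
    n = len(values)
    trace = [0.0] * n   # instantaneous trace, as in TD(lambda)
    mixed = [0.0] * n   # trace actually used in the value update
    for state, reward, next_state in episode:
        trace = [gamma * lam * e for e in trace]
        trace[state] += 1.0
        # Regress the expected trace for this state towards the sample.
        row = expected[state]
        for s in range(n):
            row[s] += beta * (trace[s] - row[s])
        # Bootstrapping-like mixture of expected and accumulated traces.
        decayed = [gamma * lam * y for y in mixed]
        decayed[state] += 1.0
        mixed = [(1.0 - eta) * row[s] + eta * decayed[s] for s in range(n)]
        next_value = 0.0 if next_state is None else values[next_state]
        delta = reward + gamma * next_value - values[state]
        for s in range(n):
            values[s] += alpha * delta * mixed[s]
    return values, expected
```

In this sketch, an episode that starts in state 1 can still change the value of state 0 once `expected` has recorded that state 0 tends to precede the states being visited. With instantaneous traces alone, state 0 would receive no update in that episode.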

Type: Proceedings paper
Title: Expected Eligibility Traces
Event: AAAI Conference on Artificial Intelligence
Location: Virtual
Dates: 02 February 2021 - 09 February 2021
Open access status: An open access version is available from UCL Discovery
Publisher version: https://ojs.aaai.org/index.php/AAAI/article/view/1...
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Reinforcement Learning
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng
URI: https://discovery.ucl.ac.uk/id/eprint/10129837