UCL Discovery

Operator World Models for Reinforcement Learning

Novelli, Pietro; Pratticò, Marco; Pontil, Massimiliano; Ciliberto, Carlo; (2024) Operator World Models for Reinforcement Learning. In: Globerson, Amir and Mackey, Lester and Belgrave, Danielle and Fan, Angela and Paquet, Ulrich and Tomczak, Jakub M and Zhang, Cheng, (eds.) Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024). (pp. 1-32). NeurIPS. Green open access

Abstract

Policy Mirror Descent (PMD) is a powerful and theoretically sound methodology for sequential decision-making. However, it is not directly applicable to Reinforcement Learning (RL) due to the inaccessibility of explicit action-value functions. We address this challenge by introducing a novel approach based on learning a world model of the environment using conditional mean embeddings. Leveraging tools from operator theory, we derive a closed-form expression of the action-value function in terms of the world model via simple matrix operations. Combining these estimators with PMD leads to POWR, a new RL algorithm for which we prove convergence rates to the global optimum. Preliminary experiments in finite and infinite state settings support the effectiveness of our method.
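
To make the two ingredients named in the abstract concrete, here is a minimal tabular sketch, not the paper's POWR algorithm. It assumes a finite MDP whose transition matrix is known exactly (standing in for the learned world model), computes the action-value function in closed form with a single linear solve, and applies the standard KL-regularised mirror descent update, pi_new(a|s) proportional to pi(a|s) * exp(eta * Q(s, a)). The function names, the toy two-state MDP, and the values of gamma and eta are all illustrative assumptions.

```python
import numpy as np


def q_closed_form(P, r, pi, gamma):
    """Action-value function of policy pi in a finite MDP via one linear solve.

    P: (S, A, S) transition probabilities, r: (S, A) rewards, pi: (S, A) policy.
    Solves (I - gamma * P_pi) q = r, where P_pi acts on state-action pairs.
    """
    S, A = r.shape
    # P_pi[(s, a), (s', a')] = P[s, a, s'] * pi[s', a']
    P_pi = (P[:, :, :, None] * pi[None, None, :, :]).reshape(S * A, S * A)
    q = np.linalg.solve(np.eye(S * A) - gamma * P_pi, r.reshape(S * A))
    return q.reshape(S, A)


def pmd_step(log_pi, Q, eta):
    """One mirror descent step with KL divergence, done in log space."""
    z = log_pi + eta * Q
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def run_pmd(P, r, gamma=0.9, eta=1.0, iters=200):
    """Alternate closed-form policy evaluation and PMD updates."""
    S, A = r.shape
    log_pi = np.full((S, A), -np.log(A))  # start from the uniform policy
    for _ in range(iters):
        Q = q_closed_form(P, r, np.exp(log_pi), gamma)
        log_pi = pmd_step(log_pi, Q, eta)
    pi = np.exp(log_pi)
    return pi, q_closed_form(P, r, pi, gamma)


# Hypothetical toy MDP: action 0 stays, action 1 switches state;
# any action taken in state 1 earns reward 1.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[0, 1, 1] = P[1, 0, 1] = P[1, 1, 0] = 1.0
r = np.array([[0.0, 0.0], [1.0, 1.0]])

pi, Q = run_pmd(P, r)
greedy = pi.argmax(axis=1)  # preferred action in each state
```

In this toy problem the policy should concentrate on switching out of state 0 and staying in state 1. The paper's setting differs in that the transition operator is not known and must be estimated from data, which is where the conditional mean embeddings come in.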

Type: Proceedings paper
Title: Operator World Models for Reinforcement Learning
Event: 38th Conference on Neural Information Processing Systems (NeurIPS 2024)
Open access status: An open access version is available from UCL Discovery
Publisher version: https://papers.nips.cc/paper_files/paper/2024
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher's terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10205687