Operator World Models for Reinforcement Learning

Novelli, Pietro; Pratticò, Marco; Pontil, Massimiliano; Ciliberto, Carlo (2024) Operator World Models for Reinforcement Learning. In: Globersons, Amir and Mackey, Lester and Belgrave, Danielle and Fan, Angela and Paquet, Ulrich and Tomczak, Jakub M and Zhang, Cheng, (eds.) Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024). (pp. 1-32). NeurIPS. Green open access

PDF: 20052_Operator_World_Models_fo.pdf (Published Version, 774kB)

Abstract

Policy Mirror Descent (PMD) is a powerful and theoretically sound methodology for sequential decision-making. However, it is not directly applicable to Reinforcement Learning (RL) because explicit action-value functions are not accessible. We address this challenge by introducing a novel approach based on learning a world model of the environment using conditional mean embeddings. Leveraging tools from operator theory, we derive a closed-form expression of the action-value function in terms of the world model via simple matrix operations. Combining these estimators with PMD leads to POWR, a new RL algorithm for which we prove convergence rates to the global optimum. Preliminary experiments in finite and infinite state settings support the effectiveness of our method.
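The pipeline described in the abstract (learn a world model, obtain the action-value function in closed form by matrix operations, then take a PMD step) can be illustrated in a small tabular setting. The sketch below is not the authors' POWR implementation. It assumes a finite MDP, uses an empirical count-based transition estimate as a stand-in for the conditional mean embedding, and uses the KL (softmax) form of mirror descent. All function names, the toy MDP, and the values of `gamma` and `eta` are hypothetical choices made for illustration.

```python
import numpy as np


def estimate_model(transitions, n_states, n_actions):
    """Estimate P[s, a, s'] and r[s, a] from (s, a, reward, s') samples by counting.

    Assumption: counting is a tabular stand-in for a learned world model.
    """
    counts = np.zeros((n_states, n_actions, n_states))
    reward_sum = np.zeros((n_states, n_actions))
    for s, a, reward, s_next in transitions:
        counts[s, a, s_next] += 1.0
        reward_sum[s, a] += reward
    visits = counts.sum(axis=2)
    safe_visits = np.maximum(visits, 1.0)
    # Unvisited (s, a) pairs fall back to a uniform next-state guess and zero reward.
    P = np.where(visits[..., None] > 0, counts / safe_visits[..., None], 1.0 / n_states)
    r = reward_sum / safe_visits
    return P, r


def q_closed_form(P, r, policy, gamma):
    """Solve the Bellman equation Q = r + gamma * P_pi Q with one linear solve."""
    n_states, n_actions = r.shape
    n = n_states * n_actions
    # P_pi[(s, a), (s', a')] = P[s, a, s'] * policy[s', a']
    P_pi = (P[:, :, :, None] * policy[None, None, :, :]).reshape(n, n)
    q = np.linalg.solve(np.eye(n) - gamma * P_pi, r.reshape(-1))
    return q.reshape(n_states, n_actions)


def pmd_step(policy, Q, eta):
    """KL mirror descent step: new_policy(a|s) is proportional to policy(a|s) * exp(eta * Q(s, a))."""
    logits = np.log(np.clip(policy, 1e-12, None)) + eta * Q
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum(axis=1, keepdims=True)


def run_pmd(transitions, n_states, n_actions, gamma=0.9, eta=1.0, n_iters=50):
    """Alternate closed-form policy evaluation on the learned model with PMD updates."""
    P, r = estimate_model(transitions, n_states, n_actions)
    policy = np.full((n_states, n_actions), 1.0 / n_actions)
    for _ in range(n_iters):
        Q = q_closed_form(P, r, policy, gamma)
        policy = pmd_step(policy, Q, eta)
    return policy, Q


# Hypothetical toy MDP: action 0 stays, action 1 switches state;
# the only reward is 1 for staying in state 1.
TOY_TRANSITIONS = [(0, 0, 0.0, 0), (0, 1, 0.0, 1), (1, 0, 1.0, 1), (1, 1, 0.0, 0)]

if __name__ == "__main__":
    policy, Q = run_pmd(TOY_TRANSITIONS, n_states=2, n_actions=2)
    print(np.round(policy, 3))
```

In this toy example the learned policy should concentrate on switching out of state 0 and staying in state 1. The linear solve plays the role of the "simple matrix operations" mentioned in the abstract, but only in this tabular analogy; the operator-theoretic construction and its convergence guarantees are those of the paper, not of this sketch.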

Type:	Proceedings paper
Title:	Operator World Models for Reinforcement Learning
Event:	38th Conference on Neural Information Processing Systems (NeurIPS 2024)
Open access status:	An open access version is available from UCL Discovery
Publisher version:	https://papers.nips.cc/paper_files/paper/2024
Language:	English
Additional information:	This version is the version of record. For information on re-use, please refer to the publisher's terms and conditions.
UCL classification:	UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI:	https://discovery.ucl.ac.uk/id/eprint/10205687
