UCL Discovery

Efficient Reinforcement Learning with Large Language Model Priors

Yan, X; Song, Y; Feng, X; Yang, M; Zhang, H; Ammar, HB; Wang, J; (2025) Efficient Reinforcement Learning with Large Language Model Priors. In: Proceedings of the 13th International Conference on Learning Representations 2025 (ICLR 2025). (pp. 30818-30842). ICLR. Green open access

13493_Efficient_Reinforcement_.pdf - Accepted Version

Abstract

In sequential decision-making tasks, methods such as reinforcement learning (RL) and heuristic search have made notable advances in specific cases. However, they often require extensive exploration and face challenges in generalizing across diverse environments due to their limited grasp of the underlying decision dynamics. In contrast, large language models (LLMs) have recently emerged as powerful general-purpose tools, due to their capacity to maintain vast amounts of domain-specific knowledge. To harness this rich prior knowledge for efficiently solving complex sequential decision-making tasks, we propose treating LLMs as prior action distributions and integrating them into RL frameworks through Bayesian inference methods, making use of variational inference and direct posterior sampling. The proposed approaches facilitate the seamless incorporation of fixed LLM priors into both policy-based and value-based RL frameworks. Our experiments show that incorporating LLM-based action priors significantly reduces exploration and optimization complexity, substantially improving sample efficiency compared to traditional RL techniques. For example, using LLM priors decreases the number of required samples by over 90% in offline learning scenarios.
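
The idea of combining a fixed action prior with learned values can be illustrated with a minimal sketch. It assumes the common Bayesian form in which the posterior over actions is proportional to the prior probability times exp(Q / temperature); this form, the function name `posterior_sample`, and the `temperature` parameter are illustrative assumptions, not details taken from the paper. The prior probabilities stand in for what an LLM might assign to a small set of candidate actions.

```python
import math
import random


def posterior_sample(prior_probs, q_values, temperature=1.0, rng=random):
    """Sample an action index from p(a|s) proportional to prior(a|s) * exp(Q(s,a) / temperature).

    Hypothetical helper: `prior_probs` stands in for LLM action probabilities,
    `q_values` for a learned Q-function evaluated on the same candidate actions.
    Returns (sampled_index, posterior_probabilities).
    """
    if len(prior_probs) != len(q_values):
        raise ValueError("prior_probs and q_values must have the same length")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not any(p > 0 for p in prior_probs):
        raise ValueError("at least one action needs nonzero prior probability")

    # Work in log space so large Q-values do not overflow exp().
    logits = [
        math.log(p) + q / temperature if p > 0 else float("-inf")
        for p, q in zip(prior_probs, q_values)
    ]
    max_logit = max(logits)
    weights = [math.exp(l - max_logit) for l in logits]
    total = sum(weights)
    posterior = [w / total for w in weights]

    # Draw one action index according to the posterior.
    action = rng.choices(range(len(posterior)), weights=posterior, k=1)[0]
    return action, posterior


if __name__ == "__main__":
    # Three candidate actions; the prior rules out the third one entirely,
    # so its high Q-value cannot make it selectable.
    action, posterior = posterior_sample([0.5, 0.5, 0.0], [0.0, 1.0, 100.0])
    print(action, posterior)
```

In this sketch, actions with zero prior probability are never selected regardless of their value estimate, which is one way a prior can shrink the space an agent has to explore.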

Type: Proceedings paper
Title: Efficient Reinforcement Learning with Large Language Model Priors
Event: 13th International Conference on Learning Representations 2025 (ICLR 2025)
Open access status: An open access version is available from UCL Discovery
Publisher version: https://proceedings.iclr.cc/paper_files/paper/2025...
Language: English
Additional information: © The Authors 2025. Original content in this paper is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-sa/4.0/).
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10212276
