UCL Discovery

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

Wang, H; Sit, MK; He, C; Wen, Y; Zhang, W; Wang, J; Yang, Y (2023) GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models. In: Proceedings of Machine Learning Research (PMLR). (pp. 36380-36390). ML Research Press. Green open access

wang23aj.pdf - Published Version


Abstract

This paper introduces GEAR, a distributed, GPU-centric experience replay system designed for scalable reinforcement learning (RL) with large sequence models such as transformers. With such models, existing systems such as Reverb face considerable bottlenecks in memory, computation, and communication. GEAR improves memory efficiency by using the memory resources on GPU servers (both host memory and device memory) to manage trajectory data. It also lets decentralized GPU devices accelerate various trajectory selection strategies, avoiding computational bottlenecks. To improve communication efficiency, GEAR provides GPU kernels that collect trajectories using zero-copy access to host memory, along with remote direct memory access (RDMA) over InfiniBand. In cluster experiments, GEAR achieves up to 6× higher performance than Reverb when training state-of-the-art large RL models. GEAR is open-sourced at https://github.com/bigrl-team/gear.
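
As a rough illustration of what a "trajectory selection strategy" in an experience replay store can look like, the sketch below keeps a bounded set of trajectories and samples them either uniformly or in proportion to a priority. It is a CPU-only toy written with the Python standard library, not GEAR's implementation or API. The class name `TrajectoryReplay`, its methods, and the priority scheme are all hypothetical and chosen only for this example.

```python
import random
from collections import deque


class TrajectoryReplay:
    """Toy in-memory replay store (hypothetical; not GEAR's API)."""

    def __init__(self, capacity, seed=0):
        # deque(maxlen=...) drops the oldest entry once capacity is reached.
        self._trajectories = deque(maxlen=capacity)
        self._priorities = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._trajectories)

    def add(self, trajectory, priority=1.0):
        """Store one trajectory with a positive sampling priority."""
        if priority <= 0:
            raise ValueError("priority must be positive")
        self._trajectories.append(trajectory)
        self._priorities.append(priority)

    def sample_uniform(self, batch_size):
        """Select trajectories with equal probability (with replacement)."""
        if not self._trajectories:
            raise ValueError("replay store is empty")
        return self._rng.choices(list(self._trajectories), k=batch_size)

    def sample_prioritized(self, batch_size):
        """Select trajectories with probability proportional to priority."""
        if not self._trajectories:
            raise ValueError("replay store is empty")
        return self._rng.choices(
            list(self._trajectories),
            weights=list(self._priorities),
            k=batch_size,
        )


if __name__ == "__main__":
    store = TrajectoryReplay(capacity=2)
    store.add([("s0", "a0", 0.0)], priority=1.0)
    store.add([("s1", "a1", 1.0)], priority=1.0)
    store.add([("s2", "a2", 2.0)], priority=5.0)  # evicts the oldest entry
    print(len(store), store.sample_prioritized(4))
```

According to the abstract, GEAR performs this kind of selection on decentralized GPU devices and collects the chosen trajectories through GPU kernels. The sketch only shows the selection logic and none of the memory or communication optimizations.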

Type: Proceedings paper
Title: GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
Event: Proceedings of the 40th International Conference on Machine Learning
Open access status: An open access version is available from UCL Discovery
Publisher version: https://proceedings.mlr.press/v202/wang23aj.html
Language: English
Additional information: This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third-party material in this article are included in the Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10180074
Downloads since deposit: 11