Jiang, M;
Grefenstette, E;
Rocktäschel, T;
(2021)
Prioritized Level Replay.
In: Meila, M and Zhang, T, (eds.)
Proceedings of the 38th International Conference on Machine Learning.
(pp. 4940-4950).
PMLR: Proceedings of Machine Learning Research: Online Only.
jiang21b.pdf - Published Version. Available under License: see the attached licence file.
Abstract
Environments with procedurally generated content serve as important benchmarks for testing systematic generalization in deep reinforcement learning. In this setting, each level is an algorithmically created environment instance with a unique configuration of its factors of variation. Training on a prespecified subset of levels allows for testing generalization to unseen levels. What can be learned from a level depends on the current policy, yet prior work defaults to uniform sampling of training levels independently of the policy. We introduce Prioritized Level Replay (PLR), a general framework for selectively sampling the next training level by prioritizing those with higher estimated learning potential when revisited in the future. We show TD-errors effectively estimate a level’s future learning potential and, when used to guide the sampling procedure, induce an emergent curriculum of increasingly difficult levels. By adapting the sampling of training levels, PLR significantly improves sample-efficiency and generalization on Procgen Benchmark—matching the previous state-of-the-art in test return—and readily combines with other methods. Combined with the previous leading method, PLR raises the state-of-the-art to over 76% improvement in test return relative to standard RL baselines.
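The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that TD-errors are used to estimate a level's learning potential and to guide sampling. The mean absolute one-step TD-error as the score, the rank-based weighting, the `temperature` parameter, and all function names below are assumptions made for illustration.

```python
import random


def td_errors(rewards, values, gamma=0.99):
    """One-step TD errors for one episode.

    `values` must hold len(rewards) + 1 entries (the last is the bootstrap value).
    """
    return [r + gamma * values[t + 1] - values[t] for t, r in enumerate(rewards)]


def level_score(rewards, values, gamma=0.99):
    """Assumed scoring rule: mean absolute TD error over the episode."""
    errs = td_errors(rewards, values, gamma)
    return sum(abs(e) for e in errs) / len(errs)


def replay_probs(scores, temperature=1.0):
    """Assumed prioritization: weight each level by (1 / rank) ** (1 / temperature),
    where rank 1 is the level with the highest score, then normalize."""
    order = sorted(scores, key=scores.get, reverse=True)
    weights = {
        level: (1.0 / (rank + 1)) ** (1.0 / temperature)
        for rank, level in enumerate(order)
    }
    total = sum(weights.values())
    return {level: w / total for level, w in weights.items()}


def sample_level(scores, rng, temperature=1.0):
    """Draw the next training level according to the replay distribution."""
    probs = replay_probs(scores, temperature)
    levels = list(probs)
    return rng.choices(levels, weights=[probs[l] for l in levels], k=1)[0]
```

In this sketch, a training loop would call `level_score` after each episode to refresh that level's entry in `scores`, then call `sample_level` to pick the next level, so levels with larger recent TD-errors are revisited more often than under uniform sampling.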
Type: | Proceedings paper |
---|---|
Title: | Prioritized Level Replay |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | http://proceedings.mlr.press/v139/http://proceedin... |
Language: | English |
Additional information: | This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10132236 |



