
Grounding Aleatoric Uncertainty for Unsupervised Environment Design

Jiang, M; Dennis, M; Parker-Holder, J; Lupu, A; Küttler, H; Grefenstette, E; Rocktäschel, T (2022) Grounding Aleatoric Uncertainty for Unsupervised Environment Design. In: Advances in Neural Information Processing Systems. NIPS. (Green open access)

PDF: 5783_grounding_aleatoric_uncertaint.pdf - Published Version (1MB)

Abstract

Adaptive curricula in reinforcement learning (RL) have proven effective for producing policies robust to discrepancies between the train and test environment. Recently, the Unsupervised Environment Design (UED) framework generalized RL curricula to generating sequences of entire environments, leading to new methods with robust minimax regret properties. Problematically, in partially-observable or stochastic settings, optimal policies may depend on the ground-truth distribution over aleatoric parameters of the environment in the intended deployment setting, while curriculum learning necessarily shifts the training distribution. We formalize this phenomenon as curriculum-induced covariate shift (CICS), and describe how its occurrence in aleatoric parameters can lead to suboptimal policies. Directly sampling these parameters from the ground-truth distribution avoids the issue, but thwarts curriculum learning. We propose SAMPLR, a minimax regret UED method that optimizes the ground-truth utility function, even when the underlying training data is biased due to CICS. We prove, and validate on challenging domains, that our approach preserves optimality under the ground-truth distribution, while promoting robustness across the full range of environment settings.
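
As a rough illustration of the covariate-shift problem the abstract describes, the toy sketch below shows how the best action under a curriculum's distribution over an unobserved (aleatoric) parameter can differ from the best action under the ground-truth distribution. It is not taken from the paper and does not implement SAMPLR; the guessing game, the function names, and both probabilities are hypothetical values chosen only for illustration.

```python
# Toy setting (an assumption, not from the paper): a hidden binary
# parameter theta is drawn before each episode, the agent cannot observe
# it, and the agent earns reward 1 if its guess equals theta, else 0.

def expected_return(action: int, p_theta1: float) -> float:
    """Expected reward for guessing `action` when P(theta = 1) = p_theta1."""
    return p_theta1 if action == 1 else 1.0 - p_theta1


def best_action(p_theta1: float) -> int:
    """Guess that maximizes expected reward under the given distribution."""
    return max((0, 1), key=lambda a: expected_return(a, p_theta1))


GROUND_TRUTH_P = 0.7  # hypothetical P(theta = 1) at deployment
CURRICULUM_P = 0.2    # hypothetical P(theta = 1) induced by a curriculum

# A policy fit to the curriculum's data prefers the guess that is best
# under the shifted distribution.
action_under_curriculum = best_action(CURRICULUM_P)

# The guess that is actually optimal in the deployment setting.
action_under_ground_truth = best_action(GROUND_TRUTH_P)

# Expected reward lost at deployment by following the curriculum-optimal guess.
deployment_gap = (
    expected_return(action_under_ground_truth, GROUND_TRUTH_P)
    - expected_return(action_under_curriculum, GROUND_TRUTH_P)
)
```

In this sketch the curriculum-optimal guess is 0 while the ground-truth-optimal guess is 1, so following the former gives up roughly 0.4 expected reward per episode at deployment. Sampling theta directly from the ground-truth distribution would remove the gap, at the cost of the curriculum no longer controlling that parameter, which is the tension the abstract points to.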

Type: Proceedings paper
Title: Grounding Aleatoric Uncertainty for Unsupervised Environment Design
Event: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
Open access status: An open access version is available from UCL Discovery
Publisher version: https://papers.nips.cc/
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10173890