DiCE: The Infinitely Differentiable Monte-Carlo Estimator

Foerster, J; Farquhar, G; Al-Shedivat, M; Rocktäschel, T; Xing, EP; Whiteson, S; (2018) DiCE: The Infinitely Differentiable Monte-Carlo Estimator. In: Dy, Jennifer and Krause, Andreas, (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research: Stockholm, Sweden. Green open access.

Abstract

The score function estimator is widely used for estimating gradients of stochastic objectives in stochastic computation graphs (SCG), e.g., in reinforcement learning and meta-learning. While deriving the first-order gradient estimators by differentiating a surrogate loss (SL) objective is computationally and conceptually simple, using the same approach for higher-order derivatives is more challenging. Firstly, analytically deriving and implementing such estimators is laborious and not compliant with automatic differentiation. Secondly, repeatedly applying SL to construct new objectives for each order of derivative involves increasingly cumbersome graph manipulations. Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for estimators of higher-order derivatives. To address all these shortcomings in a unified way, we introduce DiCE, which provides a single objective that can be differentiated repeatedly, generating correct estimators of derivatives of any order in SCGs. Unlike SL, DiCE relies on automatic differentiation for performing the requisite graph manipulations. We verify the correctness of DiCE both through a proof and numerical evaluation of the DiCE derivative estimates. We also use DiCE to propose and evaluate a novel approach for multi-agent learning. Our code is available at https://www.github.com/alshedivat/lola.
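
The idea of a single objective that can be differentiated repeatedly can be illustrated with a minimal JAX sketch. It is not taken from the authors' repository. It assumes the DiCE operator takes the form exp(tau - stop_gradient(tau)), where tau is the log-probability of the sampled variables, and it uses a hypothetical toy problem (x drawn from N(theta, 1), cost x**2) chosen only because its exact derivatives are easy to check. The function names are illustrative.

```python
import jax
import jax.numpy as jnp


def magic_box(tau):
    # Assumed form of the DiCE operator: equals 1 in the forward pass,
    # while each differentiation multiplies by d(tau)/d(theta).
    return jnp.exp(tau - jax.lax.stop_gradient(tau))


def log_prob(x, theta):
    # Log-density of N(theta, 1); an illustrative sampling distribution.
    return -0.5 * (x - theta) ** 2 - 0.5 * jnp.log(2.0 * jnp.pi)


def dice_objective(theta, xs):
    # xs are treated as fixed samples; theta enters only via log_prob.
    cost = xs ** 2  # hypothetical cost f(x) = x**2
    return jnp.mean(magic_box(log_prob(xs, theta)) * cost)


def estimate_derivatives(theta, key, num_samples=200_000):
    # Draw samples from N(theta, 1), then differentiate the same
    # objective once and twice with automatic differentiation.
    xs = theta + jax.random.normal(key, (num_samples,))
    value = dice_objective(theta, xs)
    first = jax.grad(dice_objective)(theta, xs)
    second = jax.grad(jax.grad(dice_objective))(theta, xs)
    return value, first, second
```

For this toy problem the exact expectation is theta**2 + 1, with first derivative 2 * theta and second derivative 2, so a call such as `estimate_derivatives(1.5, jax.random.PRNGKey(0))` should return Monte-Carlo estimates that approach those values as the sample count grows. No new objective is constructed for the second derivative; `jax.grad` is simply applied twice.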

Type: Proceedings paper
Title: DiCE: The Infinitely Differentiable Monte-Carlo Estimator
Event: 35th International Conference on Machine Learning
Open access status: An open access version is available from UCL Discovery
Publisher version: http://proceedings.mlr.press/v80/foerster18a.html
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10074614