Reinforcement Learning in Presence of Discrete Markovian Context Evolution


Ren, H; Sootla, A; Jafferjee, T; Shen, J; Wang, J; Bou-Ammar, H; (2022) Reinforcement Learning in Presence of Discrete Markovian Context Evolution. In: ICLR 2022 - 10th International Conference on Learning Representations. ICLR. Green open access.

2638_reinforcement_learning_in_pres.pdf - Published Version

Abstract

We consider a context-dependent Reinforcement Learning (RL) setting, which is characterized by: a) an unknown finite number of not directly observable contexts; b) abrupt (discontinuous) context changes occurring during an episode; and c) Markovian context evolution. We argue that this challenging case is often met in applications and we tackle it using a Bayesian model-based approach and variational inference. We adapt a sticky Hierarchical Dirichlet Process (HDP) prior for model learning, which is arguably best-suited for infinite Markov chain modeling. We then derive a context distillation procedure, which identifies and removes spurious contexts in an unsupervised fashion. We argue that the combination of these two components allows inferring the number of contexts from data thus dealing with the context cardinality assumption. We then find the representation of the optimal policy enabling efficient policy learning using off-the-shelf RL algorithms. Finally, we demonstrate empirically (using gym environments cart-pole swing-up, drone, intersection) that our approach succeeds where state-of-the-art methods of other frameworks fail and elaborate on the reasons for such failures.

Type:	Proceedings paper
Title:	Reinforcement Learning in Presence of Discrete Markovian Context Evolution
Event:	ICLR 2022 - 10th International Conference on Learning Representations
Open access status:	An open access version is available from UCL Discovery
Publisher version:	https://openreview.net/forum?id=CmsfC7u054S
Language:	English
Additional information:	This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords:	context-dependent Reinforcement Learning, model-based reinforcement learning, hierarchical Dirichlet process
UCL classification:	UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI:	https://discovery.ucl.ac.uk/id/eprint/10168071
