UCL Discovery

Probabilistic recursive reasoning for multi-agent reinforcement learning

Wen, Y; Yang, Y; Luo, R; Wang, J; Pan, W; (2019) Probabilistic recursive reasoning for multi-agent reinforcement learning. In: Proceedings of the 7th International Conference on Learning Representations (ICLR 2019). International Conference on Learning Representations (ICLR): New Orleans, LA, USA.
Green open access

Abstract

Humans are capable of attributing latent mental contents, such as beliefs or intentions, to others. This social skill is critical in everyday life for reasoning about the potential consequences of their behaviors so as to plan ahead. It is known that humans use this reasoning ability recursively, i.e., considering what others believe about their own beliefs. In this paper, we start from level-1 recursion and introduce a probabilistic recursive reasoning (PR2) framework for multi-agent reinforcement learning. Our hypothesis is that it is beneficial for each agent to account for how the opponents would react to its future behaviors. Under the PR2 framework, we adopt variational Bayes methods to approximate the opponents' conditional policy, to which each agent finds the best response and then improves its own policy. We develop decentralized-training-decentralized-execution algorithms, PR2-Q and PR2-Actor-Critic, that are proved to converge in the self-play scenario when there is one Nash equilibrium. Our methods are tested on both a matrix game and a differential game, each of which has a non-trivial equilibrium where common gradient-based methods fail to converge. Our experiments show that it is critical to reason about what the opponents believe about what the agent believes. We expect our work to contribute a new idea of modeling the opponents to the multi-agent reinforcement learning community.
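
The level-1 idea in the abstract (an agent accounts for how the opponent would react to its own action, then best-responds) can be illustrated with a toy one-shot matrix game. The sketch below is not the paper's PR2-Q or PR2-Actor-Critic algorithm and does not use variational Bayes: it simply assumes the opponent's conditional policy is a softmax over the opponent's own payoffs given the agent's action. The function names, the payoff matrices, and the temperature parameter are all hypothetical and chosen for illustration only.

```python
import numpy as np


def softmax(x, temperature=1.0):
    """Numerically stable softmax over a 1-D array."""
    z = (np.asarray(x, dtype=float) - np.max(x)) / temperature
    e = np.exp(z)
    return e / e.sum()


def level1_best_response(own_payoff, opp_payoff, temperature=1.0):
    """Pick the agent's action under a level-1 model of the opponent.

    own_payoff[a_i, a_j]: agent's payoff when it plays a_i and the opponent plays a_j.
    opp_payoff[a_i, a_j]: opponent's payoff for the same joint action.

    Assumption (illustrative, not taken from the paper): the opponent's
    conditional policy given a_i is a softmax over its own payoffs.
    """
    own_payoff = np.asarray(own_payoff, dtype=float)
    opp_payoff = np.asarray(opp_payoff, dtype=float)
    expected = np.empty(own_payoff.shape[0])
    for a_i in range(own_payoff.shape[0]):
        # Modeled reaction of the opponent to the agent's candidate action a_i.
        rho = softmax(opp_payoff[a_i], temperature)
        # Agent's expected payoff for a_i under that modeled reaction.
        expected[a_i] = rho @ own_payoff[a_i]
    return int(np.argmax(expected)), expected


# Hypothetical 2x2 coordination game with identical payoffs for both players.
payoff = np.array([[2.0, 0.0],
                   [0.0, 1.0]])
action, expected = level1_best_response(payoff, payoff, temperature=0.1)
```

In this toy game the agent anticipates that the opponent will follow it to whichever action it chooses, so it prefers the action leading to the higher-payoff coordinated outcome. Replacing the softmax assumption with a learned conditional opponent model is where an approach like the one described in the abstract would differ.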

Type: Proceedings paper
Title: Probabilistic recursive reasoning for multi-agent reinforcement learning
Event: 7th International Conference on Learning Representations (ICLR 2019)
Open access status: An open access version is available from UCL Discovery
Publisher version: https://iclr.cc/Conferences/2019
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > UCL School of Management
URI: https://discovery.ucl.ac.uk/id/eprint/10091835