A regularized opponent model with maximum entropy objective

Tian, Z; Wen, Y; Gong, Z; Punakkath, F; Zou, S; Wang, J; (2019) A regularized opponent model with maximum entropy objective. In: Kraus, S, (ed.) Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). (pp. 602-608). International Joint Conferences on Artificial Intelligence (IJCAI): Macao, China. Green open access

Abstract

In a single-agent setting, reinforcement learning (RL) tasks can be cast as an inference problem by introducing a binary random variable o, which stands for “optimality”. In this paper, we redefine the binary random variable o in the multi-agent setting and formalize multi-agent reinforcement learning (MARL) as probabilistic inference. We derive a variational lower bound on the likelihood of achieving optimality and name it the Regularized Opponent Model with Maximum Entropy Objective (ROMMEO). From ROMMEO, we present a novel perspective on opponent modeling and show, theoretically and empirically, how it can improve the performance of training agents in cooperative games. To optimize ROMMEO, we first introduce a tabular Q-iteration method, ROMMEO-Q, with a proof of convergence. We then extend this exact algorithm to complex environments by proposing an approximate version, ROMMEO-AC. We evaluate the two algorithms on a challenging iterated matrix game and a differential game, respectively, and show that they can outperform strong MARL baselines.
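As background for the single-agent case mentioned in the abstract, the sketch below shows standard maximum-entropy (soft) tabular Q-iteration, in which the hard max of ordinary Q-iteration is replaced by a log-sum-exp and the resulting policy is a softmax over Q-values. This is not ROMMEO-Q, which additionally involves an opponent model and is specified only in the paper itself. The function names, the deterministic transition table, and the toy one-state problem are all assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def soft_q_iteration(rewards, transitions, gamma=0.9, iters=200):
    """Single-agent soft Q-iteration (illustrative, not the paper's ROMMEO-Q).

    rewards[s][a]     -- reward for action a in state s
    transitions[s][a] -- next state index (deterministic, an assumption here)
    """
    n_states = len(rewards)
    q = [[0.0 for _ in row] for row in rewards]
    for _ in range(iters):
        # Soft state value: log-sum-exp of Q replaces the usual max.
        v = [logsumexp(q[s]) for s in range(n_states)]
        q = [
            [rewards[s][a] + gamma * v[transitions[s][a]]
             for a in range(len(rewards[s]))]
            for s in range(n_states)
        ]
    return q


def soft_policy(q_row):
    """Maximum-entropy policy for one state: softmax over its Q-values."""
    z = logsumexp(q_row)
    return [math.exp(x - z) for x in q_row]


# Hypothetical toy problem: one state, two actions, both loop back to state 0.
toy_rewards = [[1.0, 0.0]]
toy_transitions = [[0, 0]]
toy_q = soft_q_iteration(toy_rewards, toy_transitions)
toy_pi = soft_policy(toy_q[0])
```

In this toy problem both actions lead to the same state, so their Q-values differ only by the reward gap, and `toy_pi` is the softmax of the rewards: the higher-reward action is preferred but the other keeps non-zero probability, which is the behavior the entropy term is meant to encourage.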

Type:	Proceedings paper
Title:	A regularized opponent model with maximum entropy objective
Event:	Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19)
ISBN-13:	978-0-9992411-4-1
Open access status:	An open access version is available from UCL Discovery
Publisher version:	https://www.ijcai.org/Proceedings/2019/
Language:	English
Additional information:	This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification:	UCL
	UCL > Provost and Vice Provost Offices > UCL BEAMS
	UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
	UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
	UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > UCL School of Management
URI:	https://discovery.ucl.ac.uk/id/eprint/10091833
