UCL Discovery

Open-ended learning in symmetric zero-sum games

Balduzzi, D; Garnelo, M; Bachrach, Y; Czarnecki, W; Pérolat, J; Jaderberg, M; Graepel, T; (2019) Open-ended learning in symmetric zero-sum games. In: Chaudhuri, K and Salakhutdinov, R, (eds.) Proceedings of the 36th International Conference on Machine Learning. (pp. 434-443). PMLR. Green open access.

Text: balduzzi19a.pdf - Published Version (790kB)

Abstract

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them ‘winner’ and ‘loser’. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective – we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives.
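
The abstract's first two points can be illustrated with a minimal sketch. It is not the paper's code and does not implement PSRO_rN; every name in it (`STRATEGIES`, `BEATS`, `payoff`, and the helper functions) is hypothetical. It treats rock-paper-scissors as a function that evaluates pairs of strategies, checks that the function is antisymmetric (symmetric zero-sum), and shows that no single strategy is "strongest" because the game is cyclic.

```python
# Illustrative only: rock-paper-scissors as an antisymmetric payoff function.
# All names here are hypothetical and are not taken from the paper.

STRATEGIES = ["rock", "paper", "scissors"]

# BEATS[a] is the strategy that a defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(a: str, b: str) -> int:
    """Return +1 if a beats b, -1 if b beats a, and 0 on a tie."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def is_antisymmetric() -> bool:
    """Symmetric zero-sum: payoff(a, b) == -payoff(b, a) for every pair."""
    return all(
        payoff(a, b) == -payoff(b, a) for a in STRATEGIES for b in STRATEGIES
    )


def undefeated_strategies() -> list:
    """Strategies that lose to no other strategy; empty if the game is cyclic."""
    return [a for a in STRATEGIES if all(payoff(a, b) >= 0 for b in STRATEGIES)]


def uniform_mixture_value(opponent: str) -> float:
    """Expected payoff of playing each strategy with probability 1/3."""
    return sum(payoff(a, opponent) for a in STRATEGIES) / len(STRATEGIES)


if __name__ == "__main__":
    print("antisymmetric:", is_antisymmetric())
    print("undefeated pure strategies:", undefeated_strategies())
    for s in STRATEGIES:
        print("uniform mixture vs", s, "=", uniform_mixture_value(s))
```

In this toy game every pure strategy is beaten by another one, so "increase in strength" has no single target, while the uniform mixture breaks even against each pure strategy. This is the kind of situation in which the abstract argues for reasoning about populations of agents rather than individual ones.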

Type: Proceedings paper
Title: Open-ended learning in symmetric zero-sum games
Event: 36th International Conference on Machine Learning
Open access status: An open access version is available from UCL Discovery
Publisher version: http://proceedings.mlr.press/v97/balduzzi19a.html
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10091504
Downloads since deposit: 169