Balduzzi, D; Garnelo, M; Bachrach, Y; Czarnecki, W; Pérolat, J; Jaderberg, M; Graepel, T; (2019) Open-ended learning in symmetric zero-sum games. In: Chaudhuri, K and Salakhutdinov, R, (eds.) Proceedings of the 36th International Conference on Machine Learning. (pp. 434-443). PMLR
Text: balduzzi19a.pdf - Published Version (790kB)
Abstract
Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them ‘winner’ and ‘loser’. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective – we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives.
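As a rough illustration of the abstract's first point, the sketch below encodes rock-paper-scissors as a function that evaluates pairs of agents and then exposes the strategic cycle. The payoff encoding (+1 win, -1 loss, 0 tie) and the helper names `is_antisymmetric`, `beaten_by`, and `payoff_vs_mixture` are assumptions made for this example; they are not taken from the paper, and this is not the PSRO_rN algorithm.

```python
# Rock-paper-scissors as an evaluation of agent pairs (illustrative encoding):
# PAYOFF[i][j] = +1 if agent i beats agent j, -1 if it loses, 0 for a tie.
AGENTS = ["rock", "paper", "scissors"]
PAYOFF = [
    [0, -1, 1],   # rock loses to paper, beats scissors
    [1, 0, -1],   # paper beats rock, loses to scissors
    [-1, 1, 0],   # scissors loses to rock, beats paper
]


def is_antisymmetric(payoff):
    """In a symmetric zero-sum game, swapping the two agents negates the payoff."""
    n = len(payoff)
    return all(payoff[i][j] == -payoff[j][i] for i in range(n) for j in range(n))


def beaten_by(payoff, i):
    """Indices of the agents that beat agent i."""
    return [j for j in range(len(payoff)) if payoff[j][i] > 0]


def payoff_vs_mixture(payoff, weights):
    """Expected payoff of each agent against a weighted mixture of opponents."""
    return [sum(p * w for p, w in zip(row, weights)) for row in payoff]


if __name__ == "__main__":
    print("antisymmetric:", is_antisymmetric(PAYOFF))
    # Every agent is beaten by some other agent, so no agent is "strongest".
    for i, name in enumerate(AGENTS):
        winners = [AGENTS[j] for j in beaten_by(PAYOFF, i)]
        print(f"{name} is beaten by {winners}")
    # Against the uniform mixture, every agent scores the same on average.
    print("payoff vs uniform mixture:", payoff_vs_mixture(PAYOFF, [1 / 3] * 3))
```

Following the "beaten by" relation from any agent returns to that agent after three steps, which is the kind of cycle that leaves "increase in strength" without a fixed opponent to measure against.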
Type: | Proceedings paper |
---|---|
Title: | Open-ended learning in symmetric zero-sum games |
Event: | 36th International Conference on Machine Learning |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | http://proceedings.mlr.press/v97/balduzzi19a.html |
Language: | English |
Additional information: | This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions. |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10091504 |