UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

Multiagent Training in N-Player General-Sum Games

Marris, Luke; (2024) Multiagent Training in N-Player General-Sum Games. Doctoral thesis (Ph.D), UCL (University College London). Green open access

[thumbnail of thesis_final_2024.pdf]
Preview
Text
thesis_final_2024.pdf - Accepted Version

Download (11MB) | Preview

Abstract

Recent successes in two-player, purely competitive (so-called zero-sum) games has made headlines around the world. Initially, perfect information games, with vast state spaces, such as Go and Chess, were conquered using a combination of search, deep neural network function approximation, and self-play reinforcement learning. Subsequently more challenging, imperfect information games such as StarCraft were tackled using recurrent deep neural networks, human-play priors, and policy-space response oracle reinforcement learning. Although these are important advances, from a game theoretic perspective two-player zero-sum games are the simplest class, and many algorithms are known to converge to a solution concept called Nash Equilibrium in this setting. For two-player zero-sum games the Nash Equilibrium is unexploitable (there is no strategy that an opponent could play to that would reduce one's reward) and interchangeable (if there is more than one Nash equilibrium in a game, all equilibria are equally good, and it is not necessary for players to coordinate on the equilibria being played). But most importantly, Nash equilibrium coincides with the Minimax solution in two-player zero-sum, which is a single-objective optimization. Therefore when playing against rational opponents, the Nash equilibrium is the obvious, perhaps fundamental, objective to optimize for when training learning agents in two-player zero-sum games. Progress on n-player general-sum games, however, has remained limited. It is both unclear what objective to optimize for and difficult to build algorithms that converge to any interesting objective. This is disappointing because the majority of the world's interactions are not purely competitive, have interesting mixed-motive dynamics, and have more than two players. This is an area of interest to economists, sociologists, policy makers, artificial intelligence researchers, and any other discipline that concerns multiple agents that have to compete or cooperate together to achieve their objectives. Previous work has mainly involved applying single agent techniques, or two-player zero-sum techniques and hoping for the best. The overall aim of this work is to unlock progress beyond the narrow domains of two-player zero-sum, to the most general space of n-player, general-sum games using principled game theoretic techniques. In this end, this work utilizes more flexible mediated equilibrium solution concepts, correlated equilibrium and coarse correlated equilibrium, which are more suitable for general-sum games. This choice is for convenience and to enable the main goal of this work: building algorithms that scale in n-player general-sum settings. This thesis a) builds an intuition for the space of games by studying transforms that do not alter their equilibrium, b) proposes a novel normal-form game embedding, c) develops tools for visualizing large extensive-form games, d) proposes an efficient equilibrium selection criteria, e) builds fast neural network solvers for finding equilibria in all normal-form games, f) proposes population based learning algorithms for training policies in n-player general-sum extensive-form games, g) proposes learning algorithms for training policies in Markov games games, and h) develops novel ratings algorithms to evaluate strategies in n-player general-sum games.

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Multiagent Training in N-Player General-Sum Games
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2024. Original content in this thesis is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
Keywords: Game Theory, Artificial Intelligence, Multiagent Systems, Equilibrium Computation, Reinforcement Learning, Deep Learning, Normal-Form Games, Extensive-Form Games, Rating Systems
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10190981
Downloads since deposit
50Downloads
Download activity - last month
Download activity - last 12 months
Downloads by country - last 12 months

Archive Staff Only

View Item View Item