Song, Y;
Jiang, H;
Tian, Z;
Zhang, H;
Zhang, Y;
Zhu, J;
Dai, Z;
... Wang, J;
(2024)
An Empirical Study on Google Research Football Multi-agent Scenarios.
Machine Intelligence Research, 21, pp. 549-570.
10.1007/s11633-023-1426-8.
Abstract
Few multi-agent reinforcement learning (MARL) studies on Google Research Football (GRF) focus on the 11-vs-11 multi-agent full-game scenario, and to the best of our knowledge, no open benchmark for this scenario has been released to the public. In this work, we fill the gap by providing a population-based MARL training pipeline and hyperparameter settings for the multi-agent football scenario that, starting from scratch, outperforms the bot with difficulty 1.0 within 2 million steps. Our experiments serve as a reference for the expected performance of independent proximal policy optimization (IPPO), a state-of-the-art multi-agent reinforcement learning algorithm in which each agent optimizes its own policy independently, across various training configurations. We also release our training framework, Light-MALib, which extends the MALib codebase with a distributed and asynchronous implementation and additional analytical tools for football games. Finally, we provide guidance for building strong football AI with population-based training and release diverse pretrained policies for benchmarking. The goal is to give anyone experimenting on GRF a head start, along with a simple-to-use population-based training framework for further improving their agents through self-play. The implementation is available at https://github.com/Shanghai-Digital-Brain-Laboratory/DB-Football.
Type: | Article |
---|---|
Title: | An Empirical Study on Google Research Football Multi-agent Scenarios |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1007/s11633-023-1426-8 |
Publisher version: | http://dx.doi.org/10.1007/s11633-023-1426-8 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | Multi-agent reinforcement learning (RL), distributed RL system, population-based training, reward shaping, game theory |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10188587 |