Sunehag, P;
Lever, G;
Liu, S;
Merel, J;
Heess, N;
Leibo, JZ;
Hughes, E;
... Graepel, T;
(2019)
Reinforcement Learning Agents acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems.
In: Fellermann, H and Bacardit, J and Goñi-Moreno, A and Füchslin, R, (eds.)
Proceedings of the Artificial Life Conference.
(pp. 103-110).
MIT Press
Abstract
In nature, group behaviours such as flocking, as well as cross-species symbiotic partnerships, are observed in vastly different forms and circumstances. We hypothesize that such strategies can arise in response to generic predator-prey pressures in a spatial environment with range-limited sensation and action. We evaluate whether these forms of coordination can emerge through independent multi-agent reinforcement learning in simple multiple-species ecosystems. In contrast to prior work, we avoid hand-crafted shaping rewards, specific actions, or dynamics that would directly encourage coordination across agents. Instead, we test whether coordination emerges as a consequence of adaptation alone, without encouraging these specific forms of coordination, which yield only indirect benefit. Our simulated ecosystems consist of a generic food chain involving three trophic levels: apex predator, mid-level predator, and prey. We conduct experiments on two different platforms: a 3D physics engine with tens of agents, and a 2D grid world with up to thousands. The results clearly confirm our hypothesis and show substantial coordination both within and across species. To obtain these results, we leverage and adapt recent advances in deep reinforcement learning within an ecosystem training protocol featuring homogeneous groups of independent agents from different species (sets of policies), acting in many different random combinations in parallel habitats. The policies utilize neural network architectures that are invariant to agent individuality but not type (species) and that generalize across varying numbers of observed other agents. While the emergence of complexity in artificial ecosystems has long been studied in the artificial life community, the focus has been more on individual complexity and genetic algorithms or explicit modelling, and less on the group complexity and reinforcement learning emphasized in this article.
Unlike what the name and intuition suggest, reinforcement learning here adapts over evolutionary history rather than a lifetime, addressing the sequential optimization of fitness that is usually approached by genetic algorithms in the artificial life community. We utilize a shift from procedures to objectives, allowing us to bring powerful new machinery to bear, and we see complex behaviour emerge from a sequence of simple optimization problems.
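The abstract's architecture — invariant to agent individuality but not type, and able to handle varying numbers of observed agents — can be illustrated with a minimal sketch: embed each visible agent with a weight matrix shared within its species, then mean-pool the embeddings per species. All dimensions, species names, and the use of mean-pooling are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: each observed agent is a 4-d feature vector,
# embedded into 8 dimensions by a per-species linear layer.
FEAT, EMB = 4, 8
SPECIES = ["apex", "mid", "prey"]  # the three trophic levels

# One weight matrix per species (type), shared by every individual of
# that species, so the encoding cannot depend on individual identity.
weights = {s: rng.standard_normal((FEAT, EMB)) for s in SPECIES}

def encode(observed):
    """observed: dict mapping species -> (n, FEAT) array of that
    species' visible agents; n may vary per step, including zero."""
    parts = []
    for s in SPECIES:
        obs = observed.get(s, np.zeros((0, FEAT)))
        if len(obs) == 0:
            parts.append(np.zeros(EMB))  # no agents of this type in range
        else:
            # Mean-pooling the per-agent embeddings is permutation
            # invariant and yields a fixed-size output regardless of
            # how many agents of this species are observed.
            parts.append(np.tanh(obs @ weights[s]).mean(axis=0))
    return np.concatenate(parts)  # fixed-size (3 * EMB,) vector

# Shuffling which individual is which leaves the encoding unchanged,
# and 5 vs. 50 observed prey produce outputs of the same size.
obs = rng.standard_normal((5, FEAT))
a = encode({"prey": obs})
b = encode({"prey": obs[rng.permutation(5)]})
assert np.allclose(a, b)
assert a.shape == (3 * EMB,)
```

A fixed-size, permutation-invariant input of this kind is what lets a single policy network be reused across habitats with very different population counts, as in the ecosystem training protocol described above.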
Type: Proceedings paper
Title: Reinforcement Learning Agents acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems
Event: Conference on Artificial Life (ALIFE) - How Can Artificial Life Help Solve Societal Challenges?
Location: Newcastle, United Kingdom
Dates: 29th July - 2nd August 2019
Open access status: An open access version is available from UCL Discovery
DOI: 10.1162/isal_a_00148
Publisher version: https://doi.org/10.1162/isal_a_00148
Language: English
Additional information: © 2019 Massachusetts Institute of Technology. Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license (https://creativecommons.org/licenses/by/4.0/).
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10090817



