UCL Discovery

Reinforcement Learning Agents acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems

Sunehag, P; Lever, G; Liu, S; Merel, J; Heess, N; Leibo, JZ; Hughes, E; ... Graepel, T; (2019) Reinforcement Learning Agents acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems. In: Fellermann, H and Bacardit, J and GoniMoreno, A and Fuchslin, R, (eds.) Proceedings of the Artificial Life Conference. (pp. 103-110). MIT Press. Green open access.

isal_a_00148 (1).pdf - Published Version

Abstract

In nature, group behaviours such as flocking as well as cross-species symbiotic partnerships are observed in vastly different forms and circumstances. We hypothesize that such strategies can arise in response to generic predator-prey pressures in a spatial environment with range-limited sensation and action. We evaluate whether these forms of coordination can emerge by independent multi-agent reinforcement learning in simple multiple-species ecosystems. In contrast to prior work, we avoid hand-crafted shaping rewards, specific actions, or dynamics that would directly encourage coordination across agents. Instead, we test whether coordination emerges as a consequence of adaptation without encouraging these specific forms of coordination, which have only indirect benefit. Our simulated ecosystems consist of a generic food chain involving three trophic levels: apex predator, mid-level predator, and prey. We conduct experiments on two different platforms: a 3D physics engine with tens of agents and a 2D grid world with up to thousands. The results clearly confirm our hypothesis and show substantial coordination both within and across species. To obtain these results, we leverage and adapt recent advances in deep reinforcement learning within an ecosystem training protocol featuring homogeneous groups of independent agents from different species (sets of policies), acting in many different random combinations in parallel habitats. The policies utilize neural network architectures that are invariant to agent individuality but not type (species) and that generalize across varying numbers of observed other agents. While the emergence of complexity in artificial ecosystems has long been studied in the artificial life community, the focus has been more on individual complexity and genetic algorithms or explicit modelling, and less on the group complexity and reinforcement learning emphasized in this article.
Unlike what the name and intuition suggest, reinforcement learning adapts over evolutionary history rather than a lifetime and here addresses the sequential optimization of fitness that is usually approached by genetic algorithms in the artificial life community. We utilize a shift from procedures to objectives, allowing us to bring powerful new machinery to bear, and we see the emergence of complex behaviour from a sequence of simple optimization problems.
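
The abstract's description of policies that are invariant to agent individuality but not species, and that handle varying numbers of observed agents, can be illustrated with a minimal sketch. This is not the authors' architecture: the mean pooling, the tiny tanh embedding, and the names embed, encode, and make_weights are all assumptions chosen for illustration. Each observed agent is embedded with weights shared within its species, embeddings are averaged per species, and the per-species averages are concatenated in a fixed order.

```python
import math
import random

# The three trophic levels named in the abstract; the labels are hypothetical.
SPECIES = ("apex", "mid", "prey")


def embed(rel_pos, weights):
    """Shared per-agent embedding: one tanh unit per output dimension."""
    x, y = rel_pos
    return [math.tanh(wx * x + wy * y + b) for wx, wy, b in weights]


def encode(observed, weights_by_species, dim):
    """Mean-pool embeddings per species, then concatenate in fixed species order.

    observed: list of (species, (x, y)) pairs for agents within sensing range.
    The output length is len(SPECIES) * dim regardless of how many agents
    are observed or in what order they are listed.
    """
    features = []
    for species in SPECIES:
        embs = [embed(pos, weights_by_species[species])
                for s, pos in observed if s == species]
        if embs:
            pooled = [sum(col) / len(embs) for col in zip(*embs)]
        else:
            pooled = [0.0] * dim  # no agent of this species in view
        features.extend(pooled)
    return features


def make_weights(dim, seed=0):
    """Random illustrative weights: one (wx, wy, b) triple per unit, per species."""
    rng = random.Random(seed)
    return {s: [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
                for _ in range(dim)]
            for s in SPECIES}
```

Because pooling is done within each species slot, reordering the observed agents leaves the encoding unchanged up to floating-point rounding, while relabelling an agent's species moves its contribution to a different slot. A learned network would replace the random weights here.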

Type: Proceedings paper
Title: Reinforcement Learning Agents acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems
Event: Conference on Artificial Life (ALIFE) - How Can Artificial Life Help Solve Societal Challenges?
Location: Newcastle, United Kingdom
Dates: 29th July - 2nd August 2019
Open access status: An open access version is available from UCL Discovery
DOI: 10.1162/isal_a_00148
Publisher version: https://doi.org/10.1162/isal_a_00148
Language: English
Additional information: © 2019 Massachusetts Institute of Technology Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license (https://creativecommons.org/licenses/by/4.0/).
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10090817
Downloads since deposit: 504