Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour

Macmillan-Scott, Olivia; Musolesi, Mirco (2025) Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour. PLoS Computational Biology, 21 (8), Article e1013302. 10.1371/journal.pcbi.1013302. Green open access

PS_signalling___PLOS-5.pdf - Accepted Version

Abstract

The coevolution of signalling is a complex problem within animal behaviour, and is also central to communication between artificial agents. The Sir Philip Sidney game was designed to model this dyadic interaction from an evolutionary biology perspective, and was formulated to demonstrate the emergence of honest signalling. We use Multi-Agent Reinforcement Learning (MARL) to show that in the majority of cases, the resulting behaviour adopted by agents is not that shown in the original derivation of the model. This paper demonstrates that MARL can be a powerful tool to study evolutionary dynamics and understand the underlying mechanisms of learning over generations; particularly advantageous is the interpretability of this type of approach, as well as the fact that it allows us to study emergent behaviour without the need to constrain the strategy space from the outset. Although the game was originally set out to exemplify honest signalling, we show that it provides no incentive for such behaviour. In the majority of cases, the optimal outcome is one that does not require a signal for the resource to be given. This type of interaction is observed within animal behaviour, and is sometimes denoted proactive prosociality. High learning and low discount rates of the reinforcement learning model are shown to be optimal in order to achieve the outcome that maximises both agents’ reward, and proximity to the given threshold leads to suboptimal learning.

Type:	Article
Title:	Maynard Smith revisited: A multi-agent reinforcement learning approach to the coevolution of signalling behaviour
Location:	United States
Open access status:	An open access version is available from UCL Discovery
DOI:	10.1371/journal.pcbi.1013302
Publisher version:	https://doi.org/10.1371/journal.pcbi.1013302
Language:	English
Additional information:	Copyright: © 2025 Macmillan-Scott, Musolesi. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
UCL classification:	UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI:	https://discovery.ucl.ac.uk/id/eprint/10213527
