?url_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.title=Reinforcement+Learning+Agents+acquire+Flocking+and+Symbiotic+Behaviour+in+Simulated+Ecosystems&rft.creator=Sunehag%2C+P&rft.creator=Lever%2C+G&rft.creator=Liu%2C+S&rft.creator=Merel%2C+J&rft.creator=Heess%2C+N&rft.creator=Leibo%2C+JZ&rft.creator=Hughes%2C+E&rft.creator=Eccles%2C+T&rft.creator=Graepel%2C+T&rft.description=In+nature%2C+group+behaviours+such+as+flocking+as+well+as+cross-species+symbiotic+partnerships+are+observed+in+vastly+different+forms+and+circumstances.+We+hypothesize+that+such+strategies+can+arise+in+response+to+generic+predator-prey+pressures+in+a+spatial+environment+with+range-limited+sensation+and+action.+We+evaluate+whether+these+forms+of+coordination+can+emerge+by+independent+multi-agent+reinforcement+learning+in+simple+multiple-species+ecosystems.+In+contrast+to+prior+work%2C+we+avoid+hand-crafted+shaping+rewards%2C+specific+actions%2C+or+dynamics+that+would+directly+encourage+coordination+across+agents.+Instead+we+test+whether+coordination+emerges+as+a+consequence+of+adaptation+without+encouraging+these+specific+forms+of+coordination%2C+which+only+has+indirect+benefit.+Our+simulated+ecosystems+consist+of+a+generic+food+chain+involving+three+trophic+levels%3A+apex+predator%2C+mid-level+predator%2C+and+prey.+We+conduct+experiments+on+two+different+platforms%2C+a+3D+physics+engine+with+tens+of+agents+as+well+as+in+a+2D+grid+world+with+up+to+thousands.+The+results+clearly+confirm+our+hypothesis+and+show+substantial+coordination+both+within+and+across+species.+To+obtain+these+results%2C+we+leverage+and+adapt+recent+advances+in+deep+reinforcement+learning+within+an+ecosystem+training+protocol+featuring+homogeneous+groups+of+independent+agents+from+different+species+(sets+of+policies)%2C+acting+in+many+different+random+combinations+in+parallel+habitats.+The+policies+utilize+neural+network+architectures+that+are+invariant+to+agent+individuality+but+not+type+(species)+and+that+generalize+across+varying+numbers+of+observed+other+agents.+While+the+emergence+of+complexity+in+artificial+ecosystems+has+long+been+studied+in+the+artificial+life+community%2C+the+focus+has+been+more+on+individual+complexity+and+genetic+algorithms+or+explicit+modelling%2C+and+less+on+group+complexity+and+reinforcement+learning+emphasized+in+this+article.+Unlike+what+the+name+and+intuition+suggest%2C+reinforcement+learning+adapts+over+evolutionary+history+rather+than+a+life-time+and+is+here+addressing+the+sequential+optimization+of+fitness+that+is+usually+approached+by+genetic+algorithms+in+the+artificial+life+community.+We+utilize+a+shift+from+procedures+to+objectives%2C+allowing+us+to+bring+new+powerful+machinery+to+bear%2C+and+we+see+emergence+of+complex+behaviour+from+a+sequence+of+simple+optimization+problems.&rft.publisher=MIT+Press&rft.contributor=Fellermann%2C+H&rft.contributor=Bacardit%2C+J&rft.contributor=GoniMoreno%2C+A&rft.contributor=Fuchslin%2C+R&rft.date=2019-07&rft.type=Proceedings+paper&rft.publisher=Conference+on+Artificial+Life+(ALIFE)+-+How+Can+Artificial+Life+Help+Solve+Societal+Challenges%3F&rft.language=eng&rft.source=In%3A+Fellermann%2C+H+and+Bacardit%2C+J+and+GoniMoreno%2C+A+and+Fuchslin%2C+R%2C+(eds.)+Proceedings+of+the+Artificial+Life+Conference.+(pp.+103-110).+MIT+Press+(2019)&rft.format=text&rft.identifier=https%3A%2F%2Fdiscovery.ucl.ac.uk%2Fid%2Feprint%2F10090817%2F1%2Fisal_a_00148%2520%25281%2529.pdf&rft.identifier=https%3A%2F%2Fdiscovery.ucl.ac.uk%2Fid%2Feprint%2F10090817%2F&rft.rights=open