Yan, X; Guo, J; Lou, X; Wang, J; Zhang, H; Du, Y; (2023) An Efficient End-to-End Training Approach for Zero-Shot Human-AI Coordination. In: Advances in Neural Information Processing Systems 36 (NeurIPS 2023). NeurIPS Proceedings: New Orleans, LA, USA.
Abstract
The goal of zero-shot human-AI coordination is to develop an agent capable of collaborating with humans without relying on human data. Prevailing two-stage population-based methods require a diverse population of mutually distinct policies to simulate diverse human behaviors. The necessity of such populations severely limits their computational efficiency. To address this issue, we propose E3T, an Efficient End-to-End Training approach for zero-shot human-AI coordination. E3T employs a mixture of ego policy and random policy to construct the partner policy, making it both skilled in coordination and diverse. This way, the ego agent is trained end-to-end with this mixture policy, eliminating the need for a pre-trained population, and thus significantly improving training efficiency. In addition, we introduce a partner modeling module designed to predict the partner's actions based on historical contexts. With the partner's predicted action, the ego policy can adapt its strategy and take actions accordingly when collaborating with humans exhibiting different behavior patterns. Empirical results on the Overcooked environment demonstrate that our method substantially improves training efficiency while achieving performance comparable or superior to that of the population-based baselines. Demo videos are available at https://sites.google.com/view/e3t-overcooked.
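The mixture construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mixing weight `epsilon`, the six-action space, and the example probabilities are all assumptions made for illustration. The abstract states only that the partner policy is a mixture of the ego policy and a random policy; the sketch assumes the random policy is uniform over actions.

```python
import random


def mixture_partner_probs(ego_probs, epsilon):
    """Blend the ego policy's action distribution with a uniform random policy.

    `epsilon` (hypothetical name) is the weight given to the random policy:
    0.0 reproduces the ego policy, 1.0 gives a purely uniform partner.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    uniform = 1.0 / len(ego_probs)
    return [(1.0 - epsilon) * p + epsilon * uniform for p in ego_probs]


def sample_action(probs, rng):
    """Draw one action index according to the given distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical ego distribution over six discrete actions.
ego_probs = [0.7, 0.1, 0.1, 0.05, 0.05, 0.0]
partner_probs = mixture_partner_probs(ego_probs, epsilon=0.3)

# The partner keeps most of the ego policy's skill but can still take
# actions the ego policy would never choose (the last action here).
partner_action = sample_action(partner_probs, random.Random(0))
```

Under these assumptions, a single training loop could sample the partner's action from `partner_probs` at every step, so no separately pre-trained population is needed.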
Type: | Proceedings paper |
---|---|
Title: | An Efficient End-to-End Training Approach for Zero-Shot Human-AI Coordination |
Event: | 37th Conference on Neural Information Processing Systems (NeurIPS 2023) |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://papers.nips.cc/paper_files/paper/2023/hash... |
Language: | English |
Additional information: | This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10192033 |