Zhang, H;
Wang, J;
Zhou, Z;
Zhang, W;
Wen, Y;
Yu, Y;
Li, W;
(2018)
Learning to design games: Strategic environments in reinforcement learning.
In:
(pp. 3068-3074).
ArXiv
Abstract
In typical reinforcement learning (RL), the environment is assumed given and the goal of the learning is to identify an optimal policy for the agent taking actions through its interactions with the environment. In this paper, we extend this setting by considering that the environment is not given, but is controllable and learnable through its interaction with the agent at the same time. This extension is motivated by environment design scenarios in the real world, including game design, shopping space design, and traffic signal design. Theoretically, we find a dual Markov decision process (MDP) w.r.t. the environment to that w.r.t. the agent, and derive a policy gradient solution to optimizing the parametrized environment. Furthermore, discontinuous environments are addressed by a proposed general generative framework. Our experiments on a Maze game design task show the effectiveness of the proposed algorithms in generating diverse and challenging Mazes against various agent settings.
Type: | Proceedings paper |
---|---|
Title: | Learning to design games: Strategic environments in reinforcement learning |
ISBN-13: | 9780999241127 |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://arxiv.org/abs/1707.01310 |
Language: | English |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > UCL School of Management |
URI: | https://discovery.ucl.ac.uk/id/eprint/10066099 |