UCL Discovery

Model-based contextual policy search for data-efficient generalization of robot skills

Kupcsik, A; Deisenroth, MP; Peters, J; Poh, LA; Vadakkepat, P; Neumann, G (2017) Model-based contextual policy search for data-efficient generalization of robot skills. Artificial Intelligence, 247, pp. 415-439. 10.1016/j.artint.2014.11.005. Green open access

Abstract

In robotics, lower-level controllers are typically used to make the robot solve a specific task in a fixed context. For example, the lower-level controller can encode a hitting movement while the context defines the target coordinates to hit. However, in many learning problems the context may change between task executions. To adapt the policy to a new context, we utilize a hierarchical approach by learning an upper-level policy that generalizes the lower-level controllers to new contexts. A common approach to learning such upper-level policies is policy search. However, the majority of current contextual policy search approaches are model-free and require a large number of interactions with the robot and its environment. Model-based approaches are known to significantly reduce the number of robot experiments; however, current model-based techniques cannot be applied straightforwardly to the problem of learning contextual upper-level policies. They rely on specific parametrizations of the policy and the reward function, which are often unrealistic in the contextual policy search formulation. In this paper, we propose a novel model-based contextual policy search algorithm that is able to generalize lower-level controllers and is data-efficient. Our approach is based on learned probabilistic forward models and information-theoretic policy search. Unlike current algorithms, our method does not require any assumption on the parametrization of the policy or the reward function. We show on complex simulated robotic tasks and in a real robot experiment that the proposed learning framework speeds up the learning process by up to two orders of magnitude in comparison to existing methods, while learning high-quality policies.
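
The hierarchical idea in the abstract, an upper-level policy that maps a context to the parameters of a lower-level controller, can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a simplified, model-free, reward-weighted update on a toy problem, and everything in it is an assumption made for illustration, including the scalar context, the linear-Gaussian policy form, the `toy_return` function, and the fixed `temperature`. In a model-based variant, the returns would presumably be obtained from learned forward models rather than from the task itself.

```python
import math
import random


def toy_return(context, omega):
    """Hypothetical task: the best parameter for context s is 2*s + 1."""
    return -(omega - (2.0 * context + 1.0)) ** 2


def sample_parameter(policy, context, rng):
    """Draw omega ~ N(a + b*s, sigma^2) from the upper-level policy."""
    a, b, sigma = policy
    return rng.gauss(a + b * context, sigma)


def weighted_update(contexts, omegas, returns, temperature=1.0, min_sigma=1e-3):
    """Reward-weighted maximum-likelihood fit of a linear-Gaussian policy."""
    best = max(returns)
    # Exponential weights; subtracting the max keeps exp() numerically safe.
    weights = [math.exp((r - best) / temperature) for r in returns]
    total = sum(weights)
    mean_s = sum(w * s for w, s in zip(weights, contexts)) / total
    mean_o = sum(w * o for w, o in zip(weights, omegas)) / total
    # Weighted least squares for the slope and intercept of the policy mean.
    cov_so = sum(w * (s - mean_s) * (o - mean_o)
                 for w, s, o in zip(weights, contexts, omegas))
    var_s = sum(w * (s - mean_s) ** 2 for w, s in zip(weights, contexts))
    b = cov_so / var_s if var_s > 1e-12 else 0.0
    a = mean_o - b * mean_s
    # Weighted residual variance gives the new exploration noise.
    var_res = sum(w * (o - a - b * s) ** 2
                  for w, s, o in zip(weights, contexts, omegas)) / total
    return a, b, max(math.sqrt(var_res), min_sigma)


def train(iterations=30, samples=200, seed=0):
    """Alternate between sampling rollouts and refitting the policy."""
    rng = random.Random(seed)
    policy = (0.0, 0.0, 2.0)  # initial intercept a, slope b, std sigma
    for _ in range(iterations):
        contexts = [rng.uniform(-1.0, 1.0) for _ in range(samples)]
        omegas = [sample_parameter(policy, s, rng) for s in contexts]
        returns = [toy_return(s, o) for s, o in zip(contexts, omegas)]
        policy = weighted_update(contexts, omegas, returns)
    return policy
```

Calling `train()` returns the fitted `(a, b, sigma)`; on this toy problem the intercept and slope should move toward 1 and 2, the values that maximize `toy_return` for every context. A full information-theoretic method would choose the temperature by optimization instead of fixing it, which this sketch omits.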

Type: Article
Title: Model-based contextual policy search for data-efficient generalization of robot skills
Open access status: An open access version is available from UCL Discovery
DOI: 10.1016/j.artint.2014.11.005
Publisher version: https://doi.org/10.1016/j.artint.2014.11.005
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Science & Technology, Technology, Computer Science, Artificial Intelligence, Computer Science, Robotics, Reinforcement learning, Contextual policy search, Model-based policy search, Robot skill generalization, Gaussian processes, Movement primitives, Robot table tennis, Robot hockey
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10083723