UCL Discovery

Investigating exploration for deep reinforcement learning of concentric tube robot control

Iyengar, K; Dwyer, G; Stoyanov, D; (2020) Investigating exploration for deep reinforcement learning of concentric tube robot control. International Journal of Computer Assisted Radiology and Surgery, 15, pp. 1157-1165. 10.1007/s11548-020-02194-z. Green open access

Iyengar2020_Article_InvestigatingExplorationForDee.pdf - Published Version


Abstract

PURPOSE: Concentric tube robots are composed of multiple concentric, pre-curved, super-elastic, telescopic tubes that are compliant and have a small diameter, making them suitable for minimally invasive interventions such as fetal surgery. Combinations of tube rotation and extension alter the robot's shape, but the inverse kinematics are complex to model because of the difficulty of incorporating friction, other tube interactions, and manufacturing imperfections. We propose a model-free reinforcement learning approach to form the inverse kinematics solution and directly obtain a control policy.

METHOD: Three exploration strategies are investigated for deep deterministic policy gradient with hindsight experience replay for concentric tube robots in simulation environments. The aim is to overcome the joint-to-Cartesian sampling bias and to scale with the number of robotic tubes. To compare strategies, the trained policy network is evaluated on selected Cartesian goals and the associated errors are analyzed. The learned control policy is demonstrated with trajectory following tasks.

RESULTS: Separation of extension and rotation joints for Gaussian exploration is required to overcome Cartesian sampling bias. Parameter noise and Ornstein-Uhlenbeck noise were found to be the optimal strategies, with less than 1 mm error in all simulation environments. With the policy learned under the optimal exploration strategy, various trajectories can be followed at high joint extension values. In evaluation, our inverse kinematics solver has 0.44 mm extension error and [Formula: see text] rotation error.

CONCLUSION: We demonstrate the feasibility of effective model-free control for concentric tube robots. Directly using the control policy, arbitrary trajectories can be followed, and this is an important step towards overcoming the challenge of concentric tube robot control for clinical use in minimally invasive interventions.
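
As a rough illustration of two of the action-space exploration strategies named in the abstract, the sketch below shows Gaussian noise with separate scales for extension and rotation joints, and an Ornstein-Uhlenbeck process. It is not the authors' implementation: the function and class names, the joint ordering, and all parameter values (`sigma_ext`, `sigma_rot`, `theta`, `sigma`, `dt`) are assumptions chosen for illustration. Parameter noise, which perturbs the policy network weights rather than the action, is not sketched here.

```python
import math
import random


def gaussian_joint_noise(n_tubes, sigma_ext, sigma_rot, rng):
    """Gaussian noise with separate scales for extension and rotation joints.

    Returns a flat list [ext_1, rot_1, ..., ext_n, rot_n].
    The ordering and the two scales are illustrative assumptions.
    """
    noise = []
    for _ in range(n_tubes):
        noise.append(rng.gauss(0.0, sigma_ext))  # extension joint
        noise.append(rng.gauss(0.0, sigma_rot))  # rotation joint
    return noise


class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise that drifts back towards mu.

    Discretised update: x += theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
    """

    def __init__(self, size, theta=0.15, sigma=0.2, mu=0.0, dt=1.0, rng=None):
        self.size = size
        self.theta = theta
        self.sigma = sigma
        self.mu = mu
        self.dt = dt
        self.rng = rng if rng is not None else random.Random()
        self.reset()

    def reset(self):
        # Start each episode at the long-run mean.
        self.state = [self.mu] * self.size

    def sample(self):
        scale = self.sigma * math.sqrt(self.dt)
        self.state = [
            x + self.theta * (self.mu - x) * self.dt + scale * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def add_noise(action, noise):
    """Perturb a deterministic policy action element-wise."""
    return [a + n for a, n in zip(action, noise)]
```

In a training loop, the noise would be added to the action produced by the deterministic policy before it is sent to the simulated robot, for example `add_noise(action, gaussian_joint_noise(3, 0.001, 0.05, random.Random(0)))` for a hypothetical three-tube robot.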

Type: Article
Title: Investigating exploration for deep reinforcement learning of concentric tube robot control
Location: Germany
Open access status: An open access version is available from UCL Discovery
DOI: 10.1007/s11548-020-02194-z
Publisher version: http://doi.org/10.1007/s11548-020-02194-z
Language: English
Additional information: This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
Keywords: Concentric tube robots, Deep reinforcement learning, Robot control, Surgical robotics
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10102276