
Gesture Recognition in Robotic Surgery with Multimodal Attention

Van Amsterdam, B; Funke, I; Edwards, E; Speidel, S; Collins, J; Sridhar, A; Kelly, J; ... Stoyanov, D; (2022) Gesture Recognition in Robotic Surgery with Multimodal Attention. IEEE Transactions on Medical Imaging 10.1109/TMI.2022.3147640. (In press). Green open access

Abstract

Automatically recognising surgical gestures from surgical data is an important building block of automated activity recognition and analytics, technical skill assessment, intra-operative assistance and eventually robotic automation. The complexity of articulated instrument trajectories and the inherent variability due to surgical style and patient anatomy make analysis and fine-grained segmentation of surgical motion patterns from robot kinematics alone very difficult. Surgical video provides crucial information from the surgical site, giving context for the kinematic data and for the interaction between the instruments and tissue. Yet sensor fusion between the robot data and the surgical video stream is non-trivial because the two differ in frequency, dimensionality and discriminative capability. In this paper, we integrate multimodal attention mechanisms in a two-stream temporal convolutional network to compute relevance scores and weight kinematic and visual feature representations dynamically in time, aiming to aid multimodal network training and achieve effective sensor fusion. We report the results of our system on the JIGSAWS benchmark dataset and on a new in vivo dataset of suturing segments from robotic prostatectomy procedures. Our results are promising: the multimodal prediction sequences have higher accuracy and better temporal structure than the corresponding unimodal solutions. Visualization of attention scores also gives physically interpretable insights into the network's understanding of the strengths and weaknesses of each sensor.
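
The idea of computing relevance scores and weighting kinematic and visual features dynamically in time can be illustrated with a minimal sketch. This is not the architecture from the paper: it assumes that both streams have already been resampled to a common frame rate and projected to the same feature dimension, and it uses hypothetical scoring vectors `w_kin` and `w_vis` in place of learned network parameters. The function name `fuse_with_attention` is likewise illustrative only.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_with_attention(
    kin_feats: Sequence[Sequence[float]],
    vis_feats: Sequence[Sequence[float]],
    w_kin: Sequence[float],
    w_vis: Sequence[float],
) -> Tuple[List[List[float]], List[Tuple[float, float]]]:
    """Fuse two feature streams with per-time-step attention weights.

    kin_feats, vis_feats: T x D feature sequences (assumed already aligned
    in time and projected to a shared dimension D).
    w_kin, w_vis: hypothetical D-dimensional scoring vectors standing in
    for learned parameters.
    Returns the fused T x D sequence and the (kinematic, visual) weights.
    """
    if len(kin_feats) != len(vis_feats):
        raise ValueError("streams must have the same number of time steps")

    fused: List[List[float]] = []
    weights: List[Tuple[float, float]] = []
    for k_t, v_t in zip(kin_feats, vis_feats):
        # One relevance score per modality at this time step.
        a_kin, a_vis = softmax([dot(w_kin, k_t), dot(w_vis, v_t)])
        # Convex combination of the two feature vectors.
        fused.append([a_kin * k + a_vis * v for k, v in zip(k_t, v_t)])
        weights.append((a_kin, a_vis))
    return fused, weights


if __name__ == "__main__":
    # Toy data: kinematics is informative at t=0, video at t=1.
    kin = [[1.0, 0.0], [0.0, 0.0]]
    vis = [[0.0, 0.0], [0.0, 2.0]]
    fused, weights = fuse_with_attention(kin, vis, [1.0, 1.0], [1.0, 1.0])
    for t, (a_kin, a_vis) in enumerate(weights):
        print(f"t={t}: kinematic weight={a_kin:.3f}, visual weight={a_vis:.3f}")
```

Because the two weights at each time step sum to one, plotting them over time gives the kind of per-sensor attention trace that the abstract describes as interpretable, although the scores in the paper are produced by the trained network and not by fixed vectors as here.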

Type: Article
Title: Gesture Recognition in Robotic Surgery with Multimodal Attention
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/TMI.2022.3147640
Publisher version: https://doi.org/10.1109/TMI.2022.3147640
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Robots, Kinematics, Robot sensing systems, Needles, Gesture recognition, Surgery, Visualization
UCL classification: UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Div of Surgery and Interventional Sci
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Med Phys and Biomedical Eng
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10144363