UCL Discovery

Keep Your Eye on the Best: Contrastive Regression Transformer for Skill Assessment in Robotic Surgery

Anastasiou, Dimitrios; Mazomenos, Evangelos; Stoyanov, Danail; Jin, Yueming (2023) Keep Your Eye on the Best: Contrastive Regression Transformer for Skill Assessment in Robotic Surgery. IEEE Robotics and Automation Letters, pp. 1-8. DOI: 10.1109/LRA.2023.3242466. Green open access.

Abstract

This letter proposes Contra-Sformer, a novel video-based contrastive regression architecture for automated surgical skill assessment in robot-assisted surgery. The framework is structured to capture the differences in surgical performance between a test video and a reference video that represents optimal surgical execution. A feature extractor combining a spatial component (ResNet-18), supervised at the frame level with gesture labels, and a temporal component (TCN) generates spatio-temporal feature matrices for the test and reference videos. These are then fed into an action-aware Transformer with multi-head attention that produces inter-video contrastive features at the frame level, representative of the skill similarity/deviation between the two videos. Moments of sub-optimal performance can be identified and temporally localized in the resulting feature vectors, which are ultimately used to regress the manually assigned skill scores. Validated on the JIGSAWS dataset, Contra-Sformer achieves competitive performance (Spearman correlation of 0.65 to 0.89), with a normalized mean absolute error between 5.8% and 13.4% on all tasks and across validation setups. Source code and models are available at https://github.com/anastadimi/Contra-Sformer.git.

Type: Article
Title: Keep Your Eye on the Best: Contrastive Regression Transformer for Skill Assessment in Robotic Surgery
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/LRA.2023.3242466
Publisher version: https://doi.org/10.1109/LRA.2023.3242466
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
Keywords: Computer Vision for Medical Robotics, Deep Learning Methods, Surgical Skill Assessment, Contrastive Regression
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Med Phys and Biomedical Eng
URI: https://discovery.ucl.ac.uk/id/eprint/10164755