Anastasiou, D; Caramalau, R; Sirajudeen, N; Boal, M; Edwards, P; Collins, J; Kelly, J; ... Mazomenos, EB; (2026) Exploring Pre-training Across Domains for Few-Shot Surgical Skill Assessment. In: Bhattarai, B and Rau, A and Caramalau, R and Reinke, A and Nguyen, A and Namburete, A and Gyawali, P and Stoyanov, D (eds.), Data Engineering in Medical Imaging. (pp. 212-222). Springer: Cham, Switzerland.
Text: 2509.09327v1 (1).pdf - Accepted Version. Access restricted to UCL open access staff until 13 October 2026.
Abstract
Automated surgical skill assessment (SSA) is a central task in surgical computer vision. Developing robust SSA models is challenging due to the scarcity of skill annotations, which are time-consuming to produce and require expert consensus. Few-shot learning (FSL) offers a scalable alternative, enabling model development with minimal supervision, though its success depends critically on effective pre-training. While widely studied for several surgical downstream tasks, pre-training remains largely unexplored in SSA. In this work, we formulate SSA as a few-shot task and investigate how self-supervised pre-training strategies affect downstream few-shot SSA performance. We annotate a publicly available robotic surgery dataset with Objective Structured Assessment of Technical Skill (OSATS) scores, and evaluate various pre-training sources across three few-shot settings. We quantify domain similarity and analyze how the domain gap and the inclusion of procedure-specific data in pre-training influence transferability. Our results show that small but domain-relevant datasets can outperform large-scale, less aligned ones, achieving accuracies of 60.16%, 66.03%, and 73.65% in the 1-, 2-, and 5-shot settings, respectively. Moreover, incorporating procedure-specific data into pre-training alongside a domain-relevant external dataset significantly boosts downstream performance, with an average gain of +1.22% in accuracy and +2.28% in F1-score; applying the same strategy with less similar but large-scale sources can instead degrade performance. Code and models are available at ssa-fsl.
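The 1-, 2-, and 5-shot settings mentioned above follow the standard episodic few-shot protocol: a handful of labelled examples per class form a support set, and the remaining examples are classified against it. The sketch below illustrates one such evaluation, assuming frozen pre-trained embeddings and a nearest-prototype classifier; the embedding dimension, the synthetic data, and the classifier choice are illustrative assumptions, not the paper's actual method.

```python
# Minimal sketch of N-shot episodic evaluation with a nearest-prototype
# classifier over frozen pre-trained embeddings. All specifics here
# (128-d features, three skill classes, synthetic data, the prototype
# classifier) are illustrative assumptions, not the paper's protocol.
import numpy as np

rng = np.random.default_rng(0)

def sample_episode(labels, n_shot, rng):
    """Pick n_shot support indices per class; the rest become queries."""
    support_idx, query_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        support_idx.extend(idx[:n_shot])
        query_idx.extend(idx[n_shot:])
    return np.array(support_idx), np.array(query_idx)

def nearest_prototype_accuracy(feats, labels, n_shot, rng):
    """Classify each query by its nearest class centroid in feature space."""
    sup, qry = sample_episode(labels, n_shot, rng)
    classes = np.unique(labels)
    protos = np.stack(
        [feats[sup][labels[sup] == c].mean(axis=0) for c in classes]
    )
    dists = np.linalg.norm(feats[qry][:, None, :] - protos[None, :, :], axis=-1)
    preds = classes[dists.argmin(axis=1)]
    return (preds == labels[qry]).mean()

# Synthetic stand-in for frozen embeddings of skill-labelled video clips,
# with a per-class offset so the classes are roughly separable.
feats = rng.normal(size=(90, 128)) + np.repeat(np.arange(3), 30)[:, None]
labels = np.repeat(np.arange(3), 30)  # e.g. novice / intermediate / expert

for n_shot in (1, 2, 5):
    accs = [nearest_prototype_accuracy(feats, labels, n_shot, rng)
            for _ in range(20)]
    print(f"{n_shot}-shot accuracy: {np.mean(accs):.2%}")
```

Averaging over many randomly sampled episodes, as in the loop above, is what makes the reported per-setting accuracies stable despite the tiny support sets.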