UCL Discovery

PitVis-2023 challenge: Workflow recognition in videos of endoscopic pituitary surgery

Das, Adrito; Khan, Danyal Z; Psychogyios, Dimitrios; Zhang, Yitong; Hanrahan, John G; Vasconcelos, Francisco; Pang, You; ... Bano, Sophia (2025) PitVis-2023 challenge: Workflow recognition in videos of endoscopic pituitary surgery. Medical Image Analysis, 106, Article 103716. 10.1016/j.media.2025.103716. Green open access

Text
1-s2.0-S1361841525002634-main.pdf - Published Version


Abstract

The field of computer vision applied to videos of minimally invasive surgery is ever-growing. Workflow recognition is the automated recognition of various aspects of a surgery, including which surgical steps are performed and which surgical instruments are used. This information can later be used to assist clinicians when learning the surgery or during live surgery. The Pituitary Vision (PitVis) 2023 Challenge tasks the community with step and instrument recognition in videos of endoscopic pituitary surgery. This is a particularly challenging task compared to other minimally invasive surgeries because of the smaller working space, which limits and distorts vision, and the higher frequency of instrument and step switching, which requires more precise model predictions. Participants were provided with 25 videos, and results were presented at the MICCAI-2023 conference as part of the Endoscopic Vision 2023 Challenge in Vancouver, Canada, on 8 October 2023. There were 18 submissions from 9 teams across 6 countries, using a variety of deep learning models. The top-performing model for step recognition utilised a transformer-based architecture, uniquely using an autoregressive decoder with a positional encoding input. The top-performing model for instrument recognition utilised a spatial encoder followed by a temporal encoder, which uniquely used a 2-layer temporal architecture. In both cases, these models outperformed purely spatial models, illustrating the importance of sequential and temporal information. PitVis-2023 therefore demonstrates that state-of-the-art computer vision models in minimally invasive surgery are transferable to a new dataset. Benchmark results are provided in the paper, and the dataset is publicly available at: https://doi.org/10.5522/04/26531686.
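
As a rough illustration of why temporal context can help over purely per-frame (spatial) predictions, the sketch below smooths a sequence of frame-level step labels with a sliding-window majority vote. This is not the method of any challenge team: the winning models learn temporal context with neural networks, whereas this is a hand-written stand-in. The function names, the integer step labels, and the toy sequences are all hypothetical.

```python
from collections import Counter


def smooth_predictions(frame_labels, window=3):
    """Replace each frame label with the majority label in a centred window.

    Hypothetical helper for illustration only; `window` must be a positive
    odd integer so that the window is centred on the current frame.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i, current in enumerate(frame_labels):
        # The window is truncated at the start and end of the video.
        start = max(0, i - half)
        stop = min(len(frame_labels), i + half + 1)
        counts = Counter(frame_labels[start:stop])
        best = max(counts.values())
        candidates = [label for label, count in counts.items() if count == best]
        # On a tie, keep the original frame-level label if it is among the winners.
        smoothed.append(current if current in candidates else candidates[0])
    return smoothed


def accuracy(predicted, truth):
    """Fraction of frames whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Toy data (made up): two surgical steps, labelled 0 and 1, six frames each.
truth = [0] * 6 + [1] * 6
# Per-frame predictions with two isolated errors, as a purely spatial model might make.
per_frame = [0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1]

smoothed = smooth_predictions(per_frame, window=3)
print(accuracy(per_frame, truth), accuracy(smoothed, truth))
```

On this toy sequence the isolated single-frame errors are voted out by their neighbours. A fixed window like this would also blur genuine rapid step or instrument switches, which the abstract identifies as frequent in pituitary surgery, so it should be read only as a motivation for learned temporal models.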

Type: Article
Title: PitVis-2023 challenge: Workflow recognition in videos of endoscopic pituitary surgery
Open access status: An open access version is available from UCL Discovery
DOI: 10.1016/j.media.2025.103716
Publisher version: https://doi.org/10.1016/j.media.2025.103716
Language: English
Additional information: © 2025 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Keywords: Endoscopic vision, Instrument recognition, Step recognition, Surgical AI, Surgical vision, Workflow analysis
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > UCL Queen Square Institute of Neurology
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics
URI: https://discovery.ucl.ac.uk/id/eprint/10212075