UCL Discovery

MSDESIS: Multi-task stereo disparity estimation and surgical instrument segmentation

Psychogyios, Dimitrios; Mazomenos, Evangelos; Vasconcelos, Francisco; Stoyanov, Danail; (2022) MSDESIS: Multi-task stereo disparity estimation and surgical instrument segmentation. IEEE Transactions on Medical Imaging 10.1109/TMI.2022.3181229. (In press). Green open access

Abstract

Reconstructing the 3D geometry of the surgical site and detecting instruments within it are important tasks for surgical navigation systems and robotic surgery automation. Traditional approaches treat each problem in isolation and do not account for the intrinsic relationship between segmentation and stereo matching. In this paper, we present a learning-based framework that jointly estimates disparity and binary tool segmentation masks. The core component of our architecture is a shared feature encoder that allows strong interaction between the two tasks. Experimentally, we train two variants of our network with different capacities and explore different training schemes, including both multi-task and single-task learning. Our results show that supervising the segmentation task improves our network's disparity estimation accuracy. We demonstrate a domain adaptation scheme in which we supervise the segmentation task with monocular data and achieve domain adaptation of the adjacent disparity task, reducing disparity End-Point-Error and depth mean absolute error by 77.73% and 61.73% respectively compared to the pre-trained baseline model. Our best overall multi-task model, trained with both disparity and segmentation data in successive phases, achieves 89.15% mean Intersection-over-Union on the RIS test set and 3.18 millimetre depth mean absolute error on the SCARED test set. Our proposed multi-task architecture runs in real time, processing 1280x1024 stereo input and simultaneously estimating disparity maps and segmentation masks at 22 frames per second. The model code and pre-trained models are available at https://github.com/dimitrisPs/msdesis.
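
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of their standard textbook definitions: disparity End-Point-Error, binary Intersection-over-Union, and the rectified-stereo relation between disparity and depth. It is not taken from the authors' repository. The function names, the use of `None` to mark pixels without ground truth, and the flat-list image representation are all assumptions made for illustration, and the paper's exact evaluation protocol (valid-pixel masks, per-frame averaging) may differ.

```python
from typing import Optional, Sequence


def end_point_error(pred: Sequence[float], gt: Sequence[Optional[float]]) -> float:
    """Mean absolute disparity difference over pixels that have ground truth.

    Assumption: pixels without ground truth are marked as None.
    """
    errors = [abs(p - g) for p, g in zip(pred, gt) if g is not None]
    if not errors:
        raise ValueError("no valid ground-truth pixels")
    return sum(errors) / len(errors)


def binary_iou(pred: Sequence[int], gt: Sequence[int]) -> float:
    """Intersection-over-Union of two binary masks given as flat 0/1 sequences."""
    intersection = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    # Convention chosen here: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def disparity_to_depth(disparity_px: float, focal_px: float, baseline_mm: float) -> float:
    """Depth in millimetres for a rectified stereo pair: depth = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_mm / disparity_px


if __name__ == "__main__":
    # Toy values, not data from the paper.
    print(end_point_error([10.0, 20.0, 30.0, 41.0], [10.0, 22.0, None, 40.0]))
    print(binary_iou([1, 1, 0, 0], [1, 0, 1, 0]))
    print(disparity_to_depth(20.0, focal_px=1000.0, baseline_mm=4.0))
```

A mean IoU figure would typically average `binary_iou` over frames or classes, and a depth mean absolute error would average the absolute difference between `disparity_to_depth` outputs and reference depths; which averaging the authors use is not stated in the abstract.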

Type: Article
Title: MSDESIS: Multi-task stereo disparity estimation and surgical instrument segmentation
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/TMI.2022.3181229
Publisher version: https://doi.org/10.1109/TMI.2022.3181229
Language: English
Additional information: © 2022 IEEE. This work is licensed under a Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/).
Keywords: Computer assisted interventions, Computational stereo, Surgical vision, Instrument segmentation
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10150235