UCL Discovery

MonoLoT: Self-Supervised Monocular Depth Estimation in Low-Texture Scenes for Automatic Robotic Endoscopy

He, Q; Feng, G; Bano, S; Stoyanov, D; Zuo, S; (2024) MonoLoT: Self-Supervised Monocular Depth Estimation in Low-Texture Scenes for Automatic Robotic Endoscopy. IEEE Journal of Biomedical and Health Informatics. DOI: 10.1109/JBHI.2024.3423791. (In press). Green open access.

Abstract

The self-supervised monocular depth estimation framework is well-suited for medical images that lack ground-truth depth, such as those from digestive endoscopes, facilitating navigation and 3D reconstruction in the gastrointestinal tract. However, this framework faces several limitations, including poor performance in low-texture environments, limited generalisation to real-world datasets, and unclear applicability in downstream tasks like visual servoing. To tackle these challenges, we propose MonoLoT, a self-supervised monocular depth estimation framework featuring two key innovations: point matching loss and batch image shuffle. Extensive ablation studies on two publicly available datasets, namely C3VD and SimCol, have shown that methods enabled by MonoLoT achieve substantial improvements, with accuracies of 0.944 on C3VD and 0.959 on SimCol, surpassing both depth-supervised and self-supervised baselines on C3VD. Qualitative evaluations on real-world endoscopic data underscore the generalisation capabilities of our methods, outperforming both depth-supervised and self-supervised baselines. To demonstrate the feasibility of using monocular depth estimation for visual servoing, we have successfully integrated our method into a proof-of-concept robotic platform, enabling real-time automatic intervention and control in digestive endoscopy. In summary, our method represents a significant advancement in monocular depth estimation for digestive endoscopy, overcoming key challenges and opening promising avenues for medical applications.

Type: Article
Title: MonoLoT: Self-Supervised Monocular Depth Estimation in Low-Texture Scenes for Automatic Robotic Endoscopy
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/JBHI.2024.3423791
Publisher version: http://dx.doi.org/10.1109/jbhi.2024.3423791
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Estimation, Endoscopes, Training, Data models, Robots, Feature extraction, Image reconstruction
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10194872