He, Q; Feng, G; Bano, S; Stoyanov, D; Zuo, S; (2024) MonoLoT: Self-Supervised Monocular Depth Estimation in Low-Texture Scenes for Automatic Robotic Endoscopy. IEEE Journal of Biomedical and Health Informatics 10.1109/JBHI.2024.3423791. (In press).
Abstract
The self-supervised monocular depth estimation framework is well-suited for medical images that lack ground-truth depth, such as those from digestive endoscopes, facilitating navigation and 3D reconstruction in the gastrointestinal tract. However, this framework faces several limitations, including poor performance in low-texture environments, limited generalisation to real-world datasets, and unclear applicability in downstream tasks like visual servoing. To tackle these challenges, we propose MonoLoT, a self-supervised monocular depth estimation framework featuring two key innovations: point matching loss and batch image shuffle. Extensive ablation studies on two publicly available datasets, namely C3VD and SimCol, have shown that methods enabled by MonoLoT achieve substantial improvements, with accuracies of 0.944 on C3VD and 0.959 on SimCol, surpassing both depth-supervised and self-supervised baselines on C3VD. Qualitative evaluations on real-world endoscopic data underscore the generalisation capabilities of our methods, outperforming both depth-supervised and self-supervised baselines. To demonstrate the feasibility of using monocular depth estimation for visual servoing, we have successfully integrated our method into a proof-of-concept robotic platform, enabling real-time automatic intervention and control in digestive endoscopy. In summary, our method represents a significant advancement in monocular depth estimation for digestive endoscopy, overcoming key challenges and opening promising avenues for medical applications.
Type: | Article |
---|---|
Title: | MonoLoT: Self-Supervised Monocular Depth Estimation in Low-Texture Scenes for Automatic Robotic Endoscopy |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1109/JBHI.2024.3423791 |
Publisher version: | http://dx.doi.org/10.1109/jbhi.2024.3423791 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | Estimation, Endoscopes, Training, Data models, Robots, Feature extraction, Image reconstruction |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10194872 |