García Peraza Herrera, Luis Carlos;
(2021)
Deep Learning for Real-time Image Understanding in Endoscopic Vision.
Doctoral thesis (Ph.D), UCL (University College London).
Abstract
Understanding what is happening in endoscopic scenes, while it is happening, is a key problem in Computer-Assisted Interventions (CAI). It is through semantic information extraction that we will be able to characterise, and eventually improve on, current clinical practice. In this thesis, we focus on three main topics: real-time segmentation of surgical tools, automatic generation of synthetic instrument segmentation labels, and tissue classification.

In most endoscopic scenes, surgical instruments play a crucial role, and instrument segmentation has a wealth of potential applications: it is already, or in some cases is bound to become, an essential building block of many computer-assisted clinical systems. These applications require real-time segmentation, and we introduce two approaches to achieve it: one multiplexes deep learning with optical flow, and the other employs a lightweight model with deep supervision.

In keyhole surgery, surgeons cannot manipulate the endoscope themselves, as their hands are occupied with other instruments, and relying on an additional human camera operator poses a number of disadvantages. As an exemplar application of tool segmentation, we introduce a method that employs it as a building block for autonomous robotic endoscopy.

Throughout our experimentation, we notice that even though we are able to achieve real-time segmentations, their quality is limited by the scarcity of labelled data, which leads to poor generalisation to different endoscopic domains. We therefore propose a self-supervised mechanism to automate the generation of synthetic instrument segmentation labels.

Equally important to segmenting the surgical instruments is understanding the rest of the surgical scene. As an exemplar addressing this need, we propose the first open dataset for endoscopic detection of early squamous cell neoplasia and provide a baseline method to distinguish between normal and abnormal video frames.

However, in a clinical context, we are also required to explain the behaviour of our network. We propose a method to display network activations that allows us to study whether the input features the network attends to align with the endoscopic markers employed in clinical practice.
| Type | Thesis (Doctoral) |
|---|---|
| Qualification | Ph.D |
| Title | Deep Learning for Real-time Image Understanding in Endoscopic Vision |
| Event | UCL (University College London) |
| Open access status | An open access version is available from UCL Discovery |
| Language | English |
| Additional information | Copyright © The Author 2021. Original content in this thesis is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author's request. |
| UCL classification | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Med Phys and Biomedical Eng |
| URI | https://discovery.ucl.ac.uk/id/eprint/10127360 |