
Deep Learning for Real-time Image Understanding in Endoscopic Vision

García Peraza Herrera, Luis Carlos (2021). Deep Learning for Real-time Image Understanding in Endoscopic Vision. Doctoral thesis (Ph.D), UCL (University College London). Green open access.

Understanding what is happening in endoscopic scenes while it is happening is a key problem in Computer-Assisted Interventions (CAI). It is through semantic information extraction that we will be able to characterise, and eventually improve on, current clinical practice. In this thesis, we focus on three main topics: real-time segmentation of surgical tools, automatic generation of synthetic instrument segmentation labels, and tissue classification. In most endoscopic scenes, surgical instruments play a crucial role. Instrument segmentation has a wealth of potential applications. It is already, or in some cases is bound to become, an essential building block of many computer-assisted clinical systems. These applications require real-time segmentation. Two approaches are introduced to achieve it: one multiplexing deep learning with optical flow, and another employing a lightweight model with deep supervision. In keyhole surgery, surgeons cannot manipulate the endoscope themselves because their hands are occupied with other instruments, and relying on an additional human camera operator has a number of disadvantages. As an exemplar application of tool segmentation, a method that employs it as a building block for autonomous robotic endoscopy is introduced. Throughout our experimentation, we notice that even though we are able to achieve real-time segmentation, its quality is limited by the scarcity of labelled data, which leads to poor generalisation across endoscopic domains. We propose a self-supervised mechanism to automate the generation of synthetic instrument segmentation labels. Understanding the rest of the surgical scene is as important as segmenting the surgical instruments. As an exemplar addressing this need, we propose the first open dataset for endoscopic detection of early squamous cell neoplasia and provide a baseline method to distinguish between normal and abnormal video frames. However, in a clinical context, we are also required to explain the behaviour of our network. We propose a method to display network activation that allows us to study whether the input features looked at by the network align with the endoscopic markers employed in clinical practice.

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Deep Learning for Real-time Image Understanding in Endoscopic Vision
Event: UCL (University College London)
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Copyright © The Author 2021. Original content in this thesis is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Med Phys and Biomedical Eng
URI: https://discovery.ucl.ac.uk/id/eprint/10127360