Digging into self-supervised monocular depth estimation

Godard, C; Mac Aodha, O; Firman, M; Brostow, G; (2019) Digging into self-supervised monocular depth estimation. In: Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV). (pp. 3827-3837). The Institute of Electrical and Electronics Engineers (IEEE). Green open access

1806.01260.pdf - Accepted Version

Abstract

Per-pixel ground-truth depth data is challenging to acquire at scale. To overcome this limitation, self-supervised learning has emerged as a promising alternative for training models to perform monocular depth estimation. In this paper, we propose a set of improvements, which together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods. Research on self-supervised monocular training usually explores increasingly complex architectures, loss functions, and image formation models, all of which have recently helped to close the gap with fully-supervised methods. We show that a surprisingly simple model, and associated design choices, lead to superior predictions. In particular, we propose (i) a minimum reprojection loss, designed to robustly handle occlusions, (ii) a full-resolution multi-scale sampling method that reduces visual artifacts, and (iii) an auto-masking loss to ignore training pixels that violate camera motion assumptions. We demonstrate the effectiveness of each component in isolation, and show high quality, state-of-the-art results on the KITTI benchmark.
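Contributions (i) and (iii) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: the photometric error is a plain per-pixel absolute difference on grayscale values, the source frames have already been warped into the target view by some external step, and a pixel is treated as violating the camera motion assumption when the unwarped source frame matches the target at least as well as the warped one. All function names are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image or error map, rows x cols


def l1_error(target: Image, other: Image) -> Image:
    """Per-pixel absolute difference (assumed stand-in for the photometric error)."""
    return [[abs(t - o) for t, o in zip(t_row, o_row)]
            for t_row, o_row in zip(target, other)]


def per_pixel_min(maps: List[Image]) -> Image:
    """Per-pixel minimum across several error maps of equal shape."""
    return [[min(vals) for vals in zip(*rows)] for rows in zip(*maps)]


def min_reprojection_loss(target: Image,
                          warped_sources: List[Image],
                          raw_sources: List[Image]) -> float:
    """Mean of the per-pixel minimum reprojection error over unmasked pixels.

    warped_sources: source frames already warped into the target view (assumed given).
    raw_sources:    the same source frames, unwarped, used only for the mask.
    """
    # Taking the minimum (rather than the mean) over source frames means a pixel
    # occluded in one source can still be explained by another source.
    reproj = per_pixel_min([l1_error(target, w) for w in warped_sources])
    identity = per_pixel_min([l1_error(target, s) for s in raw_sources])

    total, kept = 0.0, 0
    for r_row, i_row in zip(reproj, identity):
        for r, i in zip(r_row, i_row):
            # Assumed masking rule: keep a pixel only if warping improved on
            # simply reusing the unwarped source frame.
            if r < i:
                total += r
                kept += 1
    return total / kept if kept else 0.0
```

With two source frames that each miss a different pixel, the per-pixel minimum picks the frame that sees it, which is the intuition behind handling occlusions this way. A pixel that looks the same in the unwarped source and the target (for example, an object moving with the camera) is dropped by the mask.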

Type: Proceedings paper
Title: Digging into self-supervised monocular depth estimation
Event: 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
ISBN-13: 978-1-7281-4803-8
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/ICCV.2019.00393
Publisher version: https://doi.org/10.1109/ICCV.2019.00393
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
Keywords: Training, Estimation, Predictive models, Cameras, Image color analysis, Image reconstruction, Image matching
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10117263