Godard, C; Aodha, OM; Firman, M; Brostow, G; (2019) Digging into self-supervised monocular depth estimation. In: Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV). (pp. 3827-3837). The Institute of Electrical and Electronics Engineers (IEEE)

Abstract
Per-pixel ground-truth depth data is challenging to acquire at scale. To overcome this limitation, self-supervised learning has emerged as a promising alternative for training models to perform monocular depth estimation. In this paper, we propose a set of improvements, which together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods. Research on self-supervised monocular training usually explores increasingly complex architectures, loss functions, and image formation models, all of which have recently helped to close the gap with fully-supervised methods. We show that a surprisingly simple model, and associated design choices, lead to superior predictions. In particular, we propose (i) a minimum reprojection loss, designed to robustly handle occlusions, (ii) a full-resolution multi-scale sampling method that reduces visual artifacts, and (iii) an auto-masking loss to ignore training pixels that violate camera motion assumptions. We demonstrate the effectiveness of each component in isolation, and show high quality, state-of-the-art results on the KITTI benchmark.
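
As a rough illustration of contributions (i) and (iii), the sketch below shows one plausible way a per-pixel minimum reprojection loss and an auto-mask could be computed. It is not the authors' implementation. The function names, the flat-list image representation, the use of a plain absolute difference as the photometric error, and the averaging over kept pixels are all assumptions made for this example. The warped source frames are taken as given, since the abstract does not describe the depth and pose networks that would produce them.

```python
from typing import List, Sequence, Tuple


def l1_error(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Per-pixel absolute difference between two equally sized images (flat lists)."""
    return [abs(x - y) for x, y in zip(a, b)]


def min_reprojection_loss(
    target: Sequence[float],
    warped_sources: Sequence[Sequence[float]],
    raw_sources: Sequence[Sequence[float]],
) -> Tuple[float, List[bool]]:
    """Hypothetical helper: minimum reprojection loss with auto-masking.

    target         -- the frame whose depth is being learned
    warped_sources -- source frames already reprojected into the target view
    raw_sources    -- the same source frames, not reprojected
    """
    # (i) For each pixel, keep the smallest error over all warped sources, so a
    # pixel occluded in one source can still be matched in another.
    warped_err = [min(e) for e in zip(*(l1_error(target, w) for w in warped_sources))]

    # (iii) Error against the unwarped sources. If this is already no worse than
    # the warped error, the pixel looks static across frames and is masked out.
    raw_err = [min(e) for e in zip(*(l1_error(target, s) for s in raw_sources))]
    mask = [w < r for w, r in zip(warped_err, raw_err)]

    # Assumption: average the loss over the pixels that survive the mask.
    kept = [w for w, keep in zip(warped_err, mask) if keep]
    loss = sum(kept) / len(kept) if kept else 0.0
    return loss, mask
```

For example, with a three-pixel target `[10, 20, 30]` whose last pixel is identical in every source frame, that pixel is excluded by the mask and only the first two contribute to the loss.
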
| Type: | Proceedings paper | 
|---|---|
| Title: | Digging into self-supervised monocular depth estimation | 
| Event: | 2019 IEEE/CVF International Conference on Computer Vision (ICCV) | 
| ISBN-13: | 978-1-7281-4803-8 | 
| Open access status: | An open access version is available from UCL Discovery | 
| DOI: | 10.1109/ICCV.2019.00393 | 
| Publisher version: | https://doi.org/10.1109/ICCV.2019.00393 | 
| Language: | English | 
| Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions. | 
| Keywords: | Training, Estimation, Predictive models, Cameras, Image color analysis, Image reconstruction, Image matching | 
| UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science | 
| URI: | https://discovery.ucl.ac.uk/id/eprint/10117263 | 
