Firman, MD;
(2016)
Learning to Complete 3D Scenes from Single Depth Images.
Doctoral thesis, UCL (University College London).
Text: main.pdf (93MB)
Abstract
Building a complete 3D model of a scene given only a single depth image is underconstrained. To acquire a full volumetric model, one typically needs either multiple views, or a single view together with a library of unambiguous 3D models that will fit the shape of each individual object in the scene. In this thesis, we present alternative methods for inferring the hidden geometry of table-top scenes. We first introduce two depth-image datasets consisting of multiple scenes, each with a ground truth voxel occupancy grid. We then introduce three methods for predicting voxel occupancy. The first predicts the occupancy of each voxel using a novel feature vector which measures the relationship between the query voxel and surfaces in the scene observed by the depth camera. We use a Random Forest to map each voxel of unknown state to a prediction of occupancy. We observed that predicting the occupancy of each voxel independently can lead to noisy solutions. We hypothesize that objects of dissimilar semantic classes often share similar 3D shape components, enabling a limited dataset to model the shape of a wide range of objects, and hence estimate their hidden geometry. Demonstrating this hypothesis, we propose an algorithm that can make structured completions of unobserved geometry. Finally, we propose an alternative framework for understanding the 3D geometry of scenes using the observation that individual objects can appear in multiple different scenes, but in different configurations. We introduce a supervised method to find regions corresponding to the same object across different scenes. We demonstrate that it is possible to then use these groupings of partially observed objects to reconstruct missing geometry. We then perform a critical review of the approaches we have taken, including an assessment of our metrics and datasets, before proposing extensions and future work.
Type: | Thesis (Doctoral) |
---|---|
Title: | Learning to Complete 3D Scenes from Single Depth Images |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Keywords: | Machine learning, computer vision, RGBD, shape prediction, 3D shape, Kinect |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/1532193 |