UCL Discovery

iMAPPER: Interaction-guided Scene Mapping from Monocular Videos

Monszpart, A; Guerrero, P; Ceylan, D; Yumer, E; Mitra, NJ (2019) iMAPPER: Interaction-guided Scene Mapping from Monocular Videos. ACM Transactions on Graphics, 38 (4), Article 92. 10.1145/3306346.3322961. Green open access

1806.07889v1.pdf - Accepted Version

Abstract

Next-generation smart and augmented reality systems demand a computational understanding of monocular footage that captures humans in physical spaces to reveal plausible object arrangements and human-object interactions. Despite recent advances, both in scene layout and human motion analysis, the above setting remains challenging to analyze due to regular occlusions that occur between objects and human motions. We observe that object arrangements and human actions are often strongly correlated, and hence can be used to help recover from these occlusions. We present iMapper, a data-driven method to identify such human-object interactions and utilize them to infer layouts of occluded objects. Starting from a monocular video with detected 2D human joint positions that are potentially noisy and occluded, we first introduce the notion of interaction-saliency as space-time snapshots where informative human-object interactions happen. Then, we propose a global optimization to retrieve and fit interactions from a database to the detected salient interactions in order to best explain the input video. We extensively evaluate the approach, both quantitatively against manually annotated ground truth and through a user study, and demonstrate that iMapper produces plausible scene layouts for scenes with medium to heavy occlusion. Code and data are available on the project page.

Type: Article
Title: iMAPPER: Interaction-guided Scene Mapping from Monocular Videos
Open access status: An open access version is available from UCL Discovery
DOI: 10.1145/3306346.3322961
Publisher version: https://doi.org/10.1145/3306346.3322961
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: shape analysis, interaction, scene layout, 3D pose estimation, monocular video, occlusion
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment > Bartlett School Env, Energy and Resources
URI: https://discovery.ucl.ac.uk/id/eprint/10081679