UCL Discovery

Image Completion Network Considering Global and Local Information

Penn, Alan; Lin, Yubo; Chen, Ke; (2025) Image Completion Network Considering Global and Local Information. Buildings, 15 (20), Article 3746. 10.3390/buildings15203746. Green open access

Text: buildings-15-03746-v2.pdf - Published Version (1MB)

Abstract

Accurate depth image inpainting in complex urban environments remains a critical challenge due to occlusions, reflections, and sensor limitations, which often result in significant data loss. We propose a hybrid deep learning framework that explicitly combines local and global modeling through Convolutional Neural Networks (CNNs) and Transformer modules. The model employs a multi-branch parallel architecture, where the CNN branch captures fine-grained local textures and edges, while the Transformer branch models global semantic structures and long-range dependencies. We introduce an optimized attention mechanism, Agent Attention, which differs from existing efficient/linear attention methods by using learnable proxy tokens tailored for urban scene categories (e.g., façades, sky, ground). A content-guided dynamic fusion module adaptively combines multi-scale features to enhance structural alignment and texture recovery. The framework is trained with a composite loss function incorporating pixel accuracy, perceptual similarity, adversarial realism, and structural consistency. Extensive experiments on the Paris StreetView dataset demonstrate that the proposed method achieves state-of-the-art performance, outperforming existing approaches in PSNR, SSIM, and LPIPS metrics. The study highlights the potential of multi-scale modeling for urban depth inpainting and discusses challenges in real-world deployment, ethical considerations, and future directions for multimodal integration.
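The abstract does not spell out how Agent Attention is computed, so the following is only a minimal sketch of the general proxy-token pattern it alludes to: a small set of agent tokens first gathers information from all keys and values, and every query then reads from those agents instead of from every token. The function names, the two-stage formulation, and the fixed random agent tokens are assumptions made for illustration; in the paper the proxy tokens are learnable and tailored to urban scene categories, which this sketch does not attempt to reproduce.

```python
import math
import random


def softmax_rows(matrix):
    """Numerically stable softmax applied to each row."""
    result = []
    for row in matrix:
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def scaled_scores(a, b, dim):
    """Dot-product scores a @ b^T scaled by 1/sqrt(dim)."""
    b_t = [list(col) for col in zip(*b)]
    scale = 1.0 / math.sqrt(dim)
    return [[s * scale for s in row] for row in matmul(a, b_t)]


def agent_attention(q, k, v, agents):
    """Two-stage attention through proxy (agent) tokens.

    q, k, v: n_tokens x dim lists; agents: n_agents x dim list.
    Assumed formulation, not taken from the paper's code.
    """
    dim = len(q[0])
    # Stage 1: each agent attends over all keys and aggregates the values.
    agent_values = matmul(softmax_rows(scaled_scores(agents, k, dim)), v)
    # Stage 2: each query attends over the agents and reads their summaries.
    return matmul(softmax_rows(scaled_scores(q, agents, dim)), agent_values)


# Toy example: 6 tokens, feature size 4, 2 agent tokens (hypothetical sizes).
random.seed(0)
make = lambda rows, cols: [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]
tokens_q, tokens_k, tokens_v = make(6, 4), make(6, 4), make(6, 4)
agent_tokens = make(2, 4)  # fixed random stand-ins for learnable proxy tokens
out = agent_attention(tokens_q, tokens_k, tokens_v, agent_tokens)
print(len(out), len(out[0]))  # prints "6 4"
```

With n tokens and m agents, the two score matrices have sizes m x n and n x m rather than n x n, which is why this pattern is usually grouped with efficient attention methods when m is much smaller than n.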

Type: Article
Title: Image Completion Network Considering Global and Local Information
Open access status: An open access version is available from UCL Discovery
DOI: 10.3390/buildings15203746
Publisher version: https://doi.org/10.3390/buildings15203746
Language: English
Additional information: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Keywords: Image inpainting; depth completion; multi-scale modeling; Transformer-CNN fusion; urban scene understanding
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment > The Bartlett School of Architecture
URI: https://discovery.ucl.ac.uk/id/eprint/10215771