eprintid: 10184511
rev_number: 14
eprint_status: archive
userid: 699
dir: disk0/10/18/45/11
datestamp: 2024-01-03 17:32:13
lastmod: 2024-05-21 14:43:16
status_changed: 2024-01-03 17:32:13
type: proceedings_section
metadata_visibility: show
sword_depositor: 699
creators_name: Wang, Hai
creators_name: Xiang, Xiaoyu
creators_name: Fan, Yuchen
creators_name: Xue, Jinghao
title: Customizing 360-Degree Panoramas Through Text-to-Image Diffusion Models
ispublished: pub
divisions: UCL
divisions: B04
divisions: C06
divisions: F61
note: This version is the version of record. For information on re-use, please refer to the publisher's terms and conditions.
abstract: Personalized text-to-image (T2I) synthesis based on diffusion models has attracted significant attention in recent research. However, existing methods primarily concentrate on customizing subjects or styles, neglecting the exploration of global geometry. In this study, we propose an approach that focuses on the customization of 360-degree panoramas, which inherently possess global geometric properties, using a T2I diffusion model. To achieve this, we curate a paired image-text dataset specifically designed for the task and subsequently employ it to fine-tune a pre-trained T2I diffusion model with LoRA. Nevertheless, the fine-tuned model alone does not ensure continuity between the leftmost and rightmost sides of the synthesized images, a crucial characteristic of 360-degree panoramas. To address this issue, we propose a method called StitchDiffusion. Specifically, we perform pre-denoising operations twice at each time step of the denoising process on the stitch block consisting of the leftmost and rightmost image regions. Furthermore, global cropping is adopted to synthesize seamless 360-degree panoramas. Experimental results demonstrate the effectiveness of our customized model combined with the proposed StitchDiffusion in generating high-quality 360-degree panoramic images. Moreover, our customized model exhibits exceptional generalization ability in producing scenes unseen in the fine-tuning dataset. Code is available at https://github.com/littlewhitesea/StitchDiffusion.
date: 2024-04-09
date_type: published
publisher: IEEE
official_url: https://doi.org/10.1109/WACV57701.2024.00486
oa_status: green
full_text_type: pub
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2135598
doi: 10.1109/WACV57701.2024.00486
lyricists_name: Xue, Jinghao
lyricists_id: JXUEX60
actors_name: Xue, Jinghao
actors_id: JXUEX60
actors_role: owner
full_text_status: public
pres_type: paper
place_of_pub: Waikoloa, HI, USA
pagerange: 4933-4943
event_title: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
event_dates: 4-8 January 2024
book_title: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
citation: Wang, Hai; Xiang, Xiaoyu; Fan, Yuchen; Xue, Jinghao (2024) Customizing 360-Degree Panoramas Through Text-to-Image Diffusion Models. In: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). (pp. 4933-4943). IEEE: Waikoloa, HI, USA. Green open access.
document_url: https://discovery.ucl.ac.uk/id/eprint/10184511/1/Wang_Customizing_360-Degree_Panoramas_Through_Text-to-Image_Diffusion_Models_WACV_2024_paper.pdf