eprintid: 10184511
rev_number: 14
eprint_status: archive
userid: 699
dir: disk0/10/18/45/11
datestamp: 2024-01-03 17:32:13
lastmod: 2024-05-21 14:43:16
status_changed: 2024-01-03 17:32:13
type: proceedings_section
metadata_visibility: show
sword_depositor: 699
creators_name: Wang, Hai
creators_name: Xiang, Xiaoyu
creators_name: Fan, Yuchen
creators_name: Xue, Jinghao
title: Customizing 360-Degree Panoramas Through Text-to-Image Diffusion Models
ispublished: pub
divisions: UCL
divisions: B04
divisions: C06
divisions: F61
note: This version is the version of record. For information on re-use, please refer to the publisher's terms and conditions.
abstract: Personalized text-to-image (T2I) synthesis based on diffusion models has attracted significant attention in recent research. However, existing methods primarily concentrate on customizing subjects or styles, neglecting the exploration of global geometry. In this study, we propose an approach that focuses on the customization of 360-degree panoramas, which inherently possess global geometric properties, using a T2I diffusion model. To achieve this, we curate a paired image-text dataset specifically designed for the task and subsequently employ it to fine-tune a pre-trained T2I diffusion model with LoRA. Nevertheless, the fine-tuned model alone does not ensure continuity between the leftmost and rightmost sides of the synthesized images, a crucial characteristic of 360-degree panoramas. To address this issue, we propose a method called StitchDiffusion. Specifically, we perform pre-denoising operations twice at each time step of the denoising process on the stitch block, which consists of the leftmost and rightmost image regions. Furthermore, global cropping is adopted to synthesize seamless 360-degree panoramas. Experimental results demonstrate the effectiveness of our customized model combined with the proposed StitchDiffusion in generating high-quality 360-degree panoramic images. Moreover, our customized model exhibits exceptional generalization ability in producing scenes unseen in the fine-tuning dataset. Code is available at https://github.com/littlewhitesea/StitchDiffusion.
date: 2024-04-09
date_type: published
publisher: IEEE
official_url: https://doi.org/10.1109/WACV57701.2024.00486
oa_status: green
full_text_type: pub
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 2135598
doi: 10.1109/WACV57701.2024.00486
lyricists_name: Xue, Jinghao
lyricists_id: JXUEX60
actors_name: Xue, Jinghao
actors_id: JXUEX60
actors_role: owner
full_text_status: public
pres_type: paper
place_of_pub: Waikoloa, HI, USA
pagerange: 4933-4943
event_title: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
event_dates: 4th-8th January 2024
book_title: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
citation: Wang, Hai; Xiang, Xiaoyu; Fan, Yuchen; Xue, Jinghao; (2024) Customizing 360-Degree Panoramas Through Text-to-Image Diffusion Models. In: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). (pp. 4933-4943). IEEE: Waikoloa, HI, USA. Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10184511/1/Wang_Customizing_360-Degree_Panoramas_Through_Text-to-Image_Diffusion_Models_WACV_2024_paper.pdf