TiV-ODE: A Neural ODE-based Approach for Controllable Video Generation from Text-Image Pairs

Xu, Y; Li, N; Goel, A; Yao, Z; Guo, Z; Kasaei, H; Kasaei, M; (2024) TiV-ODE: A Neural ODE-based Approach for Controllable Video Generation from Text-Image Pairs. In: Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA). (pp. 14645-14652). IEEE: Yokohama, Japan. Green open access.

VideoImagination_Yucheng_ICRA2024.pdf - Accepted Version

Abstract

Videos capture the evolution of continuous dynamical systems over time in the form of discrete image sequences. Recently, video generation models have been widely used in robotic research. However, generating controllable videos from image-text pairs is an important yet underexplored research topic in both robotic and computer vision communities. This paper introduces an innovative and elegant framework named TiV-ODE, formulating this task as modeling the dynamical system in a continuous space. Specifically, our framework leverages the ability of Neural Ordinary Differential Equations (Neural ODEs) to model the complex dynamical system depicted by videos as a nonlinear ordinary differential equation. The resulting framework offers control over the generated videos' dynamics, content, and frame rate, a feature not provided by previous methods. Experiments demonstrate the ability of the proposed method to generate highly controllable and visually consistent videos and its capability of modeling dynamical systems. Overall, this work is a significant step towards developing advanced controllable video generation models that can handle complex and dynamic scenes.
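
The frame-rate control mentioned in the abstract follows from treating the video as a continuous-time trajectory: once the dynamics are an ODE, the state can be read out at any set of timestamps. The sketch below illustrates only that general idea and is not the TiV-ODE architecture, which this record does not describe. The `dynamics` function is a hand-written, hypothetical stand-in for a learned network conditioned on a control signal, the 2-D latent state is a toy, and the fixed-step RK4 solver is an assumption chosen for simplicity.

```python
import math


def dynamics(state, t, control):
    """Hypothetical stand-in for a learned network f_theta(z, t, c).

    Rotates a 2-D latent state at an angular speed set by `control`.
    """
    x, y = state
    return (-control * y, control * x)


def rk4_step(f, state, t, dt, control):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    k1 = f(state, t, control)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), t + 0.5 * dt, control)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), t + 0.5 * dt, control)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), t + dt, control)
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def sample_latents(z0, control, duration, fps, substeps=20):
    """Integrate from t=0 and return one latent state per requested frame time."""
    n_frames = int(round(duration * fps)) + 1
    frames = [z0]
    state, t = z0, 0.0
    dt = 1.0 / (fps * substeps)  # solver step, finer than the frame spacing
    for _ in range(n_frames - 1):
        for _ in range(substeps):
            state = rk4_step(dynamics, state, t, dt, control)
            t += dt
        frames.append(state)
    return frames


# Same initial state and control, read out at two different frame rates.
low = sample_latents((1.0, 0.0), control=math.pi, duration=1.0, fps=4)
high = sample_latents((1.0, 0.0), control=math.pi, duration=1.0, fps=8)
```

In this toy setting, every frame of the 4 fps sequence coincides (up to solver error) with every second frame of the 8 fps sequence, because both are samples of the same underlying trajectory. In a full video model, each latent state would additionally be decoded into an image, a step omitted here.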

Type: Proceedings paper
Title: TiV-ODE: A Neural ODE-based Approach for Controllable Video Generation from Text-Image Pairs
Event: 2024 IEEE International Conference on Robotics and Automation (ICRA)
Location: Yokohama, Japan
Dates: 13 May 2024 - 17 May 2024
ISBN-13: 979-8-3503-8458-1
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/ICRA57147.2024.10610149
Publisher version: https://doi.org/10.1109/icra57147.2024.10610149
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Computer vision, Computational modeling, Ordinary differential equations, Aerospace electronics, Mathematical models, Nonlinear dynamical systems, Image sequences
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10210109