Pedestrian Intention Prediction via Vision-Language Foundation Models

Azarmi, Mohsen; Rezaei, Mandi; Wang, He; (2025) Pedestrian Intention Prediction via Vision-Language Foundation Models. In: Proceedings of the 2025 IEEE Intelligent Vehicles Symposium (IV). (pp. 1899-1904). IEEE: Cluj-Napoca, Romania. Green open access

PDF: 2507.04141v1.pdf - Accepted Version (931kB)

Abstract

Predicting pedestrian crossing intention is a critical function of autonomous vehicles. Conventional vision-based methods for crossing intention prediction often struggle with generalisability, context understanding, and causal reasoning. This study explores the potential of vision-language foundation models (VLFMs) for predicting pedestrian crossing intentions by integrating multimodal data through hierarchical prompt templates. The methodology incorporates contextual information, including visual frames, observations of physical cues, and ego-vehicle dynamics, into systematically refined prompts that guide VLFMs in intention prediction. Experiments were conducted on three common datasets: JAAD, PIE, and FU-PIP. Results demonstrate that incorporating vehicle speed, its variations over time, and time-conscious prompts significantly enhances prediction accuracy, by up to 19.8%. Additionally, optimised prompts generated via an automatic prompt engineering framework yielded a further 12.5% accuracy gain. These findings highlight the superior performance of VLFMs compared to conventional vision-based models, offering enhanced generalisation and contextual understanding for autonomous driving applications.
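To make the idea of combining physical cues, ego-vehicle speed, and time-aware wording in one text prompt more concrete, the following is a minimal Python sketch. It is not the authors' template: the `Observation` fields, the `speed_trend` and `build_prompt` helpers, the 1 km/h tolerance, and all prompt wording are hypothetical and chosen only for illustration. The visual frames that a VLFM would receive alongside the text are omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Observation:
    """One time step of context for a single pedestrian (hypothetical fields)."""
    time_s: float          # seconds relative to the latest frame (0 = now)
    ego_speed_kmh: float   # ego-vehicle speed at this time step
    cues: Sequence[str]    # free-text physical cues, e.g. "looking at vehicle"


def speed_trend(observations: List[Observation], tol_kmh: float = 1.0) -> str:
    """Summarise the change in ego speed between the first and last step."""
    delta = observations[-1].ego_speed_kmh - observations[0].ego_speed_kmh
    if delta > tol_kmh:
        return "accelerating"
    if delta < -tol_kmh:
        return "decelerating"
    return "steady"


def build_prompt(observations: List[Observation]) -> str:
    """Assemble a layered prompt: task, time-stamped context, trend, answer format."""
    if not observations:
        raise ValueError("at least one observation is required")
    lines = [
        "Task: decide whether the highlighted pedestrian intends to cross "
        "in front of the vehicle.",
        "Context (oldest to newest):",
    ]
    for obs in observations:
        cues = ", ".join(obs.cues) if obs.cues else "none noted"
        lines.append(
            f"  t={obs.time_s:+.1f}s | ego speed {obs.ego_speed_kmh:.0f} km/h "
            f"| cues: {cues}"
        )
    lines.append(f"Ego-vehicle speed trend: {speed_trend(observations)}.")
    lines.append("Answer with exactly one word: crossing or not-crossing.")
    return "\n".join(lines)


if __name__ == "__main__":
    history = [
        Observation(-1.0, 30.0, ["standing at kerb"]),
        Observation(0.0, 22.0, ["looking at vehicle", "stepping forward"]),
    ]
    print(build_prompt(history))
```

Keeping each layer (task, time-stamped context, speed trend, answer format) on its own line makes it straightforward to drop or reword one layer at a time and compare the resulting predictions.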

Type:	Proceedings paper
Title:	Pedestrian Intention Prediction via Vision-Language Foundation Models
Event:	36th Intelligent Vehicles Symposium-IV-Annual
Location:	ROMANIA, Cluj-Napoca
Dates:	22 Jun 2025 - 25 Jun 2025
ISBN-13:	979-8-3315-3804-0
Open access status:	An open access version is available from UCL Discovery
DOI:	10.1109/IV64158.2025.11097349
Publisher version:	https://doi.org/10.1109/iv64158.2025.11097349
Language:	English
Additional information:	This version is the author-accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification:	UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI:	https://discovery.ucl.ac.uk/id/eprint/10214825
