UCL Discovery

Self-adapting Large Visual-Language Models to Edge Devices Across Visual Modalities

Cai, K; Duan, Z; Liu, G; Fleming, C; Lu, CX; (2025) Self-adapting Large Visual-Language Models to Edge Devices Across Visual Modalities. In: Leonardis, A and Ricci, E and Roth, S and Russakovsky, O and Sattler, T and Varol, G, (eds.) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). (pp. 301-318). Springer Nature: Cham, Switzerland.

2403.04908v3.pdf - Accepted Version
Access restricted to UCL open access staff until 1 November 2025.

Abstract

Recent advancements in Vision-Language (VL) models have sparked interest in their deployment on edge devices, yet challenges in handling diverse visual modalities, manual annotation, and computational constraints remain. We introduce EdgeVL, a novel framework that bridges this gap by seamlessly integrating dual-modality knowledge distillation and quantization-aware contrastive learning. This approach enables the adaptation of large VL models, like CLIP, for efficient use with both RGB and non-RGB images on resource-limited devices without the need for manual annotations. EdgeVL not only transfers visual language alignment capabilities to compact models but also maintains feature quality post-quantization, significantly enhancing open-vocabulary classification performance across various visual modalities. Our work represents the first systematic effort to adapt large VL models for edge deployment, showcasing up to 15.4% accuracy improvements on multiple datasets and up to 93-fold reduction in model size. Code available at https://github.com/ramdrop/edgevl.
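
The two ingredients named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation (see the linked repository for that); it only assumes that distillation pulls the student's features for both modalities toward a teacher's RGB feature, and that quantization is simulated by a symmetric int8 quantize-then-dequantize step. All function names and the toy feature vectors below are hypothetical.

```python
import math


def fake_quantize(values, num_bits=8):
    """Symmetric uniform quantize-then-dequantize of a list of floats (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / qmax
    # Round to the nearest integer level, clamp to the int range, map back to floats.
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def dual_modality_distillation_loss(teacher_rgb, student_rgb, student_other):
    """Sum of mean squared errors between the teacher's RGB feature and each student feature."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return mse(teacher_rgb, student_rgb) + mse(teacher_rgb, student_other)


# Toy 4-dimensional features (made up for illustration only).
teacher = [0.12, -0.53, 0.80, 0.05]
student_rgb = [0.10, -0.50, 0.78, 0.07]
student_depth = [0.15, -0.49, 0.75, 0.02]

loss = dual_modality_distillation_loss(teacher, student_rgb, student_depth)
quantized = fake_quantize(student_rgb)
similarity = cosine_similarity(student_rgb, quantized)
```

In this sketch, `loss` shrinks as both student features approach the teacher's, and `similarity` gives a rough sense of how much a feature's direction changes after simulated int8 quantization.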

Type: Proceedings paper
Title: Self-adapting Large Visual-Language Models to Edge Devices Across Visual Modalities
Event: Computer Vision – ECCV 2024
ISBN-13: 9783031733895
DOI: 10.1007/978-3-031-73390-1_18
Publisher version: http://dx.doi.org/10.1007/978-3-031-73390-1_18
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10200841