Cai, K;
Duan, Z;
Liu, G;
Fleming, C;
Lu, CX;
(2025)
Self-adapting Large Visual-Language Models to Edge Devices Across Visual Modalities.
In: Leonardis, A and Ricci, E and Roth, S and Russakovsky, O and Sattler, T and Varol, G, (eds.)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).
(pp. 301-318).
Springer Nature: Cham, Switzerland.
Text: 2403.04908v3.pdf - Accepted Version (17MB). Access restricted to UCL open access staff until 1 November 2025.
Abstract
Recent advancements in Vision-Language (VL) models have sparked interest in their deployment on edge devices, yet challenges in handling diverse visual modalities, manual annotation, and computational constraints remain. We introduce EdgeVL, a novel framework that bridges this gap by seamlessly integrating dual-modality knowledge distillation and quantization-aware contrastive learning. This approach enables the adaptation of large VL models, like CLIP, for efficient use with both RGB and non-RGB images on resource-limited devices without the need for manual annotations. EdgeVL not only transfers visual language alignment capabilities to compact models but also maintains feature quality post-quantization, significantly enhancing open-vocabulary classification performance across various visual modalities. Our work represents the first systematic effort to adapt large VL models for edge deployment, showcasing up to 15.4% accuracy improvements on multiple datasets and up to 93-fold reduction in model size. Code available at https://github.com/ramdrop/edgevl.
Type: | Proceedings paper |
---|---|
Title: | Self-adapting Large Visual-Language Models to Edge Devices Across Visual Modalities |
Event: | Computer Vision – ECCV 2024 |
ISBN-13: | 9783031733895 |
DOI: | 10.1007/978-3-031-73390-1_18 |
Publisher version: | http://dx.doi.org/10.1007/978-3-031-73390-1_18 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10200841 |