Lightweight Neural App Control

Christianos, F; Papoudakis, G; Coste, T; Hao, J; Wang, J; Shao, K; (2025) Lightweight Neural App Control. In: 13th International Conference on Learning Representations ICLR 2025. ICLR: Singapore.
Green open access

9741_Lightweight_Neural_App_Co.pdf - Accepted Version


Abstract

This paper introduces a novel mobile phone control architecture, Lightweight Multi-modal App Control (LiMAC), for efficient interactions and control across various Android apps. LiMAC takes as input a textual goal and a sequence of past mobile observations, such as screenshots and corresponding UI trees, to generate precise actions. To address the computational constraints inherent to smartphones, we introduce a small Action Transformer (AcT) integrated with a fine-tuned vision-language model (VLM) for real-time decision-making and task execution. We evaluate LiMAC on two open-source mobile control datasets, demonstrating the superior performance of our small-form-factor approach against fine-tuned versions of open-source VLMs, such as Florence2 and Qwen2-VL. It also significantly outperforms prompt engineering baselines utilising closed-source foundation models like GPT-4o. More specifically, LiMAC increases the overall action accuracy by up to 19% compared to fine-tuned VLMs, and up to 42% compared to prompt-engineering baselines.

Type: Proceedings paper
Title: Lightweight Neural App Control
Event: ICLR 2025
Open access status: An open access version is available from UCL Discovery
Publisher version: https://openreview.net/forum?id=BL4WBIfyrz
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: vision-language model, multi-modal, android control, app agent
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10212511