UCL Discovery

Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework

Zu, Weiqin; Song, Wenbin; Chen, Ruiqing; Guo, Ze; Sun, Fanglei; Tian, Zheng; Pan, Wei; (2024) Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2024. (pp. 1019-1025). IEEE. Green open access

2311.08244v2.pdf - Accepted Version


Abstract

The socially-aware navigation system has evolved to adeptly avoid various obstacles while performing multiple tasks, such as point-to-point navigation, human-following, and human-guiding. However, a prominent gap persists: in Human-Robot Interaction (HRI), the procedure of communicating commands to robots demands intricate mathematical formulations. Furthermore, the transition between tasks lacks the intuitive control and user-centric interactivity that one would desire. In this work, we propose an LLM-driven interactive multimodal multitask robot navigation framework, termed LIM2N, to address this new challenge in the navigation field. We achieve this by first introducing a multimodal interaction framework where language and hand-drawn inputs can serve as navigation constraints and control objectives. Next, a reinforcement learning agent is built to handle multiple tasks with the received information. Crucially, LIM2N creates smooth cooperation among the reasoning of multimodal input, multitask planning, and adaptation and processing of the intelligent sensing modules in the complicated system. Detailed experiments are conducted in both simulation and the real world, demonstrating that LIM2N has a solid understanding of user needs, alongside an enhanced interactive experience.

Type: Proceedings paper
Title: Language and Sketching: An LLM-driven Interactive Multimodal Multitask Robot Navigation Framework
Event: IEEE International Conference on Robotics and Automation (ICRA)
Location: Yokohama, Japan
Dates: 13th-17th May 2024
ISBN-13: 979-8-3503-8458-1
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/ICRA57147.2024.10611462
Publisher version: https://doi.org/10.1109/icra57147.2024.10611462
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10206799