UCL Discovery

Rethinking the Potential of Multimodality in Collaborative Problem Solving Diagnosis with Large Language Models

Wong, K; Wu, B; Bulathwela, S; Cukurova, M; (2025) Rethinking the Potential of Multimodality in Collaborative Problem Solving Diagnosis with Large Language Models. In: Lecture Notes in Computer Science. (pp. 18-32). Springer Nature

AIED2025_Cukurova[4938].pdf - Accepted Version
Access restricted to UCL open access staff until 16 July 2026.


Abstract

Detecting collaborative and problem-solving behaviours from digital traces to interpret students’ collaborative problem solving (CPS) competency is a long-term goal in the Artificial Intelligence in Education field. Although multimodal data and advanced models are argued to have the potential to detect complex CPS behaviours, empirical evidence on their value remains limited with some contrasting evidence. In this study, we investigated the potential of multimodal data to improve model performance in diagnosing secondary school students’ CPS subskills in authentic educational settings. In particular, text and acoustic embeddings from audio data were used in a multimodal classification model for CPS diagnosis. Both unimodal and multimodal transformer-based models outperformed traditional models in detecting CPS classes. Although the inclusion of multimodality did not improve the performance of traditional unimodal models, its integration into transformer-based models demonstrated improved performance for diagnosing social-cognitive CPS classes compared to unimodal transformer-based models. Based on the results, the paper argues that the value of multimodality and the selection of a particular modelling technique are limited to certain types of CPS indicators, affected by the complexity of the labels, and dependent on the composition of indicators in the dataset. We conclude by discussing the required nuance when considering the value of large language models and multimodality in automated CPS diagnosis, highlighting the need for human-AI complementarity, and proposing the exploration of relevant model architectures and techniques to improve CPS diagnosis.

Type: Proceedings paper
Title: Rethinking the Potential of Multimodality in Collaborative Problem Solving Diagnosis with Large Language Models
ISBN-13: 9783031984167
DOI: 10.1007/978-3-031-98417-4_2
Publisher version: https://doi.org/10.1007/978-3-031-98417-4_2
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Education
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > School of Education > UCL Institute of Education
UCL > Provost and Vice Provost Offices > School of Education > UCL Institute of Education > IOE - Culture, Communication and Media
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10212253