
UCLWI at the NTCIR-18 AEOLLM Task: A Low-Cost Comparison of RAGs

Fu, Xiao; Bedi, Navdeep Singh; Kando, Noriko; Crestani, Fabio; Lipani, Aldo; (2025) UCLWI at the NTCIR-18 AEOLLM Task: A Low-Cost Comparison of RAGs. In: Kato, Makoto P and Kando, Noriko and Clarke, Charles LA and Liu, Yiqun, (eds.) NTCIR. National Institute of Informatics (NII). Green open access.

02-NTCIR18-AEOLLM-FuX.pdf - Published Version

Abstract

The UCLWI team participated in the Automatic Evaluation of LLMs (AEOLLM) task at NTCIR-18 [2]. We propose an efficient evaluation pipeline for Retrieval-Augmented Generation (RAG) systems tailored for low-resource settings. Our method uses ensemble similarity measures combined with a logistic regression classifier to assess answer quality from multiple system outputs using only the available queries and replies. Experiments across diverse tasks demonstrate competitive accuracy and reasonable correlation with ground truth rankings, establishing our approach as a reliable metric.
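
The general idea in the abstract, an ensemble of similarity scores fed to a logistic regression classifier, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the similarity measures, the features, or the training setup, so everything below is an assumption made for illustration. The two stand-in measures (token Jaccard overlap and a character-level ratio from Python's `difflib`), the helper names `features`, `train_logreg`, and `predict_proba`, and the toy data are all hypothetical.

```python
import math
from difflib import SequenceMatcher


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two strings (stand-in similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def char_ratio(a: str, b: str) -> float:
    """Character-level similarity from difflib (second stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def features(query: str, reply: str, peer_replies: list) -> list:
    """Hypothetical feature vector: similarity of a reply to the query and,
    on average, to the replies of the other systems for the same query."""
    peers = peer_replies or [""]
    return [
        jaccard(query, reply),
        char_ratio(query, reply),
        sum(jaccard(reply, p) for p in peers) / len(peers),
        sum(char_ratio(reply, p) for p in peers) / len(peers),
    ]


def sigmoid(z: float) -> float:
    # Clamp the input to avoid overflow in math.exp for extreme values.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=500):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            g = p - yi  # gradient of the loss with respect to the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


def predict_proba(w, b, x) -> float:
    """Estimated probability that the reply is of acceptable quality."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def demo():
    # Toy, made-up examples: (query, reply, peer replies, label).
    data = [
        ("capital of France", "The capital of France is Paris",
         ["Paris is the capital of France"], 1),
        ("capital of France", "I like turtles",
         ["Paris is the capital of France"], 0),
        ("boiling point of water", "Water boils at 100 degrees Celsius",
         ["The boiling point of water is 100 C"], 1),
        ("boiling point of water", "Bananas are yellow",
         ["The boiling point of water is 100 C"], 0),
    ]
    X = [features(q, r, peers) for q, r, peers, _ in data]
    y = [label for *_, label in data]
    w, b = train_logreg(X, y)
    for (q, r, _, _), x in zip(data, X):
        print(f"{predict_proba(w, b, x):.2f}  {r!r}")


if __name__ == "__main__":
    demo()
```

Ranking systems by their mean predicted probability would then give a cheap ordering to compare against a ground truth ranking. A real setup would need labelled training data and stronger similarity measures than these lexical stand-ins.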

Type: Proceedings paper
Title: UCLWI at the NTCIR-18 AEOLLM Task: A Low-Cost Comparison of RAGs
Event: NTCIR-18: 2025 NTCIR Conference on Evaluation of Information Access Technologies
Open access status: An open access version is available from UCL Discovery
DOI: 10.20736/0002002026
Publisher version: https://doi.org/10.20736/0002002026
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Information Retrieval, Evaluation, RAG system, Ensemble
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Civil, Environ and Geomatic Eng
URI: https://discovery.ucl.ac.uk/id/eprint/10212525