UCL Discovery

Evaluating the Cranfield Paradigm for Conversational Search Systems

Fu, Xiao; Yilmaz, Emine; Lipani, Aldo; (2022) Evaluating the Cranfield Paradigm for Conversational Search Systems. In: Proceedings of the 2022 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR ’22). Association for Computing Machinery (ACM): New York, NY, USA. (In press). Green open access

ICTIR2022.pdf - Accepted Version (457kB)

Abstract

Due to the sequential and interactive nature of conversations, applying traditional Information Retrieval (IR) evaluation methods such as the Cranfield paradigm requires stronger assumptions. When building a test collection for ad hoc search, it is fair to assume that the relevance judgments provided by an annotator correlate well with the relevance judgments perceived by an actual user of the search engine. When building a test collection for conversational search, however, we do not know whether this assumption holds. In this paper, we perform a crowdsourcing study to evaluate the applicability of the Cranfield paradigm to conversational search systems. Our main aim is to measure the agreement, in terms of user satisfaction, between users performing a search task with a conversational search system (i.e., directly assessing the system) and users observing the search task being performed (i.e., indirectly assessing the system). The results of this study are important because they underpin and guide 1) the development of more realistic user models and simulators, and 2) the design of more reliable and robust evaluation measures for conversational search systems. Our results show that there is fair agreement between direct and indirect assessments in terms of user satisfaction, and that these two kinds of assessments share similar conversational patterns. Indeed, by collecting relevance assessments for each system utterance, we tested several conversational patterns that show a promising ability to predict user satisfaction.

Type: Proceedings paper
Title: Evaluating the Cranfield Paradigm for Conversational Search Systems
Event: 2022 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR ’22)
ISBN-13: 978-1-4503-9412-3
Open access status: An open access version is available from UCL Discovery
DOI: 10.1145/3539813.3545126
Publisher version: https://www.ictir2022.org/
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Dialogue systems, evaluation, relevance, satisfaction
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Civil, Environ and Geomatic Eng
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10151173
