UCL Discovery

Challenges in Generalization in Open Domain Question Answering

Liu, Linqing; Lewis, Patrick; Riedel, Sebastian; Stenetorp, Pontus; (2022) Challenges in Generalization in Open Domain Question Answering. In: Findings of the Association for Computational Linguistics: NAACL 2022. (pp. 2014-2019). Association for Computational Linguistics: Online. Green open access

2022.findings-naacl.155.pdf - Published Version


Abstract

Recent work on Open Domain Question Answering has shown that there is a large discrepancy in model performance between novel test questions and those that largely overlap with training questions. However, it is unclear which aspects of novel questions make them challenging. Drawing upon studies on systematic generalization, we introduce and annotate questions according to three categories that measure different levels and kinds of generalization: training set overlap, compositional generalization (comp-gen), and novel-entity generalization (novel-entity). When evaluating six popular parametric and non-parametric models, we find that for the established Natural Questions and TriviaQA datasets, even the strongest model performance for comp-gen/novel-entity is 13.1/5.4% and 9.6/1.5% lower compared to that for the full test set – indicating the challenge posed by these types of questions. Furthermore, we show that whilst non-parametric models can handle questions containing novel entities relatively well, they struggle with those requiring compositional generalization. Lastly, we find that key question difficulty factors are: cascading errors from the retrieval component, frequency of question pattern, and frequency of the entity.
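The three-way split and the reported gaps to the full test set can be illustrated with a small sketch. The abstract does not describe the annotation procedure, so the rule below is an assumption made purely for illustration: a question flagged as overlapping with training is "overlap", a non-overlapping question mentioning an entity unseen in training is "novel-entity", and every other question is "comp-gen". The function names, field names (`entities`, `overlap`, `correct`), and toy data are all hypothetical.

```python
from collections import defaultdict


def categorize(question_entities, is_overlap, train_entities):
    """Toy labelling rule (an assumption, not the paper's procedure)."""
    if is_overlap:
        return "overlap"
    # Any entity never seen in training makes the question novel-entity.
    if any(e not in train_entities for e in question_entities):
        return "novel-entity"
    # Known entities in a question not seen in training: comp-gen.
    return "comp-gen"


def accuracy_by_category(examples, train_entities):
    """Accuracy per category, plus accuracy on the full test set."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for ex in examples:
        label = categorize(ex["entities"], ex["overlap"], train_entities)
        for key in (label, "full"):
            totals[key] += 1
            correct[key] += int(ex["correct"])
    return {key: correct[key] / totals[key] for key in totals}


def gaps_to_full(acc):
    """How far each category's accuracy falls below the full-set accuracy."""
    return {k: acc["full"] - v for k, v in acc.items() if k != "full"}


# Hypothetical toy data, not taken from Natural Questions or TriviaQA.
train_entities = {"Paris", "France"}
examples = [
    {"entities": ["Paris"], "overlap": True, "correct": True},
    {"entities": ["Paris", "France"], "overlap": False, "correct": False},
    {"entities": ["Lyon"], "overlap": False, "correct": True},
    {"entities": ["France"], "overlap": False, "correct": True},
]

acc = accuracy_by_category(examples, train_entities)
print(gaps_to_full(acc))
```

A positive gap means the category is harder than the test set as a whole, which is the sense in which the abstract reports comp-gen and novel-entity performance as lower than full-test-set performance.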

Type: Proceedings paper
Title: Challenges in Generalization in Open Domain Question Answering
Event: NAACL 2022 - 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Open access status: An open access version is available from UCL Discovery
Publisher version: https://aclanthology.org/2022.findings-naacl.155/
Language: English
Additional information: © 1963–2022 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License.
Keywords: cs.CL, cs.AI
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10154328