Kocijan, Vid;
Camburu, Oana-Maria;
Cretu, Ana-Maria;
Yordanov, Yordan;
Blunsom, Phil;
Lukasiewicz, Thomas;
(2019)
WikiCREM: A Large Unsupervised Corpus for Coreference Resolution.
In:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
(pp. 4302-4312).
Association for Computational Linguistics: Hong Kong, China.
Text: wiki.pdf - Published Version (316kB)
Abstract
Pronoun resolution is a major area of natural language understanding. However, large-scale training sets are still scarce, since manually labelling data is costly. In this work, we introduce WikiCREM (Wikipedia CoREferences Masked), a large-scale, yet accurate dataset of pronoun disambiguation instances. We use a language-model-based approach for pronoun resolution in combination with our WikiCREM dataset. We compare a series of models on a collection of diverse and challenging coreference resolution problems, where we match or outperform previous state-of-the-art approaches on 6 out of 7 datasets, such as GAP, DPR, WNLI, PDP, WinoBias, and WinoGender. We release our model to be used off-the-shelf for solving pronoun disambiguation.
Type: | Proceedings paper |
---|---|
Title: | WikiCREM: A Large Unsupervised Corpus for Coreference Resolution |
Event: | Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) |
Dates: | Nov 2019 - Nov 2019 |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.18653/v1/d19-1439 |
Publisher version: | http://dx.doi.org/10.18653/v1/d19-1439 |
Language: | English |
Additional information: | ACL materials are Copyright © 1963–2024 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License. |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10184087 |


