UCL Discovery

E-NER - An Annotated Named Entity Recognition Corpus of Legal Text

Au, TWT; Lampos, V; Cox, IJ (2022) E-NER - An Annotated Named Entity Recognition Corpus of Legal Text. In: Proceedings of the Natural Legal Language Processing Workshop 2022. (pp. 246-255). Association for Computational Linguistics: Abu Dhabi, United Arab Emirates (Hybrid). Green open access.

2022.nllp-1.22.pdf - Published Version

Abstract

Identifying named entities, such as a person, location or organization, in documents can highlight key information to readers. Training Named Entity Recognition (NER) models requires an annotated data set, the creation of which can be a time-consuming, labour-intensive task. Nevertheless, there are publicly available NER data sets for general English. Recently, there has been interest in developing NER for legal text. However, prior work and the experimental results reported here indicate that there is a significant degradation in performance when NER methods trained on a general English data set are applied to legal text. We describe a publicly available legal NER data set, called E-NER, based on legal company filings available from the US Securities and Exchange Commission's EDGAR data set. Training a number of different NER algorithms on the general English CoNLL-2003 corpus but testing on our test collection confirmed significant degradations in accuracy, as measured by the F1-score, of between 29.4% and 60.4%, compared to training and testing on the E-NER collection.

Type: Proceedings paper
Title: E-NER - An Annotated Named Entity Recognition Corpus of Legal Text
Event: NLLP 2022 - Natural Legal Language Processing Workshop 2022
ISBN-13: 9781959429180
Open access status: An open access version is available from UCL Discovery
Publisher version: https://aclanthology.org/2022.nllp-1.22/
Language: English
Additional information: Copyright © 1963–2023 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10169745