UCL Discovery

Automatic ICD-10 classification of diseases from Dutch discharge letters

Bagheri, A; Sammani, A; Van der Heijden, PGM; Asselbergs, FW; Oberski, DL; (2020) Automatic ICD-10 classification of diseases from Dutch discharge letters. In: Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: C2C. (pp. 281-289). SciTePress: Valletta, Malta. Green open access

C2C_2020_2.pdf - Published Version


Abstract

The international classification of diseases (ICD) is a widely used tool to describe patient diagnoses. At University Medical Center Utrecht (UMCU), for example, trained medical coders translate information from hospital discharge letters into ICD-10 codes for research and national disease epidemiology statistics, at considerable cost. To mitigate these costs, automatic ICD coding from discharge letters would be useful. However, this task has proven challenging in practice: it is a multi-label task with a large number of very sparse categories, presented in a hierarchical structure. Moreover, existing ICD systems have been benchmarked only on relatively easier versions of this task, such as single-label performance and performance on the higher “chapter” level of the ICD hierarchy, which contains fewer categories. In this study, we benchmark the state-of-the-art ICD classification systems and two baseline systems on a large dataset constructed from Dutch cardiology discharge letters at UMCU hospital. Performance of all systems is evaluated for both the easier chapter-level ICD codes and single-label version of the task found in the literature, as well as for the lower-level ICD hierarchy and multi-label task that is needed in practice. We find that state-of-the-art methods outperform the baseline for the single-label version of the task only. For the multi-label task, the baselines are not defeated by any state-of-the-art system, with the exception of HA-GRU, which does perform best in the most difficult task on accuracy. We conclude that practical performance may have been somewhat overstated in the literature, although deep learning techniques are sufficiently good to complement, though not replace, human ICD coding in our application.

Type: Proceedings paper
Title: Automatic ICD-10 classification of diseases from Dutch discharge letters
Event: 13th International Joint Conference on Biomedical Engineering Systems and Technologies
ISBN-13: 978-989-758-398-8
Open access status: An open access version is available from UCL Discovery
DOI: 10.5220/0009372602810289
Publisher version: https://doi.org/10.5220/0009372602810289
Language: English
Additional information: This is an Open Access paper published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) Licence (https://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: Automated ICD Coding, Multi-label Classification, Clinical Text Mining, Dutch Discharge Letters
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics
URI: https://discovery.ucl.ac.uk/id/eprint/10098370
