UCL Discovery

Data consistency in the English Hospital Episodes Statistics database

Hardy, Flavien; Heyl, Johannes; Tucker, Katie; Hopper, Adrian; Marchã, Maria J; Briggs, Tim WR; Yates, Jeremy; ... Gray, William K (2022) Data consistency in the English Hospital Episodes Statistics database. BMJ Health & Care Informatics, 29 (1), Article e100633. 10.1136/bmjhci-2022-100633. Green open access

Text: e100633.full.pdf - Published Version (4MB)

Abstract

BACKGROUND: To gain maximum insight from large administrative healthcare datasets it is important to understand their data quality. Although a gold standard against which to assess criterion validity rarely exists for such datasets, internal consistency can be evaluated. We aimed to identify inconsistencies in the recording of mandatory International Statistical Classification of Diseases and Related Health Problems, tenth revision (ICD-10) codes within the Hospital Episodes Statistics dataset in England.

METHODS: Three exemplar medical conditions where recording is mandatory once diagnosed were chosen: autism, type II diabetes mellitus and Parkinson's disease dementia. We identified the first occurrence of the condition's ICD-10 code for each patient during the period April 2013 to March 2021 and checked whether the code was recorded in subsequent hospital spells. We designed and trained random forest classifiers to identify variables strongly associated with recording inconsistencies.

RESULTS: For autism, diabetes and Parkinson's disease dementia respectively, 43.7%, 8.6% and 31.2% of subsequent spells had inconsistencies. Coding inconsistencies were highly correlated with non-coding of an underlying condition, a change in hospital trust and greater time between the spell with the first coded diagnosis and the subsequent spell. For patients with diabetes or Parkinson's disease dementia, code recording for spells without an overnight stay was found to have a higher rate of inconsistencies.

CONCLUSIONS: Data inconsistencies are relatively common for the three conditions considered. Where these mandatory diagnoses are not recorded in administrative datasets, and where clinical decisions are made based on such data, there is potential for this to impact patient care.
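The consistency check described in the methods can be illustrated with a minimal sketch. It is not the authors' code: the function name `inconsistency_rate`, the tuple layout for a spell, and the sample records are all hypothetical, and matching on the ICD-10 prefix `E11` (type 2 diabetes mellitus) is an assumption about how a condition might be identified. The sketch orders each patient's spells by date, finds the first spell carrying the code, and counts later spells in which the code is missing.

```python
from collections import defaultdict
from datetime import date


def inconsistency_rate(spells, code_prefix):
    """Share of subsequent spells that omit a previously recorded code.

    spells: iterable of (patient_id, admission_date, set of ICD-10 codes).
    This record layout is hypothetical, not the HES schema.
    """
    by_patient = defaultdict(list)
    for patient_id, admitted, codes in spells:
        by_patient[patient_id].append((admitted, codes))

    subsequent = 0    # spells after the first coded diagnosis
    inconsistent = 0  # of those, spells where the code is absent
    for history in by_patient.values():
        history.sort(key=lambda spell: spell[0])  # chronological order
        first_seen = False
        for _, codes in history:
            has_code = any(c.startswith(code_prefix) for c in codes)
            if first_seen:
                subsequent += 1
                if not has_code:
                    inconsistent += 1
            elif has_code:
                first_seen = True  # first coded spell; not counted itself
    return inconsistent / subsequent if subsequent else 0.0


# Made-up records for illustration only.
example_spells = [
    ("p1", date(2014, 1, 5), {"E11.9", "I10"}),  # first coded spell
    ("p1", date(2015, 3, 2), {"I10"}),           # code missing: inconsistent
    ("p1", date(2016, 7, 9), {"E11.9"}),         # code present: consistent
    ("p2", date(2013, 6, 1), {"J18.9"}),         # before first code: ignored
    ("p2", date(2014, 2, 1), {"E11.9"}),         # first coded spell
    ("p2", date(2014, 9, 1), {"E11.9"}),         # code present: consistent
]

rate = inconsistency_rate(example_spells, "E11")
print(f"{rate:.1%} of subsequent spells are inconsistent")
```

Spells before the first coded diagnosis are excluded from the denominator, since the condition may not yet have been diagnosed. The random forest step is not sketched here because the abstract does not specify its features or settings.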

Type: Article
Title: Data consistency in the English Hospital Episodes Statistics database
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1136/bmjhci-2022-100633
Publisher version: https://doi.org/10.1136/bmjhci-2022-100633
Language: English
Additional information: © Author(s) (or their employer[s]) 2022. This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license (https://creativecommons.org/licenses/by/4.0/).
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10158428
