UCL Discovery

Probabilistic linkage to enhance deterministic algorithms and reduce data linkage errors in hospital administrative data

Hagger-Johnson, G; Harron, K; Goldstein, H; Aldridge, R; Gilbert, R; (2017) Probabilistic linkage to enhance deterministic algorithms and reduce data linkage errors in hospital administrative data. Journal of Innovation in Health Informatics, 24 (2) pp. 234-246. 10.14236/jhi.v24i2.891. Green open access

Full text: Hagger-Johnson_Probabilistic_linkage_enhance.pdf - Published Version (659kB)

Abstract

BACKGROUND: The pseudonymisation algorithm used to link episodes of care belonging to the same patient in England (HESID) has never been formally evaluated to determine the extent of data linkage error.

OBJECTIVE: To quantify improvements in linkage accuracy from adding probabilistic linkage to existing deterministic HESID algorithms.

METHODS: We analysed inpatient admissions to NHS hospitals in England (Hospital Episode Statistics, HES) over 17 years (1998 to 2015) for a sample of patients (born on the 13th or 28th of a month in 1992, 1998, 2005 or 2012). We compared the existing deterministic algorithm with one that included an additional probabilistic step, against a reference standard created using enhanced probabilistic matching with additional clinical and demographic information. Missed and false matches were quantified, and the impact on estimates of hospital readmission within one year was determined.

RESULTS: HESID produced a high missed match rate that improved over time (from 8.6% in 1998 to 0.4% in 2015). Missed matches were more common for ethnic minorities, those living in areas of high socio-economic deprivation, foreign patients and those with 'no fixed abode'. Estimates of the readmission rate were biased for several patient groups owing to missed matches; adding the probabilistic step reduced this bias for nearly all groups.

CONCLUSION: Probabilistic linkage of HES reduced missed matches and bias in estimated readmission rates, with clear implications for commissioning, service evaluation and performance monitoring of hospitals. The existing algorithm should be modified to address data linkage error, and a retrospective update of the existing data would correct the linkage errors already present and their implications.
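
The comparison described in the abstract (a deterministic rule alone versus the same rule followed by a probabilistic step, each judged against a reference standard) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the field names, the exact-match keys in `DETERMINISTIC_KEYS`, the m- and u-probabilities in `FIELD_PROBS`, and the `THRESHOLD` value. The probabilistic step uses Fellegi-Sunter style log-likelihood-ratio weights, a common approach, which may differ from the authors' actual implementation.

```python
import math

# Hypothetical agreement probabilities per field (illustrative values only):
#   m = P(field agrees | records belong to the same patient)
#   u = P(field agrees | records belong to different patients)
FIELD_PROBS = {
    "date_of_birth": (0.98, 0.003),
    "sex": (0.99, 0.5),
    "postcode": (0.85, 0.001),
}

# Assumed exact-match rule; the real HESID rules are not described here.
DETERMINISTIC_KEYS = ("nhs_number", "date_of_birth", "sex")

# Assumed cut-off on the summed match weight.
THRESHOLD = 8.0


def deterministic_match(a, b):
    """Link only if every key is present in both records and agrees exactly."""
    return all(
        a.get(k) is not None and a.get(k) == b.get(k) for k in DETERMINISTIC_KEYS
    )


def match_weight(a, b):
    """Sum of log2 likelihood ratios over fields present in both records."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value contributes no evidence either way
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def combined_match(a, b):
    """Deterministic rule first; probabilistic step for pairs it does not link."""
    return deterministic_match(a, b) or match_weight(a, b) >= THRESHOLD


def missed_match_rate(reference_pairs, linker):
    """Share of reference-standard matches that the linker fails to link."""
    missed = sum(1 for a, b in reference_pairs if not linker(a, b))
    return missed / len(reference_pairs)
```

For example, two episodes for the same patient where one record lacks an identifier fail the exact-match rule but can still be linked on the remaining fields, so `missed_match_rate(pairs, deterministic_match)` would exceed `missed_match_rate(pairs, combined_match)` on such data.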

Type: Article
Title: Probabilistic linkage to enhance deterministic algorithms and reduce data linkage errors in hospital administrative data
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.14236/jhi.v24i2.891
Publisher version: http://dx.doi.org/10.14236/jhi.v24i2.891
Language: English
Additional information: Copyright © 2017 The Author(s). Published by BCS, The Chartered Institute for IT under a Creative Commons license (http://creativecommons.org/licenses/by/4.0/)
Keywords: Probabilistic record linkage; Deterministic record linkage; Hospital discharge; Evaluation
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health > Population, Policy and Practice Dept
URI: https://discovery.ucl.ac.uk/id/eprint/1567555