UCL Discovery

Linking Data for Mothers and Babies in De-Identified Electronic Health Data

Harron, K; Gilbert, R; Cromwell, D; van der Meulen, J (2016) Linking Data for Mothers and Babies in De-Identified Electronic Health Data. PLoS ONE, 11 (10), Article e0164667. 10.1371/journal.pone.0164667. Green open access

Harron-K_Linking Data for Mothers and Babies in De-Identified Electronic Health Data.pdf - Published Version


Abstract

Objective: Linkage of longitudinal administrative data for mothers and babies supports research and service evaluation in several populations around the world. We established a linked mother-baby cohort using pseudonymised, population-level data for England.

Design and Setting: Retrospective linkage study using electronic hospital records of mothers and babies admitted to NHS hospitals in England, captured in Hospital Episode Statistics between April 2001 and March 2013.

Results: Of 672,955 baby records in 2012/13, 280,470 (42%) linked deterministically to a maternal record using hospital, GP practice, maternal age, birthweight, gestation, birth order and sex. A further 380,164 (56%) records linked using probabilistic methods incorporating additional variables that could differ between mother/baby records (admission dates, ethnicity, 3/4-character postcode district) or that include missing values (delivery variables). The false-match rate was estimated at 0.15% using synthetic data. Data quality improved over time: for 2001/02, 91% of baby records were linked (holding the estimated false-match rate at 0.15%). The linked cohort was representative of national distributions of gender, gestation, birth weight and maternal age, and captured approximately 97% of births in England.

Conclusion: Probabilistic linkage of maternal and baby healthcare characteristics offers an efficient way to enrich maternity data, improve data quality, and create longitudinal cohorts for research and service evaluation. This approach could be extended to linkage of other datasets that have non-disclosive characteristics in common.
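The two-stage approach described in the abstract (exact agreement first, then a probabilistic score that tolerates missing or differing values) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says only "probabilistic methods", so the Fellegi-Sunter-style log-likelihood weights used here are an assumption, and the m- and u-probabilities, the threshold, the field names and the helper functions are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical agreement probabilities per field (illustrative only, not from the paper):
#   m = P(field agrees | records are a true mother-baby pair)
#   u = P(field agrees | records are not a pair)
MU = {
    "hospital": (0.99, 0.01),
    "gp_practice": (0.95, 0.001),
    "maternal_age": (0.98, 0.04),
    "birthweight": (0.97, 0.002),
    "gestation": (0.95, 0.10),
    "birth_order": (0.99, 0.90),
    "sex": (0.99, 0.50),
}


def deterministic_match(baby, mother, fields):
    """True only if every field is present on the baby record and identical on both."""
    return all(
        baby.get(f) is not None and baby.get(f) == mother.get(f)
        for f in fields
    )


def match_weight(baby, mother, mu=MU):
    """Sum of log2 likelihood ratios over fields; missing values contribute zero."""
    total = 0.0
    for field, (m, u) in mu.items():
        b, mo = baby.get(field), mother.get(field)
        if b is None or mo is None:
            continue  # missing value: no evidence for or against a match
        if b == mo:
            total += math.log2(m / u)  # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def link(baby, mothers, threshold=10.0):
    """Try exact linkage first, then fall back to the best-scoring candidate."""
    fields = list(MU)
    for mother in mothers:
        if deterministic_match(baby, mother, fields):
            return mother["id"], "deterministic"
    best = max(mothers, key=lambda mo: match_weight(baby, mo), default=None)
    if best is not None and match_weight(baby, best) >= threshold:
        return best["id"], "probabilistic"
    return None, "unlinked"


# Toy records (entirely synthetic).
m1 = {"id": "M1", "hospital": "H1", "gp_practice": "G1", "maternal_age": 30,
      "birthweight": 3400, "gestation": 39, "birth_order": 1, "sex": "F"}
m2 = {"id": "M2", "hospital": "H2", "gp_practice": "G2", "maternal_age": 25,
      "birthweight": 2900, "gestation": 37, "birth_order": 1, "sex": "M"}

baby_full = {k: v for k, v in m1.items() if k != "id"}
baby_missing = dict(baby_full, gestation=None)  # gestation not recorded
baby_other = {"hospital": "H9", "gp_practice": "G9", "maternal_age": 41,
              "birthweight": 2100, "gestation": 33, "birth_order": 2, "sex": "F"}
```

With these toy records, `link(baby_full, [m1, m2])` resolves at the deterministic stage, while `link(baby_missing, [m1, m2])` falls through to the probabilistic stage because the missing gestation value blocks exact agreement. In practice the threshold would be tuned against an estimated false-match rate, which the abstract reports was done using synthetic data.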

Type: Article
Title: Linking Data for Mothers and Babies in De-Identified Electronic Health Data
Open access status: An open access version is available from UCL Discovery
DOI: 10.1371/journal.pone.0164667
Publisher version: http://doi.org/10.1371/journal.pone.0164667
Language: English
Additional information: This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
Keywords: Science & Technology, Multidisciplinary Sciences, Science & Technology - Other Topics, RISK-FACTORS, BIRTH, PREGNANCY, COHORT, RECORD, PRETERM, INFANTS, LINKAGE
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health > Population, Policy and Practice Dept
URI: https://discovery.ucl.ac.uk/id/eprint/1527230