UCL Discovery

The analysis of record-linked data using multiple imputation with data value priors

Goldstein, H; Harron, K; Wade, A; (2012) The analysis of record-linked data using multiple imputation with data value priors. Statistics in Medicine, 31 (28) 3481-3493. 10.1002/sim.5508. Green open access

This is the latest version of this eprint.

The analysis of record linked data SIM final revision May 2012 full document.pdf
Available under License : See the attached licence file.


Abstract

Probabilistic record linkage techniques assign match weights to one or more potential matches for those individual records that cannot be assigned 'unequivocal matches' across data files. Existing methods select the single record having the maximum weight provided that this weight is higher than an assigned threshold. We argue that this procedure, which ignores all information from matches with lower weights and for some individuals assigns no match, is inefficient and may also lead to biases in subsequent analysis of the linked data. We propose that a multiple imputation framework be utilised for data that belong to records that cannot be matched unequivocally. In this way, the information from all potential matches is transferred through to the analysis stage. This procedure allows for the propagation of matching uncertainty through a full modelling process that preserves the data structure. For purposes of statistical modelling, results from a simulation example suggest that a full probabilistic record linkage is unnecessary and that standard multiple imputation will provide unbiased and efficient parameter estimates.
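The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the paper combines match information with a full imputation model, whereas the second function below simply resamples candidate values in proportion to their weights. The candidate tuples, the function names, and the assumption that match weights have already been rescaled to non-negative probabilities are all hypothetical and introduced only for illustration.

```python
import random
from typing import List, Optional, Tuple

# Hypothetical structure: (candidate_id, match_weight, value_carried_by_candidate).
# Weights are assumed to be non-negative and already rescaled to probabilities.
Candidate = Tuple[str, float, float]


def max_weight_link(candidates: List[Candidate], threshold: float) -> Optional[Candidate]:
    """Threshold rule described in the abstract: keep the top-weighted
    candidate only if its weight exceeds the threshold, otherwise no match."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c[1])
    return best if best[1] > threshold else None


def draw_imputations(candidates: List[Candidate], n_draws: int,
                     rng: random.Random) -> List[float]:
    """Simplified stand-in for the imputation idea: draw values from all
    candidates in proportion to their weights, so lower-weight candidates
    still contribute to the analysis."""
    weights = [c[1] for c in candidates]
    values = [c[2] for c in candidates]
    return rng.choices(values, weights=weights, k=n_draws)


if __name__ == "__main__":
    example = [("a", 0.5, 10.0), ("b", 0.3, 12.0), ("c", 0.2, 20.0)]
    # With a threshold of 0.6 the maximum-weight rule assigns no match at all.
    print(max_weight_link(example, threshold=0.6))
    # The sampling approach still uses the information in every candidate.
    print(draw_imputations(example, n_draws=5, rng=random.Random(0)))
```

Under the threshold rule the record above would be left unlinked, while the sampling approach yields several completed values whose spread reflects the matching uncertainty.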

Type: Article
Title: The analysis of record-linked data using multiple imputation with data value priors
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1002/sim.5508
Publisher version: http://dx.doi.org/10.1002/sim.5508
Language: English
Additional information: This is the peer reviewed version of the following article: Goldstein, H; Harron, K; Wade, A; (2012) The analysis of record-linked data using multiple imputation with data value priors. Statistics in Medicine, 31 (28) 3481-3493, which has been published in final form at http://dx.doi.org/10.1002/sim.5508. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.
Keywords: Bias (Epidemiology); Computer Simulation; Data Collection; Data Interpretation, Statistical; Humans; Markov Chains; Models, Statistical; Monte Carlo Method
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > UCL GOS Institute of Child Health > Population, Policy and Practice Dept
URI: https://discovery.ucl.ac.uk/id/eprint/1467145


