
Missing Data in Clinical Research: A Tutorial on Multiple Imputation

Austin, PC; White, IR; Lee, DS; Van Buuren, S; (2020) Missing Data in Clinical Research: A Tutorial on Multiple Imputation. Canadian Journal of Cardiology 10.1016/j.cjca.2020.11.010. (In press). Green open access

1-s2.0-S0828282X20311119-main.pdf - Accepted Version


Abstract

Missing data is a common occurrence in clinical research. Missing data occurs when the values of the variables of interest are not measured or recorded for all subjects in the sample. Common approaches to addressing the presence of missing data include complete-case analyses, in which subjects with missing data are excluded, and mean-value imputation, in which missing values are replaced with the mean value of that variable in those subjects for whom it is not missing. However, in many settings, these approaches can lead to biased estimates of statistics (e.g., of regression coefficients) and/or to confidence intervals that are artificially narrow. Multiple imputation (MI) is a popular approach for addressing the presence of missing data. With MI, multiple plausible values of a given variable are imputed or filled in for each subject who has missing data for that variable. This results in the creation of multiple completed datasets. Identical statistical analyses are conducted in each of these completed datasets and the results are pooled across the completed datasets. We provide an introduction to MI and discuss issues in its implementation, including developing the imputation model, how many imputed datasets to create, and addressing derived variables. We illustrate the application of MI through an analysis of data on patients hospitalized with heart failure. We focus on developing a model to estimate the probability of one-year mortality in the presence of missing data. Statistical software code for conducting multiple imputation in R, SAS, and Stata is provided.
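
The pooling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which pooling rule is used; the sketch assumes Rubin's rules, the standard choice for MI, and is written in Python rather than in the R, SAS, or Stata code that accompanies the article. The function name pool_rubin and the numbers in the usage example are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m completed datasets (Rubin's rules, assumed).

    estimates: the point estimate from each completed dataset
    variances: the squared standard error from each completed dataset
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching estimates and variances from >= 2 datasets")

    pooled = mean(estimates)           # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between

    return {
        "estimate": pooled,
        "within": within,
        "between": between,
        "total": total,
        "se": math.sqrt(total),
    }


# Hypothetical log-odds estimates of one regression coefficient
# from three completed datasets, with their squared standard errors.
result = pool_rubin([0.50, 0.54, 0.46], [0.010, 0.012, 0.011])
print(result)
```

The pooled standard error is larger than the standard error from any single completed dataset would suggest, because the between-imputation term carries the uncertainty due to the missing values. Mean-value imputation ignores that term, which is why it tends to produce confidence intervals that are artificially narrow.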

Type: Article
Title: Missing Data in Clinical Research: A Tutorial on Multiple Imputation
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1016/j.cjca.2020.11.010
Publisher version: https://doi.org/10.1016/j.cjca.2020.11.010
Language: English
Additional information: This is an Open Access article published under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence (https://creativecommons.org/licenses/by/4.0/).
Keywords: Missing data, multiple imputation, tutorial
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Inst of Clinical Trials and Methodology
URI: https://discovery.ucl.ac.uk/id/eprint/10117689