UCL Discovery

Implementation, evaluation and application of multiple imputation for missing data in longitudinal electronic health record research

Welch, CA; (2015) Implementation, evaluation and application of multiple imputation for missing data in longitudinal electronic health record research. Doctoral thesis, UCL (University College London). Green open access

Abstract

Longitudinal electronic health records are a valuable resource for research because they contain information on many patients over long follow-up periods. Missing data commonly occur in these records because the data were collected for clinical rather than research purposes. Analysing data with missing values can bias estimates and standard errors, resulting in invalid inferences. Multiple imputation is increasingly regarded as the standard method for handling missing data in medical research because of its practicality and flexibility under the assumption that the data are missing at random (MAR). Until now, few imputation approaches have been sufficiently flexible to account for the longitudinal and dynamic structure of electronic health records. The two-fold fully conditional specification (FCS) algorithm was proposed to impute missing values in longitudinal data, but this method has not yet been validated in the complex setting of longitudinal electronic health records. I propose to adapt, evaluate and implement the two-fold FCS algorithm to impute missing data from a large primary care database. To achieve this, I first investigate the extent and patterns of missing data in a longitudinal clinical database for health indicators associated with cardiovascular disease risk, to determine whether the MAR assumption is plausible. Additionally, I develop methods to identify and remove outliers, which can potentially bias imputations, from data with repeated measurements before imputation. Next, I adapt and develop the two-fold FCS multiple imputation algorithm to impute missing values in longitudinal clinical data for health indicators associated with cardiovascular disease risk, and I validate the algorithm through challenging simulation studies that assess bias and precision.
I develop a new software programme that implements this adapted version of the two-fold FCS algorithm to impute missing values in longitudinal data. Finally, I apply the two-fold FCS algorithm in The Health Improvement Network (THIN) primary care database to (i) model cardiovascular disease risk and (ii) understand factors associated with greater total cholesterol reduction in patients with type II diabetes.

Type: Thesis (Doctoral)
Title: Implementation, evaluation and application of multiple imputation for missing data in longitudinal electronic health record research
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Third party copyright material has been removed from ethesis.
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Epidemiology and Health
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Epidemiology and Health > Primary Care and Population Health
URI: https://discovery.ucl.ac.uk/id/eprint/1464072
Downloads since deposit: 860
