Pham, TM; (2018) Exploring strategies for incorporating population-level external information in multiple imputation of missing data. Doctoral thesis (Ph.D), UCL (University College London).
Abstract
Multiple imputation (MI) is increasingly used for handling missing data in medical research. The standard implementation of MI assumes that data are missing at random (MAR). However, under missing not at random (MNAR) mechanisms, standard MI might not be satisfactory. When there are external data sources providing population-level information about the incomplete variables, it is desirable to utilise such information in MI. This thesis aims to explore how knowledge about the incomplete covariate's population marginal distribution from an external dataset can be used to improve standard MI under MNAR mechanisms. Two univariate MI methods are proposed for an incomplete binary/categorical covariate to anchor inference to the population: weighted MI and calibrated-δ adjustment MI. Chapter 3 demonstrates how, in weighted MI, the incomplete covariate's population distribution can be incorporated as probability weights in the imputation process to closely match the post-imputation distribution to the population level. Results from analytic and simulation studies of a 2x2 contingency table show that weighted MI can produce more accurate inferences under two general MNAR mechanisms. Weighted MI is also integrated into the multivariate imputation by chained equations (MICE) algorithm for imputing several incomplete covariates, accounting for their population marginal distributions from external data. Chapter 4 develops and evaluates calibrated-δ adjustment MI, which incorporates the incomplete covariate's population distribution as a δ adjustment in the imputation model’s intercept. In a 2x2 contingency table, it is shown analytically and via simulation that appropriately adjusting the imputation model's intercept fully corrects bias when the incomplete covariate is MNAR dependent on its values and the (complete) outcome. An adaptation of the method in the MICE algorithm for multivariate imputation is also explored. 
Chapter 5 investigates another univariate missing data setting, this time with a continuous outcome. Under the above MNAR mechanism, a second sensitivity parameter, for the covariate-outcome association, appears in the imputation model, so the calibrated-δ intercept adjustment alone is insufficient. The sensitivity analysis then involves eliciting values of the second sensitivity parameter and deriving the corresponding calibrated-δ adjustment in the intercept. Chapter 6 presents two case studies using electronic health records to illustrate the application of the proposed population-calibrated MI methods.
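
The calibrated-δ idea described for the 2x2 setting can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`implied_prevalence`, `calibrate_delta`), the counts, and the population prevalence of 0.45 are all hypothetical, and the numerical bisection is an assumption made here for simplicity (the thesis derives its adjustment analytically and may define δ differently). The sketch assumes a binary covariate X with missing values, a complete binary outcome Y, and a logistic imputation model fitted to the records with X observed. It then searches for the intercept shift δ, applied to records with X missing, that makes the expected post-imputation proportion with X = 1 equal to the externally known population proportion.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def implied_prevalence(delta, obs_counts, n_missing):
    """Expected post-imputation proportion with X = 1 for a given delta.

    obs_counts[y] = (count with X=0, count with X=1) among records with X observed.
    n_missing[y]  = number of records with X missing at outcome level y.
    All observed cell counts are assumed to be positive.
    """
    n_total = sum(n0 + n1 for n0, n1 in obs_counts.values()) + sum(n_missing.values())
    n_obs_x1 = sum(n1 for _, n1 in obs_counts.values())
    expected_imputed_x1 = 0.0
    for y, (n0, n1) in obs_counts.items():
        # Saturated logistic fit on the observed data: logit P(X=1 | Y=y) = log(n1 / n0).
        base = logit(n1 / (n0 + n1))
        # delta shifts the intercept only for records whose X is missing.
        expected_imputed_x1 += n_missing[y] * expit(base + delta)
    return (n_obs_x1 + expected_imputed_x1) / n_total


def calibrate_delta(obs_counts, n_missing, p_pop, lo=-20.0, hi=20.0, tol=1e-10):
    """Find delta by bisection so that the implied prevalence equals p_pop."""
    f_lo = implied_prevalence(lo, obs_counts, n_missing)
    f_hi = implied_prevalence(hi, obs_counts, n_missing)
    if not f_lo <= p_pop <= f_hi:
        raise ValueError("p_pop is not attainable given the observed counts")
    # implied_prevalence is increasing in delta, so bisection converges.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if implied_prevalence(mid, obs_counts, n_missing) < p_pop:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical counts, for illustration only.
obs_counts = {0: (300, 100), 1: (150, 150)}
n_missing = {0: 100, 1: 200}
delta = calibrate_delta(obs_counts, n_missing, p_pop=0.45)
print(f"calibrated delta = {delta:.4f}")
```

With δ = 0 the sketch reduces to imputation under MAR from the observed-data model; a nonzero δ moves the imputed values toward the population margin. In a full MI procedure, imputations would be drawn from the shifted model rather than summarised by their expectation, and uncertainty in the fitted parameters would also need to be propagated.
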
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Exploring strategies for incorporating population-level external information in multiple imputation of missing data |
Event: | UCL |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
UCL classification: | UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Inst of Clinical Trials and Methodology > MRC Clinical Trials Unit at UCL |
URI: | https://discovery.ucl.ac.uk/id/eprint/10044801 |