Saxe, Jeremy Morris Schorr;
(2025)
Selected topics on missing data in randomised clinical trials.
Doctoral thesis (Ph.D), UCL (University College London).
Text
Saxe_10211152_Thesis.pdf Access restricted to UCL open access staff until 1 August 2026. Download (4MB) |
Abstract
Evidence-based medicine regards randomised clinical trials (RCTs) as the "gold standard" of experiments because they are considered the most straightforward way to estimate the causal treatment effect. Even so, missing data is ubiquitous in clinical trials and can have serious consequences, including bias, decreased efficiency, and loss of power. The primary concern is whether missing data introduces bias that leads to different inferences from those that would have been drawn had the missing data been observed. Analytical methods for missing data rely on untestable assumptions about the missing data, and some require complex analyses to generate less biased estimates of the treatment effect. Multiple imputation (MI) is a commonly used missing data method that generates multiple plausible values for the missing observations and propagates the uncertainty of those imputations into the main analysis to yield unbiased estimates of the treatment effect and accurate quantification of the uncertainty. In clinical trials, the design and analysis are often pre-registered. All analysis procedures and decisions must be precisely pre-specified to encourage reproducible studies, prevent p-hacking, and avoid allegations of p-hacking. This thesis aims to improve upon the existing guidance for pre-specifying which missing data methods to use, under which conditions and assumptions, and how to diagnose when imputation procedures are underperforming. Many decisions must be made when implementing any missing data method, and it is not yet clear how to handle them.

Project 1: We present a simulation study and case studies exploring variable selection in imputation models. We aim to inform the pre-specification of MI models by comparing methods for selecting auxiliary variables in MI models under randomised clinical trial conditions. We compare agnostic variable selection approaches to a principled order-of-inclusion method and posit that penalised regression is a suitable approach to auxiliary variable selection when there are no strongly held prior beliefs about which covariates are associated with missingness.

Project 2: We demonstrate, using a simulation study and a numerical example, the comparative performance of missing data methods in the context of a non-collapsible conditional estimand and a missing covariate. Non-collapsible estimands include odds ratios and hazard ratios. We compare MI methods to complete case analysis, mean imputation, and the missing indicator method under different missingness mechanisms.

Project 3: We introduce a sensitivity analysis approach using regression by composition for settings where there is expert-provided information about the difference between the outcome distributions of the observed and the unobserved data. We imagine a scenario where a trialist knows the true value of this quantity on either the risk ratio or the risk difference scale. We perform a simulation study to examine the regression by composition approach and compare it to methods that do not incorporate expert beliefs.

Project 4: We investigate how the number of iterations affects the stability of the distributions used to sample imputations in a multivariate imputation using chained equations (MICE) procedure. The motivation is to guide which quantities trialists should monitor to diagnose issues with the stability of the distributions used for MICE, particularly in RCTs. We aim to use iterative simulation diagnostic measures of the imputed values and of the within-imputation parameter estimates to identify and diagnose issues with the MICE procedure.
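As an illustration of how MI propagates imputation uncertainty into the main analysis, the sketch below pools per-imputation treatment-effect estimates using Rubin's rules, the standard combining rules for MI. It is a minimal sketch and not taken from the thesis: the abstract does not state which pooling procedure is used, the function name `pool_rubin` is hypothetical, and the input numbers are made up purely for illustration.

```python
import math


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules.

    estimates: point estimates of the treatment effect, one per imputed dataset
    variances: squared standard errors of those estimates, in the same order
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    # Pooled point estimate: the mean of the per-imputation estimates.
    q_bar = sum(estimates) / m
    # Within-imputation variance: the mean of the per-imputation variances.
    within = sum(variances) / m
    # Between-imputation variance: sample variance of the estimates.
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Total variance adds the between-imputation part, inflated by (1 + 1/m).
    total = within + (1 + 1 / m) * between
    return q_bar, total, math.sqrt(total)


# Illustrative (made-up) estimates from m = 3 imputed datasets.
pooled, total_var, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var, pooled_se)
```

The between-imputation term is what separates MI from single imputation: when the imputed datasets disagree, the pooled standard error grows to reflect the extra uncertainty caused by the missing values.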
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Selected topics on missing data in randomised clinical trials |
Language: | English |
Additional information: | Copyright © The Author 2025. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Inst of Clinical Trials and Methodology |
URI: | https://discovery.ucl.ac.uk/id/eprint/10211152 |