Dashti, S Ghazaleh; Lee, Katherine J; Simpson, Julie A; White, Ian R; Carlin, John B; Moreno-Betancur, Margarita (2024) Handling missing data when estimating causal effects with targeted maximum likelihood estimation. American Journal of Epidemiology, Article kwae012. 10.1093/aje/kwae012. (In press).
Abstract
Targeted maximum likelihood estimation (TMLE) is increasingly used for doubly robust causal inference, but how missing data should be handled when using TMLE with data-adaptive approaches is unclear. Based on data (1992-1998) from the Victorian Adolescent Health Cohort Study, we conducted a simulation study to evaluate 8 missing-data methods in this context: complete-case analysis, extended TMLE incorporating an outcome-missingness model, the missing covariate missing indicator method, and 5 multiple imputation (MI) approaches using parametric or machine-learning models. We considered 6 scenarios that varied in terms of exposure/outcome generation models (presence of confounder-confounder interactions) and missingness mechanisms (whether outcome influenced missingness in other variables and presence of interaction/nonlinear terms in missingness models). Complete-case analysis and extended TMLE had small biases when outcome did not influence missingness in other variables. Parametric MI without interactions had large bias when exposure/outcome generation models included interactions. Parametric MI including interactions performed best in bias and variance reduction across all settings, except when missingness models included a nonlinear term. When choosing a method for handling missing data in the context of TMLE, researchers must consider the missingness mechanism and, for MI, compatibility with the analysis method. In many settings, a parametric MI approach that incorporates interactions and nonlinearities is expected to perform well.
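As a rough illustration of the MI approaches mentioned in the abstract, the sketch below pools per-imputation effect estimates with Rubin's rules. This is an assumption made for illustration: the abstract does not state how the authors combined results across imputed datasets, and the function name `pool_rubin` and the example numbers are hypothetical. In practice each estimate and variance would come from running TMLE on one imputed dataset.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool point estimates and their variances across m imputed datasets.

    Illustrative sketch using Rubin's rules; not taken from the paper.
    `estimates` and `variances` are equal-length sequences, one entry per
    imputed dataset (e.g. a TMLE effect estimate and its estimated variance).
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least 2 imputations and matching lengths")

    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical inputs for three imputed datasets.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The total variance adds a between-imputation term to the average within-imputation variance, which is how the extra uncertainty due to the missing data enters the pooled standard error.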
Type: | Article |
---|---|
Title: | Handling missing data when estimating causal effects with targeted maximum likelihood estimation |
Location: | United States |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1093/aje/kwae012 |
Publisher version: | https://doi.org/10.1093/aje/kwae012 |
Language: | English |
Additional information: | © The Author(s) 2024. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
Keywords: | causal inference, missing data, multiple imputation, targeted maximum likelihood estimation |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Inst of Clinical Trials and Methodology; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Inst of Clinical Trials and Methodology > MRC Clinical Trials Unit at UCL |
URI: | https://discovery.ucl.ac.uk/id/eprint/10192195 |