Eriksson, Maria H;
Ripart, Mathilde;
Piper, Rory J;
Moeller, Friederike;
Das, Krishna B;
Eltze, Christin;
Cooray, Gerald;
... Wagstyl, Konrad;
(2023) Predicting seizure outcome after epilepsy surgery: do we need more complex models, larger samples, or better data? Epilepsia, 64 (8) pp. 2014-2026. 10.1111/epi.17637.
Abstract
OBJECTIVE: The accurate prediction of seizure freedom after epilepsy surgery remains challenging. We investigated if 1) training more complex models, 2) recruiting larger sample sizes, or 3) using data-driven selection of clinical predictors would improve our ability to predict post-operative seizure outcome using clinical features. We also conducted the first substantial external validation of a machine learning model trained to predict post-operative seizure outcome.
METHODS: We performed a retrospective cohort study of 797 children who had undergone resective or disconnective epilepsy surgery at a tertiary center. We extracted patient information from medical records and trained three models (a logistic regression, a multilayer perceptron, and an XGBoost model) to predict one-year post-operative seizure outcome on our dataset. We evaluated the performance of a recently published XGBoost model on the same patients. We further investigated the impact of sample size on model performance, using learning curve analysis to estimate performance at samples up to N = 2,000. Finally, we examined the impact of predictor selection on model performance.
RESULTS: Our logistic regression achieved an accuracy of 72% (95% CI = 68-75%, AUC = 0.72), while our multilayer perceptron (MLP) and our own XGBoost model both achieved accuracies of 71% (MLP: 95% CI = 67-74%, AUC = 0.70; own XGBoost: 95% CI = 68-75%, AUC = 0.70). There was no significant difference in performance between our three models (all P > 0.4) and they all performed better than the external XGBoost, which achieved an accuracy of 63% on our data (95% CI = 59-67%, AUC = 0.62; versus logistic regression P = 0.005, versus MLP P = 0.01, versus own XGBoost P = 0.01). All models showed improved performance with increasing sample size, but limited improvements beyond our current sample. The best model performance was achieved with data-driven feature selection.
SIGNIFICANCE: We show that neither the deployment of complex machine learning models nor the assembly of thousands of patients alone is likely to generate significant improvements in our ability to predict post-operative seizure freedom. We instead propose that improved feature selection alongside collaboration, data standardization, and model sharing is required to advance the field.
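As a rough aid to reading the accuracy intervals reported above, the sketch below computes a Wilson score interval for a binomial proportion. The abstract does not say how the authors derived their confidence intervals, so the choice of the Wilson method is an assumption, and the count of 574 correct predictions is a hypothetical figure back-calculated from "72% of 797", not a number taken from the paper.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical count: 574 of 797 is approximately 72% accuracy.
# The exact number of correct predictions is not given in the abstract.
low, high = wilson_interval(574, 797)
print(f"accuracy interval: {low:.3f} to {high:.3f}")
```

With a cohort of this size the interval spans several percentage points either side of the point estimate, which is one way to see why models whose accuracies differ by a single point are not distinguishable here.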