Kurbucz, Marcell T; Tzivanakis, Nikolaos; Sari Aslam, Nilufer; Sykulski, Adam M (2025) SplitWise regression for capturing nonlinear effects in interpretable model selection. Scientific Reports, 15 (1), Article 42454. 10.1038/s41598-025-26597-7.
Abstract
Capturing nonlinear relationships while maintaining interpretability remains a persistent challenge in regression modeling. We introduce SplitWise, a stepwise regression framework that adaptively transforms numeric predictors into threshold-based binary features using shallow decision trees, but only when such transformations improve model fit according to the Akaike or Bayesian Information Criterion. This design preserves the transparency of linear models while flexibly capturing threshold-based nonlinear effects, positioning SplitWise between classical linear and interpretable nonlinear regression. SplitWise retains a single, globally linear equation that selectively incorporates data-driven thresholds, yielding models that remain straightforward to interpret and verify. Across synthetic scenarios with nonlinear signal patterns, SplitWise reduced median RMSE by 7–14% relative to the best-performing interpretable linear baseline and improved variable-selection accuracy (median Matthews Correlation Coefficient up to ~0.79 vs. ~0.51 for LASSO). On real datasets, SplitWise matched or slightly improved RMSE while selecting fewer predictors. For instance, on Wine Quality (White), it improved RMSE from 0.756 to 0.752 and on Wine Quality (Red) from 0.654 to 0.649, using 6–10 predictors. On Bodyfat, it achieved 3.48–3.49 RMSE with four predictors, comparable to Elastic Net (3.41–3.48 RMSE) but with smaller models.
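The core decision described in the abstract (keep a predictor linear, or replace it with a threshold dummy when an information criterion prefers that) can be illustrated with a minimal sketch. This is not the authors' implementation: it handles a single predictor, uses AIC only, omits the stepwise search, and finds the split by brute force rather than with a tree library. The function names `ols_aic`, `best_stump_threshold`, and `choose_form` are hypothetical, and the data are simulated for illustration.

```python
import numpy as np

def ols_aic(X, y):
    """AIC (up to an additive constant) of a Gaussian OLS fit with intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    k = A.shape[1] + 1  # regression coefficients plus the noise variance
    return n * np.log(rss / n) + 2 * k

def best_stump_threshold(x, y):
    """Split point of a depth-1 regression tree: the cut minimizing total SSE."""
    xs = np.unique(x)
    if len(xs) < 2:
        return None  # a constant predictor cannot be split
    best_t, best_sse = None, np.inf
    for t in (xs[:-1] + xs[1:]) / 2:  # midpoints between sorted unique values
        left, right = y[x <= t], y[x > t]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if sse < best_sse:
            best_t, best_sse = float(t), sse
    return best_t

def choose_form(x, y):
    """Return ('dummy', t) if the threshold dummy has lower AIC, else ('linear', None).

    Assumption: the estimated threshold is not counted as an extra parameter.
    """
    t = best_stump_threshold(x, y)
    if t is None:
        return ("linear", None)
    aic_linear = ols_aic(x.reshape(-1, 1), y)
    aic_dummy = ols_aic((x > t).astype(float).reshape(-1, 1), y)
    return ("dummy", t) if aic_dummy < aic_linear else ("linear", None)

# Simulated data: one response with a step at x = 6, one with a linear trend.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y_step = 3.0 * (x > 6.0) + rng.normal(0, 0.3, 200)
y_line = 0.5 * x + rng.normal(0, 0.3, 200)

form_step, t_step = choose_form(x, y_step)
form_line, _ = choose_form(x, y_line)
print(form_step, t_step, form_line)
```

In this sketch the step-shaped response is expected to select the dummy form with a threshold near the simulated change point, while the linear response keeps the original predictor. A fuller version would repeat the comparison inside a stepwise selection loop over all predictors and could swap AIC for BIC by replacing the `2 * k` penalty with `k * np.log(n)`.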
| Type: | Article |
|---|---|
| Title: | SplitWise regression for capturing nonlinear effects in interpretable model selection |
| Open access status: | An open access version is available from UCL Discovery |
| DOI: | 10.1038/s41598-025-26597-7 |
| Publisher version: | https://doi.org/10.1038/s41598-025-26597-7 |
| Language: | English |
| Additional information: | © The Author(s) 2025. This article is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/). |
| Keywords: | Stepwise regression, Interpretable modeling, Dummy variables, Threshold effects, Model selection, Software |
| UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment > UCL Institute for Global Prosperity |
| URI: | https://discovery.ucl.ac.uk/id/eprint/10218011 |

