UCL Discovery

SplitWise regression for capturing nonlinear effects in interpretable model selection

Kurbucz, Marcell T; Tzivanakis, Nikolaos; Sari Aslam, Nilufer; Sykulski, Adam M (2025) SplitWise regression for capturing nonlinear effects in interpretable model selection. Scientific Reports, 15(1), Article 42454. DOI: 10.1038/s41598-025-26597-7. Green open access.

Text: SplitWise.pdf - Published Version (2MB)

Abstract

Capturing nonlinear relationships while maintaining interpretability remains a persistent challenge in regression modeling. We introduce SplitWise, a stepwise regression framework that adaptively transforms numeric predictors into threshold-based binary features using shallow decision trees—only when such transformations improve model fit according to the Akaike or Bayesian Information Criterion. This design preserves the transparency of linear models while flexibly capturing threshold-based nonlinear effects, positioning SplitWise between classical linear and interpretable nonlinear regression. SplitWise retains a single, globally linear equation that selectively incorporates data-driven thresholds—yielding models that remain straightforward to interpret and verify. Across synthetic scenarios with nonlinear signal patterns, SplitWise reduced median RMSE by 7–14% relative to the best-performing interpretable linear baseline and improved variable-selection accuracy (median Matthews Correlation Coefficient up to ~ 0.79 vs. ~ 0.51 for LASSO). On real datasets, SplitWise matched or slightly improved RMSE while selecting fewer predictors. For instance, on Wine Quality (White), it improved RMSE from 0.756 to 0.752 and on Wine Quality (Red) from 0.654 to 0.649, using 6–10 predictors. On Bodyfat, it achieved 3.48–3.49 RMSE with four predictors, comparable to Elastic Net (3.41–3.48 RMSE) but with smaller models.
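The core decision described in the abstract, whether a numeric predictor enters the model in linear form or as a threshold-based binary feature, can be illustrated with a minimal sketch. This is not the authors' implementation: it handles a single predictor only, finds the threshold by exhaustive search (which is what a depth-1 regression tree does), and compares the two forms by Gaussian AIC. All function names (`fit_simple_ols`, `gaussian_aic`, `stump_threshold`, `choose_form`) are hypothetical, and charging both forms the same two coefficients is an assumption of this sketch rather than something stated in the abstract.

```python
import math


def fit_simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0  # constant predictor: slope is undefined, use 0
    a = mean_y - b * mean_x
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def gaussian_aic(rss, n, k):
    """Gaussian AIC up to an additive constant: n*log(rss/n) + 2k."""
    return n * math.log(max(rss, 1e-12) / n) + 2 * k


def stump_threshold(x, y):
    """Split point of a depth-1 regression tree: the midpoint between two
    adjacent distinct x values that minimizes the within-group squared error."""
    def sse(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values)

    levels = sorted(set(x))
    best_t, best_sse = None, math.inf
    for lo, hi in zip(levels, levels[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        total = sse(left) + sse(right)
        if total < best_sse:
            best_t, best_sse = t, total
    return best_t  # None when x has a single distinct value


def choose_form(x, y):
    """Return ("dummy", threshold) if the binary feature 1[x > t] gives a
    lower AIC than the raw linear term, otherwise ("linear", None)."""
    n = len(x)
    _, _, rss_lin = fit_simple_ols(x, y)
    aic_lin = gaussian_aic(rss_lin, n, k=2)

    t = stump_threshold(x, y)
    if t is None:
        return "linear", None

    dummy = [1.0 if xi > t else 0.0 for xi in x]
    _, _, rss_dummy = fit_simple_ols(dummy, y)
    # Assumption of this sketch: intercept and slope only, threshold not penalized.
    aic_dummy = gaussian_aic(rss_dummy, n, k=2)

    return ("dummy", t) if aic_dummy < aic_lin else ("linear", None)
```

Calling `choose_form(x, y)` on a predictor whose effect is a step change should select the dummy form with a threshold near the step, while a predictor with a steady linear trend should keep its linear form. A full stepwise procedure would repeat this kind of comparison across predictors while adding or dropping terms, which this sketch does not attempt.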

Type: Article
Title: SplitWise regression for capturing nonlinear effects in interpretable model selection
Open access status: An open access version is available from UCL Discovery
DOI: 10.1038/s41598-025-26597-7
Publisher version: https://doi.org/10.1038/s41598-025-26597-7
Language: English
Additional information: © The Author(s) 2025. This article is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
Keywords: Stepwise regression, Interpretable modeling, Dummy variables, Threshold effects, Model selection, Software
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment > UCL Institute for Global Prosperity
URI: https://discovery.ucl.ac.uk/id/eprint/10218011
