eprintid: 10123829
rev_number: 13
eprint_status: archive
userid: 608
dir: disk0/10/12/38/29
datestamp: 2021-03-12 14:59:51
lastmod: 2022-06-19 06:10:19
status_changed: 2021-03-12 14:59:51
type: article
metadata_visibility: show
creators_name: Gkioulekas, I
creators_name: Papageorgiou, LG
title: Tree regression models using statistical testing and mixed integer programming
ispublished: pub
divisions: UCL
divisions: A01
divisions: B04
divisions: C05
divisions: F43
keywords: Mathematical programming; Regression analysis; Decision trees; Subset selection; Optimisation
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
abstract: Regression analysis is a statistical procedure that fits a mathematical function to a set of data in order to capture the relationship between dependent and independent variables. In tree regression, tree structures are constructed by repeated splits of the input space into two subsets, creating if-then-else rules. Such models are popular in the literature due to their ability to be computed quickly and their simple interpretations. This work introduces a tree regression algorithm that exploits an optimisation model of an existing literature method called Mathematical Programming Tree (MPtree) to optimally split nodes into subsets and applies a statistical test to assess the quality of the partitioning. Additionally, an approach of splitting nodes using multivariate decision rules is explored in this work and compared in terms of performance and computational efficiency. Finally, a novel mathematical model is introduced that performs subset selection on each node in order to select an optimal set of variables to be considered for splitting, which improves the computational performance of the proposed algorithm.
date: 2021-03-18
date_type: published
official_url: https://doi.org/10.1016/j.cie.2020.107059
oa_status: green
full_text_type: other
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 1845242
doi: 10.1016/j.cie.2020.107059
lyricists_name: Papageorgiou, Lazaros
lyricists_id: LPAPA33
actors_name: Papageorgiou, Lazaros
actors_id: LPAPA33
actors_role: owner
full_text_status: public
publication: Computers and Industrial Engineering
volume: 153
article_number: 107059
issn: 0360-8352
citation: Gkioulekas, I; Papageorgiou, LG; (2021) Tree regression models using statistical testing and mixed integer programming. Computers and Industrial Engineering, 153, Article 107059. 10.1016/j.cie.2020.107059 <https://doi.org/10.1016/j.cie.2020.107059>. Green open access
document_url: https://discovery.ucl.ac.uk/id/eprint/10123829/1/CAIE_Tree_Regression_Accepted.pdf