Cooper, J; Mitic, P; Reinert, G; Temčinas, T; (2022) Topological Analysis of Credit Data: Preliminary Findings. In: Intelligent Data Engineering and Automated Learning – IDEAL 2022. (pp. 432-442). Springer: Cham, Switzerland.
Abstract
Intuitively, similar customers should have similar credit risk. Capturing this similarity is often attempted using Euclidean distances between customer features and predicting credit default via logistic regression. Here we explore the use of topological data analysis for describing this similarity. In particular, persistent homology algorithms provide summaries of point clouds which relate to their topology. This approach has been shown to be useful in many applications but, to the best of our knowledge, applying topological data analysis to the prediction of credit risk is novel. We develop a pipeline which is based on the topological analysis of neighbourhoods of customers, with the neighbourhoods determined by a geometric network construction. We find a modest signal using three data sets from the Lending Club, and the Japan Credit Screening data set. The Cleveland oncological data set is used to validate the pipeline. The results have high variance, but they indicate that including such topological features could improve credit risk prediction when used as additional explanatory variables in a logistic regression.
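The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a customer's neighbourhood is simply its `k` nearest neighbours in Euclidean distance (a stand-in for the paper's geometric network construction) and that the topological summary is the total persistence of the degree-0 barcode. It relies on the standard fact that, for a Vietoris-Rips filtration parameterised by pairwise distance, the finite degree-0 bars are born at 0 and die at the edge lengths of a minimum spanning tree. The function names and the parameter `k` are hypothetical.

```python
import numpy as np


def h0_death_times(points):
    """Finite H0 death times of a Vietoris-Rips filtration on `points`.

    These equal the edge lengths of a minimum spanning tree, computed
    here with Prim's algorithm on the full Euclidean distance matrix.
    """
    n = len(points)
    if n < 2:
        return np.array([])
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known connection of each point to the tree
    deaths = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        deaths.append(candidates[j])  # a component merges at this scale
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.sort(np.array(deaths))


def neighbourhood_feature(X, k=10):
    """One topological feature per row of X (assumed construction).

    For each customer, take the k + 1 closest rows (normally the customer
    plus its k nearest neighbours) and sum the finite H0 bar lengths of
    that local point cloud.
    """
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    features = np.empty(len(X))
    for i in range(len(X)):
        nbrs = np.argsort(dist[i])[: k + 1]
        features[i] = h0_death_times(X[nbrs]).sum()
    return features
```

The resulting column could be appended to the customer features, for example with `np.column_stack([X, neighbourhood_feature(X)])`, and passed to any logistic regression routine as an extra explanatory variable. Higher-degree homology or persistence landscapes, which the keywords mention, would need a dedicated persistent homology library.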
Type: | Proceedings paper |
---|---|
Title: | Topological Analysis of Credit Data: Preliminary Findings |
Event: | International Conference on Intelligent Data Engineering and Automated Learning 2022 |
ISBN-13: | 9783031217524 |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1007/978-3-031-21753-1_42 |
Publisher version: | https://doi.org/10.1007/978-3-031-21753-1_42 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | Credit risk, Topological data analysis, Barcode, Landscape, Logistic regression |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10163320 |



