eprintid: 10091486
rev_number: 19
eprint_status: archive
userid: 608
dir: disk0/10/09/14/86
datestamp: 2020-02-18 15:34:10
lastmod: 2021-11-08 00:04:38
status_changed: 2020-02-18 15:34:10
type: proceedings_section
metadata_visibility: show
creators_name: Garcia, MS
creators_name: Agarwal, B
creators_name: Mookerjee, RP
creators_name: Jalan, R
creators_name: Doyle, G
creators_name: Ranco, G
creators_name: Arroyo, V
creators_name: Pavesi, M
creators_name: Garcia, E
creators_name: Saliba, F
creators_name: Banares, R
creators_name: Fernandez, J
title: An Accurate Data Preparation Approach for the Prediction of Mortality in ACLF Patients using the CANONIC Dataset
ispublished: pub
divisions: UCL
divisions: B02
divisions: C10
divisions: D17
divisions: G91
note: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
abstract: The incidence of chronic liver disease has increased in Europe, and the disease can lead to Acute-on-Chronic Liver Failure (ACLF), which is associated with high mortality due to multisystem organ failure. The characteristics of ACLF patients can change rapidly within a short period of time, so continuous assessment of their recovery status is critical for clinicians to adjust and deliver effective treatment. The aim of this paper is to validate the usefulness of a data preparation approach that combines different criteria to replace missing values, balance target-class variables, select useful patient characteristics and optimise the hyperparameters of machine learning models for the prediction of ACLF-associated mortality. A key step in the data preparation is a multivariate feature selection approach based on Mutual Information (MI), which builds subsets of patient characteristics from the CANONIC dataset that are smaller than, yet equally or in some cases more informative than, those frequently proposed for the prediction of mortality in patients with ACLF. The usefulness of the proposed data preparation approach was evaluated by training XGBoost and Logistic Regression models on the prepared data. Evaluation of the trained models on a test set provided evidence of overall high accuracy in predicting patient mortality in the days following diagnosis, and in some cases even higher accuracy when reduced and more informative subsets of patient characteristics were found.
date: 2019-10-07
date_type: published
publisher: IEEE
official_url: https://doi.org/10.1109/EMBC.2019.8857239
oa_status: green
full_text_type: other
language: eng
primo: open
primo_central: open_green
verified: verified_manual
elements_id: 1744787
doi: 10.1109/EMBC.2019.8857239
lyricists_name: Agarwal, Banwari
lyricists_name: Jalan, Rajiv
lyricists_name: Mookerjee, Rajeshwar
lyricists_id: BAGAR28
lyricists_id: RJALA78
lyricists_id: RPMOO69
actors_name: Stacey, Thomas
actors_id: TSSTA20
actors_role: owner
full_text_status: public
publication: Conf Proc IEEE Eng Med Biol Soc
volume: 2019
place_of_pub: Berlin, Germany
pagerange: 1371-1377
event_title: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
event_location: Berlin, Germany
citation: Garcia, MS; Agarwal, B; Mookerjee, RP; Jalan, R; Doyle, G; Ranco, G; Arroyo, V; Pavesi, M; Garcia, E; Saliba, F; Banares, R; Fernandez, J; (2019) An Accurate Data Preparation Approach for the Prediction of Mortality in ACLF Patients using the CANONIC Dataset. In: (Proceedings) 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). (pp. 1371-1377). IEEE: Berlin, Germany. Green open access
 
document_url: https://discovery.ucl.ac.uk/id/eprint/10091486/2/Jalan_root.pdf