UCL Discovery

Offline Deep Reinforcement Learning and Off-Policy Evaluation for Personalized Basal Insulin Control in Type 1 Diabetes

Zhu, Taiyu; Li, Kezhi; Georgiou, Pantelis; (2023) Offline Deep Reinforcement Learning and Off-Policy Evaluation for Personalized Basal Insulin Control in Type 1 Diabetes. IEEE Journal of Biomedical and Health Informatics pp. 1-12. 10.1109/JBHI.2023.3303367. (In press).

Offline_Deep_Reinforcement_Learning_and_Off-Policy_Evaluation_for_Personalized_Basal_Insulin_Control_in_Type_1_Diabetes.pdf - Accepted Version (1MB)

Abstract

Recent advancements in hybrid closed-loop systems, also known as the artificial pancreas (AP), have been shown to optimize glucose control and reduce the self-management burden for people living with type 1 diabetes (T1D). AP systems adjust the basal infusion rates of insulin pumps, facilitated by real-time communication with continuous glucose monitors. Empowered by deep neural networks, deep reinforcement learning (DRL) has introduced new paradigms of basal insulin control algorithms. However, all existing DRL-based AP controllers require a large number of random online interactions between the agent and the environment. While this is feasible in T1D simulators, it is impractical in real-world clinical settings. To address this, we propose an offline DRL framework that can develop and validate models for basal insulin control entirely offline. It comprises a DRL model based on the twin delayed deep deterministic policy gradient (TD3) and behavior cloning (BC), together with off-policy evaluation (OPE) using fitted Q evaluation (FQE). We evaluated the proposed framework on an in silico dataset of 10 virtual adults and 10 virtual adolescents generated by the UVA/Padova T1D simulator, and on the OhioT1DM dataset, a clinical dataset with 12 real T1D subjects. On the in silico dataset, the offline DRL algorithm significantly increased time in range while reducing time below range and time above range for both the adult and adolescent groups. High Spearman's rank correlation coefficients between actual and estimated policy values indicate that the OPE estimates are accurate. We then used the OPE to estimate model performance on the clinical dataset, observing a notable increase in policy value for each subject. These results demonstrate that the proposed framework is a viable and safe method for improving personalized basal insulin control in T1D.
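
The training component couples TD3 with behavior cloning, the TD3+BC objective from offline RL. Below is a minimal PyTorch sketch of that actor update, not the paper's implementation: the network sizes, the coefficient alpha, and the state/action encoding for insulin dosing (STATE_DIM, ACTION_DIM) are illustrative assumptions.

import torch
import torch.nn as nn

# Minimal sketch of the TD3+BC actor update: maximize the critic's Q
# while regularizing the policy toward the actions logged in the
# offline dataset. All dimensions and constants below are assumptions.

STATE_DIM = 12   # e.g. recent CGM readings and insulin on board (assumed)
ACTION_DIM = 1   # basal insulin rate, scaled to [-1, 1] (assumed)

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, ACTION_DIM), nn.Tanh(),  # bounded action
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def td3_bc_actor_loss(actor, critic, states, logged_actions, alpha=2.5):
    """Actor loss: -lambda * Q(s, pi(s)) + MSE(pi(s), logged action).

    The behavior-cloning term keeps the learned policy close to the
    dosing decisions already in the dataset, which is what allows
    training to proceed with no online interaction.
    """
    pi = actor(states)
    q = critic(states, pi)
    # Scale the Q term by its own magnitude so alpha sets a consistent
    # trade-off between the RL objective and the BC regularizer.
    lam = alpha / q.abs().mean().detach()
    return -lam * q.mean() + nn.functional.mse_loss(pi, logged_actions)

The BC regularizer is what makes a purely offline method plausible for a safety-critical task like insulin dosing: the policy can only deviate from the logged behavior where the learned Q-function gives it a strong reason to.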
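The evaluation component, fitted Q evaluation, scores a trained policy using only the same logged transitions, with no further environment interaction. Here is a minimal sketch under the same assumptions; it reuses the Critic class above, and the batch format, hyperparameters, and initial-state handling are illustrative rather than taken from the paper.

import copy
import torch
import torch.nn as nn

def fitted_q_evaluation(policy, batches, initial_states, gamma=0.99,
                        epochs=50, lr=1e-4):
    """Estimate the value of `policy` from logged transitions only.

    `batches` is an iterable of (s, a, r, s_next, done) tensors with
    r and done shaped (batch, 1); `policy` maps states to actions.
    `Critic` is the class defined in the previous sketch.
    """
    q = Critic()
    q_target = copy.deepcopy(q)
    opt = torch.optim.Adam(q.parameters(), lr=lr)
    for _ in range(epochs):
        for s, a, r, s_next, done in batches:
            with torch.no_grad():
                # Bootstrap with the *evaluation* policy's action at the
                # next state, not the logged action: this is what makes
                # the fitted Q an estimate of Q^pi rather than of the
                # behavior policy's value.
                target = r + gamma * (1.0 - done) * q_target(s_next, policy(s_next))
            loss = nn.functional.mse_loss(q(s, a), target)
            opt.zero_grad()
            loss.backward()
            opt.step()
        q_target.load_state_dict(q.state_dict())  # sync target each epoch
    with torch.no_grad():
        # Policy value: mean Q at episode-start states under the policy.
        return q(initial_states, policy(initial_states)).mean().item()

In the workflow the abstract describes, such FQE estimates are first checked against actual policy values in the simulator (via Spearman's rank correlation) and only then used to assess candidate policies on the clinical OhioT1DM data, where true rollouts are unavailable.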

Type: Article
Title: Offline Deep Reinforcement Learning and Off-Policy Evaluation for Personalized Basal Insulin Control in Type 1 Diabetes
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/JBHI.2023.3303367
Publisher version: https://doi.org/10.1109/JBHI.2023.3303367
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics
URI: https://discovery.ucl.ac.uk/id/eprint/10175959
