Conservative Policy Construction Using Variational Autoencoders for Logged Data With Missing Values

Abroshan, M; Yip, KH; Tekin, C; van der Schaar, M (2022) Conservative Policy Construction Using Variational Autoencoders for Logged Data With Missing Values. IEEE Transactions on Neural Networks and Learning Systems. DOI: 10.1109/TNNLS.2021.3136385. Green open access

Abstract

In high-stakes applications of data-driven decision-making such as healthcare, it is of paramount importance to learn a policy that maximizes the reward while avoiding potentially dangerous actions when there is uncertainty. Two main challenges are usually associated with this problem. First, learning through online exploration is not possible due to the critical nature of such applications, so we need to resort to observational datasets with no counterfactuals. Second, such datasets are usually imperfect and additionally suffer from missing values in the feature attributes. In this article, we consider the problem of constructing personalized policies using logged data when there are missing values in the feature attributes in both training and test data. The goal is to recommend an action (treatment) when ~X, a degraded version of X with missing values, is observed. We consider three strategies for dealing with missingness. In particular, we introduce the conservative strategy, where the policy is designed to safely handle the uncertainty due to missingness. To implement this strategy, we need to estimate the posterior distribution p(X|~X), and we use a variational autoencoder to achieve this. Specifically, our method is based on partial variational autoencoders (PVAEs), which are designed to capture the underlying structure of features with missing values.
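
As a rough illustration of the idea in the abstract, the sketch below picks an action by scoring each one pessimistically over plausible completions of X. It is not the authors' method: the abstract does not define the conservative rule, so the low-quantile criterion used here is an assumption, as are the names `conservative_action`, `reward_fn`, and `toy_reward`. The array `samples` is a made-up stand-in for draws from p(X|~X), which in the paper would come from a PVAE.

```python
import numpy as np

def conservative_action(posterior_samples, reward_fn, n_actions, quantile=0.1):
    """Pick the action whose low-quantile reward over plausible X is largest.

    Assumption: 'conservative' is modelled here as a low quantile of the
    estimated reward across posterior samples; the paper may define it differently.
    """
    # rewards[i, a] = estimated reward of action a if the true features were sample i
    rewards = np.array([[reward_fn(x, a) for a in range(n_actions)]
                        for x in posterior_samples])
    # Pessimistic score per action: a low quantile across the posterior samples
    scores = np.quantile(rewards, quantile, axis=0)
    return int(np.argmax(scores)), scores

# Toy data (all numbers invented): feature 0 is observed, feature 1 is missing.
rng = np.random.default_rng(0)
samples = np.column_stack([
    np.full(500, 1.0),            # observed feature, identical in every sample
    rng.normal(0.0, 2.0, 500),    # hypothetical posterior draws for the missing feature
])

def toy_reward(x, a):
    # Hypothetical reward model: action 0 is safe, action 1 pays more on
    # average but its outcome depends on the missing feature.
    return 1.0 if a == 0 else 1.5 + x[1]

action, scores = conservative_action(samples, toy_reward, n_actions=2)
```

In this toy setup, action 1 has the higher average reward, but its outcome depends heavily on the missing feature, so a pessimistic score favours the safe action 0. A policy that simply imputed the missing value with its mean would make the opposite choice.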

Type: Article
Title: Conservative Policy Construction Using Variational Autoencoders for Logged Data With Missing Values
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/TNNLS.2021.3136385
Publisher version: https://doi.org/10.1109/TNNLS.2021.3136385
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Uncertainty, Estimation, Task analysis, IP networks, Noise measurement, Tuning, Training data
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Physics and Astronomy
URI: https://discovery.ucl.ac.uk/id/eprint/10166022