UCL Discovery

Sparse temporal difference learning via alternating direction method of multipliers

Tsipinakis, N; Nelson, JDB; (2016) Sparse temporal difference learning via alternating direction method of multipliers. In: Kurgan, L and Palade, V and Wani, A, (eds.) Proceedings of the 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA 2015). (pp. 220-225). Institute of Electrical and Electronics Engineers (IEEE). Green open access.

Text: paper.pdf - Accepted Version (426kB)

Abstract

Recent work in off-line Reinforcement Learning has focused on efficient algorithms to incorporate feature selection, via ℓ1-regularization, into the Bellman operator fixed-point estimators. These developments now mean that over-fitting can be avoided when the number of samples is small compared to the number of features. However, it remains unclear whether existing algorithms have the ability to offer good approximations for the task of policy evaluation and improvement. In this paper, we propose a new algorithm for approximating the fixed-point based on the Alternating Direction Method of Multipliers (ADMM). We demonstrate, with experimental results, that the proposed algorithm is more stable for policy iteration compared to prior work. Furthermore, we also derive a theoretical result that states the proposed algorithm obtains a solution which satisfies the optimality conditions for the fixed-point problem.

Type: Proceedings paper
Title: Sparse temporal difference learning via alternating direction method of multipliers
Event: 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), 9-11 December 2015, Miami, Florida, USA
ISBN-13: 9781509002870
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/ICMLA.2015.36
Publisher version: http://dx.doi.org/10.1109/ICMLA.2015.36
Language: English
Additional information: Copyright © 2015 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
URI: https://discovery.ucl.ac.uk/id/eprint/1477560
Downloads since deposit: 192
