UCL Discovery

Optimality of LSTD and its relation to MC

Grunewalder, S; Hochreiter, S; Obermayer, K; (2007) Optimality of LSTD and its relation to MC. In: 2007 IEEE International Joint Conference on Neural Networks, Vols 1-6. (pp. 338-343). IEEE

Full text not available from this repository.


In this analytical study we compare the risk of the Monte Carlo (MC) estimator with that of the least-squares temporal-difference (LSTD) estimator. We prove that, for acyclic Markov Reward Processes (MRPs), LSTD has minimal risk for any convex loss function within the class of unbiased estimators. Comparing LSTD with the Monte Carlo estimator, which does not assume a Markov structure, we find that the two estimators are equivalent when both have the same amount of information. The theoretical results are supported by an empirical evaluation of the estimators.
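For readers unfamiliar with the two estimators, the following is a minimal sketch of how they differ on a toy problem. It is not taken from the paper: the three-state acyclic MRP, the two hand-written episodes, the undiscounted setting, the tabular (one-hot) features, and the function names mc_estimate, lstd_estimate and solve are all assumptions made for illustration.

```python
# Illustrative sketch only: toy data and function names are hypothetical.
# States 0, 1, 2 are non-terminal; state 3 is terminal. No discounting.

def mc_estimate(episodes, n_states):
    """Monte Carlo: average the observed returns that follow each state."""
    totals = [0.0] * n_states
    counts = [0] * n_states
    for episode in episodes:
        ret = 0.0
        # Walk backwards so `ret` is the return from `state` onwards.
        for state, reward, _next_state in reversed(episode):
            ret += reward
            totals[state] += ret
            counts[state] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (a state may never have been visited)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def lstd_estimate(episodes, n_states):
    """Tabular LSTD: solve A v = b with
    A = sum phi(s) (phi(s) - phi(s'))^T and b = sum phi(s) r,
    where phi is a one-hot feature vector (all zeros for the terminal state).
    """
    A = [[0.0] * n_states for _ in range(n_states)]
    b = [0.0] * n_states
    for episode in episodes:
        for state, reward, next_state in episode:
            A[state][state] += 1.0
            if next_state < n_states:  # terminal state has zero features
                A[state][next_state] -= 1.0
            b[state] += reward
    return solve(A, b)


# Two hand-written episodes as lists of (state, reward, next_state).
episodes = [
    [(0, 0.0, 1), (1, 1.0, 2), (2, 2.0, 3)],
    [(0, 0.0, 2), (2, 0.0, 3)],
]
print("MC:  ", mc_estimate(episodes, 3))
print("LSTD:", lstd_estimate(episodes, 3))
```

On this toy data the two estimates differ only for state 1. MC sees a single return from state 1, whereas LSTD, by relying on the Markov structure, also reuses the second episode's visit to state 2 when valuing state 1.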

Type: Proceedings paper
Title: Optimality of LSTD and its relation to MC
Event: International Joint Conference on Neural Networks
Location: Orlando, FL
Dates: 2007-08-12 - 2007-08-17
ISBN-13: 978-1-4244-1379-9
URI: http://discovery.ucl.ac.uk/id/eprint/1323910