Historical Temporal Difference Learning: Some Initial Results
Hangzhou, Zhejiang, China June 20-June 24
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/IMSCCS.2006.231
2006 First International Multi-Sympos ...
Hengshuai Yao, Tsinghua University
Diao Dongcui, Shandong University of Technology
Zengqi Sun, Tsinghua University
In this paper, we develop a multi-step prediction algorithm that is guaranteed to converge when used with general function approximation. In addition, the new algorithm should satisfy two requirements. First, it need not be faster than TD(0) in the look-up table representation, but it should be faster than the residual gradient method. Second, it should learn optimally.
Index Terms:
Multi-step Prediction, Reinforcement Learning, Temporal Difference Learning
Citation:
Hengshuai Yao, Diao Dongcui, Zengqi Sun, "Historical Temporal Difference Learning: Some Initial Results," imsccs, vol. 2, pp. 678-685, 2006 First International Multi-Symposiums on Computer and Computational Sciences, 2006