Wei Chen, National University of Defense Technology, China
Da-xue Liu, National University of Defense Technology, China
Tao Wu, National University of Defense Technology, China
Han-gen He, National University of Defense Technology, China
To reduce the learning time of reinforcement learning (RL), hybrid algorithms that combine reinforcement learning with various supervised learning methods have attracted considerable research interest. However, global convergence and optimality have become one of the main problems for hybrid reinforcement learning algorithms. In this paper, the convergence of a hybrid RL algorithm that is combined with support vector machines (SVMs) is analyzed theoretically. It is shown that, by making use of policy gradient learning and SVM regression, the hybrid algorithm can easily escape from local optima.
Citation:
Xue-ning Wang, Wei Chen, Da-xue Liu, Tao Wu, Han-gen He, "The Optimality Analysis of Hybrid Reinforcement Learning Combined with SVMs," Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06), vol. 1, pp. 936-941, 2006.