Fast and Epsilon-Optimal Discretized Pursuit Learning Automata
Document Type
Article
Publication Date
10-1-2015
Abstract
Learning automata (LA) are powerful tools for reinforcement learning. The discretized pursuit LA is the most popular among them. Each iteration of its operation consists of three basic phases: 1) selecting the next action; 2) finding the action with the best reward estimate; and 3) updating the state probability vector. However, when the number of actions is large, learning becomes extremely slow because too many updates must be made at each iteration; most of these updates come from phases 1 and 3. A new fast discretized pursuit LA with assured ϵ-optimality is proposed, in which both phases 1 and 3 are performed with computational complexity independent of the number of actions. Apart from its low computational complexity, it achieves faster convergence than the classical scheme when operating in stationary environments. This work can promote the application of LA to large-scale-action problems that require efficient reinforcement learning tools with assured ϵ-optimality, fast convergence, and low per-iteration computational complexity.
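To make the three phases concrete, the following is a minimal Python sketch of a classical discretized pursuit (reward-inaction) automaton, not the fast variant proposed in the paper. The class name DiscretizedPursuitLA, the resolution parameter, and the step size delta are illustrative assumptions rather than names taken from the paper.

```python
import numpy as np

class DiscretizedPursuitLA:
    """Sketch of a classical discretized pursuit (reward-inaction) LA.

    Names and parameterization are illustrative assumptions; this is not
    the fast scheme proposed in the paper, only the classical baseline
    whose per-iteration cost grows with the number of actions.
    """

    def __init__(self, num_actions, resolution=100, rng=None):
        self.r = num_actions
        self.delta = 1.0 / (num_actions * resolution)   # discretized step size
        self.p = np.full(num_actions, 1.0 / num_actions) # action probabilities
        self.rewards = np.zeros(num_actions)             # accumulated rewards
        self.counts = np.zeros(num_actions)              # selection counts
        self.rng = rng or np.random.default_rng()

    def select_action(self):
        # Phase 1: sample the next action from the probability vector.
        return int(self.rng.choice(self.r, p=self.p))

    def update(self, action, reward):
        # Update the running reward estimate of the chosen action
        # (reward = 1 denotes a favorable environment response).
        self.counts[action] += 1
        self.rewards[action] += reward
        # Phase 2: find the action with the highest reward estimate.
        estimates = np.divide(self.rewards, self.counts,
                              out=np.zeros(self.r), where=self.counts > 0)
        best = int(np.argmax(estimates))
        # Phase 3 (reward-inaction): on reward, pursue the best-estimated
        # action by shifting probability mass toward it in steps of delta.
        if reward > 0:
            mask = np.arange(self.r) != best
            self.p[mask] = np.maximum(self.p[mask] - self.delta, 0.0)
            self.p[best] = 1.0 - self.p[mask].sum()


# Example usage in a stationary two-action environment where action 0
# is rewarded with probability 0.8 and action 1 with probability 0.6.
la = DiscretizedPursuitLA(num_actions=2)
env_rng = np.random.default_rng(0)
for _ in range(5000):
    a = la.select_action()
    reward = float(env_rng.random() < (0.8 if a == 0 else 0.6))
    la.update(a, reward)
print(la.p)  # probability mass should concentrate on action 0
```

In this classical form, both the sampling in phase 1 and the probability update in phase 3 touch all r entries of the probability vector, which is exactly the per-iteration cost that the proposed fast scheme makes independent of the number of actions.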
Identifier
84911071747 (Scopus)
Publication Title
IEEE Transactions on Cybernetics
External Full Text Location
https://doi.org/10.1109/TCYB.2014.2365463
ISSN
2168-2267
First Page
2089
Last Page
2099
Issue
10
Volume
45
Grant
61202383
Fund Ref
National Natural Science Foundation of China
Recommended Citation
Zhang, Junqi; Wang, Cheng; and Zhou, Mengchu, "Fast and Epsilon-Optimal Discretized Pursuit Learning Automata" (2015). Faculty Publications. 6753.
https://digitalcommons.njit.edu/fac_pubs/6753
