Distributed Fusion-Based Policy Search for Fast Robot Locomotion Learning
Document Type
Article
Publication Date
8-1-2019
Abstract
Deep reinforcement learning methods have been developed to deal with challenging locomotion control problems in robotics and can achieve significant performance improvements over conventional control methods. One of their appealing advantages is that they are model-free: agents learn a control policy entirely from scratch using raw high-dimensional sensory observations. However, they often suffer from poor sample efficiency and instability, which make them inapplicable to many engineering systems. This paper presents a distributed fusion-based policy search framework that accelerates robot locomotion learning through variance reduction and asynchronous exploration. An adaptive fusion-based variance reduction technique is introduced to improve sample efficiency. Parametric noise is added to the neural network weights, which leads to efficient exploration and ensures consistency in actions. The fusion-based policy gradient estimator is then extended to a distributed decoupled actor-critic architecture. This allows the central estimator to handle off-policy data from different actors asynchronously, fully utilizing CPUs and GPUs to maximize data throughput. The aim of this work is to improve the sample efficiency and convergence speed of deep reinforcement learning in robot locomotion tasks. Simulation results are presented to verify the theoretical results; they show that the proposed algorithm achieves, and sometimes surpasses, state-of-the-art performance.
Identifier
85069782804 (Scopus)
Publication Title
IEEE Computational Intelligence Magazine
External Full Text Location
https://doi.org/10.1109/MCI.2019.2919364
e-ISSN
15566048
ISSN
1556603X
First Page
19
Last Page
28
Issue
3
Volume
14
Grant
2018YFB1304600
Fund Ref
National Natural Science Foundation of China
Recommended Citation
Cao, Zhengcai; Xiao, Qing; and Zhou, Mengchu, "Distributed Fusion-Based Policy Search for Fast Robot Locomotion Learning" (2019). Faculty Publications. 7416.
https://digitalcommons.njit.edu/fac_pubs/7416
