SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
Document Type
Conference Proceeding
Publication Date
1-1-2024
Abstract
This paper studies the transfer reinforcement learning (RL) problem where multiple RL problems have different reward functions but share the same underlying transition dynamics. In this setting, the Q-function of each RL problem (task) can be decomposed into a successor feature (SF) and a reward mapping: the former characterizes the transition dynamics, and the latter characterizes the task-specific reward function. This Q-function decomposition, coupled with a policy improvement operator known as generalized policy improvement (GPI), reduces the sample complexity of finding the optimal Q-function, and thus the SF & GPI framework exhibits promising empirical performance compared to traditional RL methods like Q-learning. However, its theoretical foundations remain largely unestablished, especially when the successor features are learned using deep neural networks (SF-DQN). This paper studies provable knowledge transfer using SF-DQN in transfer RL problems. We establish the first convergence analysis with provable generalization guarantees for SF-DQN with GPI. The theory reveals that SF-DQN with GPI outperforms conventional RL approaches, such as the deep Q-network, in terms of both a faster convergence rate and better generalization. Numerical experiments on real and synthetic RL tasks support the superior performance of SF-DQN & GPI, aligning with our theoretical findings.
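The decomposition and the GPI step described in the abstract can be illustrated with a minimal tabular sketch. It assumes the standard successor-feature form Q(s, a) = psi(s, a) . w, where psi is the successor feature and w is the reward mapping; the function names, the toy numbers, and the use of plain lists instead of a deep network are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def q_values(psi: Sequence[Sequence[float]], w: Sequence[float]) -> List[float]:
    """Return Q(s, a) = psi(s, a) . w for every action a in one fixed state s.

    psi[a] is the (assumed) successor-feature vector of action a,
    and w is the task-specific reward mapping.
    """
    return [sum(p * wi for p, wi in zip(psi_a, w)) for psi_a in psi]


def gpi_action(psis: Sequence[Sequence[Sequence[float]]], w_new: Sequence[float]) -> int:
    """Generalized policy improvement for one state.

    For each action, take the best Q-value over all previously learned
    policies (one successor-feature table per policy), then act greedily.
    """
    n_actions = len(psis[0])
    per_policy_q = [q_values(psi_i, w_new) for psi_i in psis]
    best = [max(q[a] for q in per_policy_q) for a in range(n_actions)]
    return max(range(n_actions), key=best.__getitem__)


# Toy data (hypothetical): 2 source policies, 3 actions, 2-dimensional features.
psi_policy_1 = [[1.0, 0.0], [0.5, 0.5], [0.0, 0.2]]
psi_policy_2 = [[0.2, 0.1], [0.0, 1.0], [0.3, 0.3]]

# Reward mapping of a new task that shares the same transition dynamics.
w_new = [0.0, 1.0]

q1 = q_values(psi_policy_1, w_new)  # [0.0, 0.5, 0.2]
q2 = q_values(psi_policy_2, w_new)  # [0.1, 1.0, 0.3]
action = gpi_action([psi_policy_1, psi_policy_2], w_new)  # 1
```

Because only w changes between tasks, the successor features learned on earlier tasks can be reused to evaluate a new task without relearning the dynamics, which is the intuition behind the reduced sample complexity mentioned above.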
Identifier
85203814855 (Scopus)
Publication Title
Proceedings of Machine Learning Research
e-ISSN
26403498
First Page
58897
Last Page
58934
Volume
235
Grant
FA9550-20-1-0122
Fund Ref
International Business Machines Corporation
Recommended Citation
Zhang, Shuai; Fernando, Heshan Devaka; Liu, Miao; Murugesan, Keerthiram; Lu, Songtao; Chen, Pin Yu; Chen, Tianyi; and Wang, Meng, "SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning" (2024). Faculty Publications. 880.
https://digitalcommons.njit.edu/fac_pubs/880