Faculty Publications

On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration

Shuai Zhang, New Jersey Institute of Technology
Hongkang Li, Rensselaer Polytechnic Institute
Meng Wang, Rensselaer Polytechnic Institute
Miao Liu, IBM Research
Pin Yu Chen, IBM Research
Songtao Lu, IBM Research
Sijia Liu, Michigan State University
Keerthiram Murugesan, IBM Research
Subhajit Chaudhury, IBM Research

Document Type

Conference Proceeding

Publication Date

1-1-2023

Abstract

This paper provides a theoretical understanding of the Deep Q-Network (DQN) with ε-greedy exploration in deep reinforcement learning. Despite the tremendous empirical success of the DQN, its theoretical characterization remains underexplored. First, the exploration strategy is either impractical or ignored in existing analyses. Second, in contrast to conventional Q-learning algorithms, the DQN employs a target network and experience replay to obtain an unbiased estimate of the mean-square Bellman error (MSBE) used in training the Q-network. However, existing theoretical analyses of DQNs either lack a convergence analysis or bypass the technical challenges by deploying a significantly overparameterized neural network, which is not computationally efficient. This paper provides the first theoretical convergence and sample complexity analysis of the practical setting of DQNs with an ε-greedy policy. We prove that an iterative procedure with decaying ε converges geometrically to the optimal Q-value function. Moreover, higher values of ε enlarge the region of convergence but slow down convergence, while the opposite holds for lower values of ε. Experiments support these theoretical insights on DQNs.
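
For readers unfamiliar with the exploration rule named in the abstract, the following is a minimal sketch of ε-greedy action selection with a decaying ε. It is an illustration only, not the procedure analyzed in the paper: the function names, the geometric decay form, and the default values (`eps_start`, `eps_end`, `decay`) are all assumptions, since this record does not specify the schedule.

```python
import random


def epsilon_schedule(step, eps_start=1.0, eps_end=0.05, decay=0.99):
    """Hypothetical schedule: epsilon shrinks geometrically with the step
    count and is floored at eps_end. The paper's actual schedule may differ."""
    return max(eps_end, eps_start * decay ** step)


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action (explore);
    otherwise pick the action with the largest estimated Q-value (exploit)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    q = [0.1, 0.9, 0.3]  # made-up Q-value estimates for three actions
    for step in (0, 50, 500):
        eps = epsilon_schedule(step)
        print(step, round(eps, 4), epsilon_greedy(q, eps))
```

In this sketch a larger ε means more uniformly random actions, which loosely mirrors the trade-off the abstract describes between the region of convergence and the convergence rate; the sketch itself demonstrates only the selection rule, not that result.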

Identifier

85187471122 (Scopus)

ISBN

9781713899921

Publication Title

Advances in Neural Information Processing Systems

ISSN

1049-5258

Volume

Grant

FA9550-20-1-0122

Fund Ref

Air Force Office of Scientific Research

Recommended Citation

Zhang, Shuai; Li, Hongkang; Wang, Meng; Liu, Miao; Chen, Pin Yu; Lu, Songtao; Liu, Sijia; Murugesan, Keerthiram; and Chaudhury, Subhajit, "On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration" (2023). Faculty Publications. 2262.
https://digitalcommons.njit.edu/fac_pubs/2262

