Accelerated structure-aware reinforcement learning for delay-sensitive energy harvesting wireless sensors

Document Type

Article

Publication Date

1-1-2020

Abstract

We consider a time-slotted energy-harvesting wireless sensor transmitting delay-sensitive data over a fading channel. The sensor injects captured data packets into its transmission queue and relies on ambient energy harvested from the environment to transmit them. We aim to find the optimal scheduling policy that decides how many packets to transmit in each time slot to minimize the expected queuing delay. No prior knowledge of the stochastic processes that govern the channel, captured data, and harvested energy dynamics is assumed, thereby necessitating online learning to optimize the scheduling policy. We formulate this problem as a Markov decision process (MDP) with a state space spanning the sensor's buffer, battery, and channel states, and show that its optimal value function is non-decreasing and has increasing differences in the buffer state, and that it is non-increasing and has increasing differences in the battery state. We exploit this knowledge of the value function's structure to formulate a novel accelerated reinforcement learning (RL) algorithm based on value function approximation that can solve the scheduling problem online with controlled approximation error, while incurring limited computational and memory complexity. We rigorously capture the trade-off between approximation accuracy and the computational and memory complexity savings associated with our approach. Our simulations demonstrate that the proposed algorithm closely approximates the optimal offline solution, which requires complete knowledge of the system state dynamics. At the same time, our approach achieves competitive performance relative to a state-of-the-art RL algorithm, at orders of magnitude lower complexity. Moreover, considerable performance gains are demonstrated over the widely used Q-learning RL technique.
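
The structural properties named in the abstract can be illustrated with a small numerical check. The sketch below is not taken from the article: it assumes that "increasing differences" along one state variable means the first differences of the value function are non-decreasing along that axis, and it uses a made-up toy table `toy_value[buffer, battery]` (the channel state is omitted). The helper names `is_monotone` and `has_increasing_differences` are hypothetical.

```python
import numpy as np


def is_monotone(values, axis, increasing=True):
    """Check that `values` is non-decreasing (or non-increasing) along `axis`."""
    first_diff = np.diff(values, axis=axis)
    if increasing:
        return bool(np.all(first_diff >= 0))
    return bool(np.all(first_diff <= 0))


def has_increasing_differences(values, axis):
    """Check that first differences along `axis` are themselves non-decreasing.

    Assumption: this is how "increasing differences" is read here; the
    article's formal definition may differ.
    """
    second_diff = np.diff(values, n=2, axis=axis)
    return bool(np.all(second_diff >= 0))


# Toy value table, purely illustrative (not the article's value function):
# the cost grows with buffer occupancy and shrinks with stored energy.
buffer_levels = np.arange(6)    # axis 0: packets in the queue
battery_levels = np.arange(5)   # axis 1: energy units in the battery
toy_value = buffer_levels[:, None] ** 2 + np.exp(-battery_levels[None, :])

buffer_ok = (is_monotone(toy_value, axis=0, increasing=True)
             and has_increasing_differences(toy_value, axis=0))
battery_ok = (is_monotone(toy_value, axis=1, increasing=False)
              and has_increasing_differences(toy_value, axis=1))
```

On this toy table both `buffer_ok` and `battery_ok` evaluate to `True`. A check of this kind could be applied to a learned value table to see whether it respects the shape the abstract describes, under the stated reading of the definitions.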

Identifier

85081140264 (Scopus)

Publication Title

IEEE Transactions on Signal Processing

External Full Text Location

https://doi.org/10.1109/TSP.2020.2973125

e-ISSN

1941-0476

ISSN

1053-587X

First Page

1409

Last Page

1424

Volume

68

Grant

1711335

Fund Ref

National Science Foundation
