Incorporation of Optimal Computing Budget Allocation for Ordinal Optimization into Learning Automata

Document Type

Article

Publication Date

4-1-2016

Abstract

A learning automaton (LA) is a powerful tool for reinforcement learning. Its action probability vector plays two roles: 1) deciding when the LA converges, which determines the total computing budget it uses, and 2) allocating that budget among actions to identify the optimal one. These two intertwined roles create a problem: most of the computing budget goes to the currently estimated optimal action because of its high action probability, regardless of whether such an allocation helps identify the true optimal one. This work proposes a new class of LA that does not use the action probability vector for computing budget allocation. Instead, the vector is used only to determine whether the LA has converged, and optimal computing budget allocation (OCBA) is employed to allocate the budget in a way that maximizes the probability of identifying the true optimal actions. ϵ-optimality is proven. Simulations verify the advantages of the proposed LA over existing algorithms.
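
To illustrate the kind of allocation the abstract refers to, the following is a minimal sketch of the classical OCBA allocation rule from the ordinal optimization literature. It is not the algorithm proposed in the article, whose details are not given here. The function name `ocba_fractions`, the assumption that a larger mean reward is better, and the small `eps` guard against ties are all illustrative assumptions.

```python
import math


def ocba_fractions(means, stds, eps=1e-12):
    """Return the fraction of the budget each action gets under the
    classical OCBA rule (assumption: larger mean reward is better,
    and every standard deviation is positive)."""
    k = len(means)
    # Index of the currently estimated best action.
    b = max(range(k), key=lambda i: means[i])

    weights = [0.0] * k
    for i in range(k):
        if i != b:
            # Gap between the estimated best action and action i;
            # clamped by eps (an illustrative guard) to avoid division by zero.
            delta = max(means[b] - means[i], eps)
            # Non-best actions: weight proportional to (sigma_i / delta_i)^2.
            weights[i] = (stds[i] / delta) ** 2

    # Best action: N_b = sigma_b * sqrt(sum over i != b of (N_i / sigma_i)^2).
    weights[b] = stds[b] * math.sqrt(
        sum((weights[i] / stds[i]) ** 2 for i in range(k) if i != b)
    )

    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Three hypothetical actions with equal noise levels.
    print(ocba_fractions([1.0, 0.5, 0.0], [1.0, 1.0, 1.0]))
```

In this sketch, an action whose estimated mean is close to the best one receives more budget than an action that is clearly worse, which is the behavior that allocation driven purely by action probability lacks.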

Identifier

84938505448 (Scopus)

Publication Title

IEEE Transactions on Automation Science and Engineering

External Full Text Location

https://doi.org/10.1109/TASE.2015.2450535

ISSN

1545-5955

First Page

1008

Last Page

1017

Issue

2

Volume

13

Grant

CMMI-1162482

Fund Ref

National Science Foundation

