Improved online sequential extreme learning machine: A new intelligent evaluation method for AZ-style algorithms

Document Type

Article

Publication Date

1-1-2019

Abstract

Research on computer games for Go, chess, and Japanese chess stands out as one of the notable landmarks in the progress of artificial intelligence. The AlphaGo, AlphaGo Zero, and AlphaZero algorithms, which some literature calls AlphaZero-style (AZ-style) algorithms [1], have achieved superhuman performance by using deep reinforcement learning (DRL). However, AZ-style algorithms have three drawbacks in practical applications: their training details are unavailable, model training requires expensive equipment, and without such equipment self-play training is slow, which results in low evaluation accuracy. To address these problems to a certain extent, the paper proposes an improved online sequential extreme learning machine (IOS-ELM), a new evaluation method, to evaluate board positions for AZ-style algorithms. First, the theoretical principles of IOS-ELM are given. Second, the study takes Gomoku as the application object and uses IOS-ELM as the evaluation method for board positions in an AZ-style algorithm, discussing in detail the loss during training and the hyperparameters that affect performance. Under the same experimental conditions, the proposed method reduces the number of training parameters by a factor of 14, cuts training time to 15%, and lowers evaluation error by 13% compared with the board evaluation network used in the original AZ-style algorithms.
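
For readers unfamiliar with the baseline that IOS-ELM builds on, the following is a minimal sketch of a standard online sequential extreme learning machine, not the paper's improved variant, whose details are not given in this abstract. It assumes one-sample updates, a tanh hidden layer, and a ridge-regularized initialization so that no matrix inversion is needed; the class name, parameter names, and default values are illustrative assumptions only.

```python
import math
import random


class OnlineSequentialELM:
    """Standard OS-ELM with one-sample updates (NOT the paper's IOS-ELM)."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Hidden-layer weights and biases are drawn once and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        # P tracks (H^T H + ridge * I)^-1; it starts at I / ridge (assumption).
        self.p = [[(1.0 / ridge if i == j else 0.0) for j in range(n_hidden)]
                  for i in range(n_hidden)]
        # Output weights, the only trained parameters.
        self.beta = [0.0] * n_hidden

    def hidden(self, x):
        """Map an input vector to hidden-layer activations."""
        return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + bi)
                for row, bi in zip(self.w, self.b)]

    def predict(self, x):
        """Scalar evaluation of one input vector."""
        return sum(hj * bj for hj, bj in zip(self.hidden(x), self.beta))

    def partial_fit(self, x, target):
        """Recursive least-squares update from a single (x, target) pair."""
        h = self.hidden(x)
        ph = [sum(pij * hj for pij, hj in zip(row, h)) for row in self.p]
        denom = 1.0 + sum(hi * phi for hi, phi in zip(h, ph))
        n = len(h)
        # Sherman-Morrison: P <- P - (P h)(P h)^T / (1 + h^T P h).
        for i in range(n):
            for j in range(n):
                self.p[i][j] -= ph[i] * ph[j] / denom
        # beta <- beta + P_new h * error, where P_new h equals (P h) / denom.
        error = target - sum(hj * bj for hj, bj in zip(h, self.beta))
        for i in range(n):
            self.beta[i] += ph[i] / denom * error
```

In a board-evaluation setting, `x` could be a flattened encoding of a position and `target` the game outcome from the current player's view, with `partial_fit` called once per self-play sample. That usage is a hypothetical illustration, not the setup reported in the paper.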

Identifier

85078027385 (Scopus)

Publication Title

IEEE Access

External Full Text Location

https://doi.org/10.1109/ACCESS.2019.2938568

e-ISSN

21693536

First Page

124891

Last Page

124901

Volume

7

Grant

61602539

Fund Ref

National Natural Science Foundation of China

