TL-GDBN: Growing Deep Belief Network with Transfer Learning

Document Type

Article

Publication Date

4-1-2019

Abstract

A deep belief network (DBN) is effective at building a powerful generative model from training data. However, it is difficult to quickly determine its optimal structure for a specific application. In this paper, a growing DBN with transfer learning (TL-GDBN) is proposed that automatically determines its structure size, which can accelerate its learning process and improve model accuracy. First, a basic DBN structure with a single hidden layer is initialized and pretrained, and the learned weight parameters are frozen. Second, TL-GDBN uses TL to transfer the knowledge from the learned weight parameters to newly added neurons and hidden layers, growing the structure until the stopping criterion for pretraining is satisfied. Third, the weight parameters derived from pretraining of TL-GDBN are further fine-tuned by using layer-by-layer partial least squares regression from top to bottom, which can avoid many problems of traditional backpropagation-based fine-tuning. Moreover, a convergence analysis of TL-GDBN is presented. Finally, TL-GDBN is tested on two benchmark data sets and a practical wastewater treatment system. The simulation results show that it has better modeling performance, faster learning speed, and a more robust structure than existing models.

Note to Practitioners - Transfer learning (TL) aims to improve training effectiveness by transferring knowledge from a source domain to a target domain. This paper presents a growing deep belief network (DBN) with TL to improve the training effectiveness and determine the optimal model size. Facing a complex process and real-world workflow, a DBN tends to require a long time to train successfully. The proposed growing DBN with TL (TL-GDBN) accelerates the learning process by instantaneously transferring the knowledge from a source domain to each new deeper or wider substructure. The experimental results show that the proposed TL-GDBN model has great potential for dealing with complex systems, especially systems with high nonlinearity. As a result, it can be readily applied to some industrial nonlinear systems.
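
The growth step mentioned in the abstract (keeping learned weights and handing them to a wider layer) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names `grow_hidden_layer` and `grow_until`, the small random initialization of new neurons, and the caller-supplied `pretrain` and `error_fn` callbacks are all assumptions made for illustration. Weights are stored as plain lists, one row per visible unit and one column per hidden unit.

```python
import random


def grow_hidden_layer(weights, hidden_bias, n_new, rng=None, scale=0.01):
    """Widen one layer by n_new hidden units (hypothetical helper).

    Learned columns are copied unchanged, which stands in for the
    "transferred" knowledge. New columns get small random values; this
    initialization is an assumption, not the rule used in the paper.
    """
    rng = rng or random.Random(0)
    grown_w = [row[:] + [rng.gauss(0.0, scale) for _ in range(n_new)]
               for row in weights]
    grown_b = hidden_bias[:] + [0.0] * n_new
    return grown_w, grown_b


def grow_until(weights, hidden_bias, pretrain, error_fn, tol,
               step=2, max_hidden=16):
    """Alternate pretraining and widening until error_fn <= tol.

    pretrain(weights, bias) -> (weights, bias) and
    error_fn(weights, bias) -> float are supplied by the caller; both
    are placeholders for whatever training routine and stopping
    criterion are actually used.
    """
    rng = random.Random(0)
    weights, hidden_bias = pretrain(weights, hidden_bias)
    while (error_fn(weights, hidden_bias) > tol
           and len(hidden_bias) < max_hidden):
        weights, hidden_bias = grow_hidden_layer(
            weights, hidden_bias, step, rng=rng)
        weights, hidden_bias = pretrain(weights, hidden_bias)
    return weights, hidden_bias


if __name__ == "__main__":
    w = [[0.5, -0.2], [0.1, 0.3], [0.0, 0.7]]  # 3 visible x 2 hidden
    b = [0.1, -0.1]
    grown_w, grown_b = grow_hidden_layer(w, b, n_new=2)
    # The first two columns of every row are the original weights.
    print([row[:2] for row in grown_w] == w)
    print(len(grown_b))
```

The same idea would extend to adding a new hidden layer on top of a frozen one, but how TL-GDBN initializes and trains the new substructure is specified in the full paper, not in this record.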

Identifier

85054471522 (Scopus)

Publication Title

IEEE Transactions on Automation Science and Engineering

External Full Text Location

https://doi.org/10.1109/TASE.2018.2865663

ISSN

15455955

First Page

874

Last Page

885

Issue

2

Volume

16

Grant

61533002

Fund Ref

National Natural Science Foundation of China
