A Length-Adaptive Non-Dominated Sorting Genetic Algorithm for Bi-Objective High-Dimensional Feature Selection

Document Type

Article

Publication Date

9-1-2023

Abstract

As a crucial data preprocessing method in data mining, feature selection (FS) can be regarded as a bi-objective optimization problem that aims to maximize classification accuracy and minimize the number of selected features. Evolutionary computing (EC) is promising for FS owing to its powerful search capability. However, traditional EC-based methods represent feature subsets with a fixed-length individual encoding, which is ineffective for high-dimensional data because it results in a huge search space and prohibitive training time. This work proposes a length-adaptive non-dominated sorting genetic algorithm (LA-NSGA) with a variable-length individual encoding and a length-adaptive evolution mechanism for bi-objective high-dimensional FS. In LA-NSGA, an initialization method based on correlation and redundancy is devised to initialize individuals of diverse lengths, and a Pareto dominance-based length change operator is introduced to adaptively guide individuals toward promising regions of the search space. Moreover, a dominance-based local search method is employed for further improvement. Experimental results on 12 high-dimensional gene datasets show that the Pareto front of feature subsets produced by LA-NSGA is superior to those of existing algorithms.
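The bi-objective view in the abstract can be illustrated with a minimal Python sketch. It is not the authors' LA-NSGA implementation: it only shows how Pareto dominance compares feature subsets when each subset is scored by classification error (to be minimized, equivalent to maximizing accuracy) and by the number of selected features. The function names, the list-of-indices encoding, and the error values are all assumptions made up for illustration.

```python
from typing import List, Tuple

# (classification error, number of selected features); both are minimized.
Objectives = Tuple[float, int]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates (input order is preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical variable-length individuals: each lists the selected feature indices.
population = [[3, 17, 42], [3, 17], [5, 8, 13, 21, 34], [42]]
# Made-up error rates for illustration only; a real run would train a classifier.
errors = [0.10, 0.12, 0.10, 0.30]

objectives = [(err, len(ind)) for err, ind in zip(errors, population)]
front = pareto_front(objectives)
print(front)
```

In this toy population the five-feature subset is dominated by the three-feature subset with the same error, so it drops out of the front, while the remaining three subsets represent different trade-offs between error and subset size.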

Identifier

85168799494 (Scopus)

Publication Title

IEEE/CAA Journal of Automatica Sinica

External Full Text Location

https://doi.org/10.1109/JAS.2023.123648

e-ISSN

2329-9274

ISSN

2329-9266

First Page

1834

Last Page

1844

Issue

9

Volume

10

Grant

62072060

Fund Ref

National Natural Science Foundation of China

