BO-SMOTE: A Novel Bayesian-Optimization-Based Synthetic Minority Oversampling Technique
Document Type
Article
Publication Date
4-1-2024
Abstract
An oversampling technique balances a dataset by increasing the number of minority samples. It is a common and effective method in imbalanced learning. However, most oversampling methods generate minority samples with some degree of randomness, which can negatively affect the prediction performance of subsequent classifiers. This study treats the prediction made by classifiers as a black-box optimization problem. The optimization objective is to improve the classification accuracy of subsequent classifiers for minority samples. A solution of this optimization problem can be regarded as a minority sample that can be added to the imbalanced dataset. The minority samples are iteratively generated by Bayesian optimization (BO). We determine two valuable intervals for each 1-D continuous feature. One is the interval with the densest minority samples. The other is the interval with the sparsest majority samples distributed among the minority samples. By adjusting the proportion of samples generated in the two areas, the presented algorithm can be flexibly applied to different datasets. To reduce the noise that may be caused by the exploration phase of BO, a sample selection procedure is carried out to eliminate the samples that are worse than those generated at the previous iteration. The samples generated in this way are based on the principle of improving the performance of the classifier, thus avoiding the negative effects of randomness. Experimental results on twenty open imbalanced datasets show that the proposed method obtains better results than existing state-of-the-art oversampling models, thus advancing the important field of imbalanced learning.
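The abstract does not say how the two intervals are computed, so the following is only an illustrative sketch of the idea for a single 1-D feature, not the authors' procedure. It assumes a fixed-width window anchored at each minority value. The helper names (`count_in`, `candidate_intervals`, `split_budget`), the `width` parameter, and the toy data are all hypothetical. The Bayesian optimization step and the sample selection procedure are not shown.

```python
from bisect import bisect_left, bisect_right


def count_in(sorted_vals, lo, hi):
    """Count values v in a sorted list with lo <= v <= hi."""
    return bisect_right(sorted_vals, hi) - bisect_left(sorted_vals, lo)


def candidate_intervals(minority, majority, width):
    """Return (densest, sparsest) intervals for one feature.

    Assumption: candidate windows are [v, v + width] for each minority value v.
    densest  = window holding the most minority samples.
    sparsest = window holding the fewest majority samples; every window
               contains at least one minority sample by construction, and
               ties are broken in favour of more minority samples.
    """
    mino = sorted(minority)
    majo = sorted(majority)
    windows = [(v, v + width) for v in mino]
    densest = max(windows, key=lambda w: count_in(mino, *w))
    sparsest = min(
        windows,
        key=lambda w: (count_in(majo, *w), -count_in(mino, *w)),
    )
    return densest, sparsest


def split_budget(n_new, dense_ratio):
    """Split n_new synthetic samples between the two intervals."""
    n_dense = round(n_new * dense_ratio)
    return n_dense, n_new - n_dense


# Toy feature values (hypothetical, integer-valued for readability).
minority = [10, 12, 15, 60, 90]
majority = [11, 13, 14, 30, 40, 50, 95]

densest, sparsest = candidate_intervals(minority, majority, width=10)
budget = split_budget(10, dense_ratio=0.7)
print(densest, sparsest, budget)
```

In this sketch, `dense_ratio` plays the role of the adjustable proportion mentioned in the abstract: a value near 1 concentrates new samples where minority samples are already dense, while a value near 0 favours regions with few majority samples.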
Identifier
85181565807 (Scopus)
Publication Title
IEEE Transactions on Systems, Man, and Cybernetics: Systems
External Full Text Location
https://doi.org/10.1109/TSMC.2023.3335241
e-ISSN
2168-2232
ISSN
2168-2216
First Page
2079
Last Page
2091
Issue
4
Volume
54
Recommended Citation
Yan, Shen; Zhao, Ziyan; Liu, Shixin; and Zhou, Mengchu, "BO-SMOTE: A Novel Bayesian-Optimization-Based Synthetic Minority Oversampling Technique" (2024). Faculty Publications. 535.
https://digitalcommons.njit.edu/fac_pubs/535