NUS: Noisy-Sample-Removed Undersampling Scheme for Imbalanced Classification and Application to Credit Card Fraud Detection
Document Type
Article
Publication Date
4-1-2024
Abstract
Many industrial applications, such as credit card fraud detection (CCFD) and defective part identification, call for imbalanced classification because minority samples are substantially less common than majority samples. The performance of a classifier tends to suffer from noisy samples in the majority or minority class. This work proposes a new clustering-based noisy-sample-removed undersampling scheme (NUS) for imbalanced classification. First, the majority class samples are clustered. Each cluster's center is taken as the center of a hypersphere whose radius is the distance from that center to the majority class sample farthest from it. We compute the Euclidean distance between each cluster center and each minority sample to determine whether the sample lies inside the hypersphere. Afterward, we exclude noisy samples from the minority class. The noisy samples of the majority class are removed by using the same procedure. Second, we propose NUS, which combines noisy sample removal with undersampling techniques. Finally, to demonstrate the effectiveness of NUS, we integrate it with the basic classifiers random forest (RF), decision tree (DT), and logistic regression (LR), and compare the results with seven undersampling, oversampling, and noisy-sample-removed methods. Experiments are performed on 13 public and three real transaction datasets related to e-commerce. The results show that NUS plays a positive role in promoting existing classifiers' performance.
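The hypersphere test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: the majority samples have already been clustered by some algorithm (the `labels` argument), each cluster center is the mean of its members, a minority sample lying inside or on any majority hypersphere is the one flagged as noisy, and the function names `cluster_hyperspheres` and `split_minority` are hypothetical.

```python
import math


def cluster_hyperspheres(majority, labels):
    """Build one hypersphere per majority cluster.

    Assumption: the center is the mean of the cluster's members; the radius
    is the Euclidean distance from that center to its farthest member.
    Returns {label: (center, radius)}.
    """
    groups = {}
    for point, label in zip(majority, labels):
        groups.setdefault(label, []).append(point)

    spheres = {}
    for label, points in groups.items():
        dim = len(points[0])
        center = tuple(sum(p[d] for p in points) / len(points) for d in range(dim))
        radius = max(math.dist(center, p) for p in points)
        spheres[label] = (center, radius)
    return spheres


def split_minority(minority, spheres):
    """Split minority samples into (kept, flagged).

    Assumption: a sample inside or on the boundary of any majority
    hypersphere is flagged as noisy.
    """
    kept, flagged = [], []
    for sample in minority:
        inside = any(
            math.dist(center, sample) <= radius
            for center, radius in spheres.values()
        )
        (flagged if inside else kept).append(sample)
    return kept, flagged


# Toy data: two majority clusters with precomputed (hypothetical) labels.
majority = [(0, 0), (2, 0), (0, 2), (2, 2), (10, 10), (12, 10)]
labels = [0, 0, 0, 0, 1, 1]
minority = [(1, 1.5), (5, 5), (11, 10.5)]

spheres = cluster_hyperspheres(majority, labels)
kept, flagged = split_minority(minority, spheres)
print(kept)     # [(5, 5)]
print(flagged)  # [(1, 1.5), (11, 10.5)]
```

According to the abstract, the same procedure is applied with the class roles swapped to remove noisy majority samples, after which the cleaned data would go through an undersampling step before training a classifier such as RF, DT, or LR.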
Identifier
85149820382 (Scopus)
Publication Title
IEEE Transactions on Computational Social Systems
External Full Text Location
https://doi.org/10.1109/TCSS.2023.3243925
e-ISSN
2329924X
First Page
1793
Last Page
1804
Issue
2
Volume
11
Recommended Citation
Zhu, Honghao; Zhou, Meng Chu; Liu, Guanjun; Xie, Yu; Liu, Shijun; and Guo, Cheng, "NUS: Noisy-Sample-Removed Undersampling Scheme for Imbalanced Classification and Application to Credit Card Fraud Detection" (2024). Faculty Publications. 548.
https://digitalcommons.njit.edu/fac_pubs/548