Decision tree rule-based feature selection for large-scale imbalanced data
Document Type
Conference Proceeding
Publication Date
5-15-2017
Abstract
Class imbalance appears in many real-world applications, e.g., fault diagnosis, text categorization, and fraud detection. When dealing with a large-scale imbalanced dataset, feature selection becomes a great challenge. To address it, this work proposes a feature selection approach based on a decision tree rule. The effectiveness of the proposed approach is verified by classifying a large-scale dataset from Santander Bank. The results show that our approach achieves a higher Area Under the Curve (AUC) and requires less computational time. We also compare it with filter-based feature selection approaches, i.e., Chi-Square and F-statistic. The results show that it outperforms them but needs slightly more computational effort.
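The abstract does not spell out the decision tree rule itself, so the following is only a minimal sketch of one plausible reading, not the authors' method: each feature is scored by the largest Gini impurity decrease that a single decision-tree split on it can achieve, and the top k features are kept. The function names (gini, best_split_gain, select_features) and the toy data are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def gini(pos: int, total: int) -> float:
    """Gini impurity of a binary node holding `pos` positives out of `total`."""
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def best_split_gain(values: Sequence[float], labels: Sequence[int]) -> float:
    """Largest Gini impurity decrease over all thresholds on one feature."""
    n = len(labels)
    total_pos = sum(labels)
    parent = gini(total_pos, n)
    order = sorted(range(n), key=lambda i: values[i])
    best = 0.0
    left_pos = 0
    # `rank` is the number of samples sent to the left child.
    for rank, i in enumerate(order[:-1], start=1):
        left_pos += labels[i]
        # A threshold can only sit between two distinct feature values.
        if values[i] == values[order[rank]]:
            continue
        right_n = n - rank
        child = (rank * gini(left_pos, rank)
                 + right_n * gini(total_pos - left_pos, right_n)) / n
        best = max(best, parent - child)
    return best


def select_features(X: Sequence[Sequence[float]], y: Sequence[int], k: int) -> List[int]:
    """Indices of the k features with the highest single-split Gini gain."""
    n_features = len(X[0])
    scored = [(best_split_gain([row[j] for row in X], y), j)
              for j in range(n_features)]
    # Sort by gain (descending); break ties by feature index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scored[:k]]


# Hypothetical toy data: 10 samples, 2 positives (imbalanced), 3 features.
# Feature 0 separates the classes; feature 1 is constant; feature 2 is noise.
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
X = [[0.1, 1.0, 1.0], [0.2, 1.0, 2.0], [0.3, 1.0, 1.0], [0.4, 1.0, 2.0],
     [0.5, 1.0, 1.0], [0.6, 1.0, 2.0], [0.7, 1.0, 1.0], [0.8, 1.0, 2.0],
     [5.0, 1.0, 1.0], [5.1, 1.0, 2.0]]

print(select_features(X, y, k=1))
```

On this toy data only feature 0 yields a useful split, so it is the one retained. A filter such as Chi-Square or F-statistic would score each feature from a summary statistic instead of searching over split thresholds, which is consistent with the abstract's remark that the tree-based approach costs slightly more computation.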
Identifier
85021442647 (Scopus)
ISBN
9781509049097
Publication Title
2017 26th Wireless and Optical Communication Conference (WOCC 2017)
External Full Text Location
https://doi.org/10.1109/WOCC.2017.7928973
Recommended Citation
Liu, Haoyue and Zhou, Mengchu, "Decision tree rule-based feature selection for large-scale imbalanced data" (2017). Faculty Publications. 9573.
https://digitalcommons.njit.edu/fac_pubs/9573
