Decision tree rule-based feature selection for large-scale imbalanced data

Document Type

Conference Proceeding

Publication Date

5-15-2017

Abstract

Class imbalance arises in many real-world applications, such as fault diagnosis, text categorization, and fraud detection. For a large-scale imbalanced dataset, feature selection becomes a major challenge. To address it, this work proposes a feature selection approach based on a decision tree rule. The effectiveness of the proposed approach is verified by classifying a large-scale dataset from Santander Bank. The results show that our approach can achieve a higher Area Under the Curve (AUC) with less computational time. We also compare it with filter-based feature selection approaches, namely Chi-Square and F-statistic. The results show that it outperforms them but needs slightly more computational effort.
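
The abstract does not spell out the decision tree rule itself, so the following is only a minimal sketch of one plausible reading: score each feature by the best Gini impurity reduction a single threshold split (a one-node decision tree) can achieve, keep the top-scoring features, and evaluate with AUC. The function names (`best_split_gain`, `rank_features`, `auc`), the use of Gini impurity, and the toy data are all assumptions made for illustration; they are not taken from the paper and do not reproduce its Santander Bank experiments.

```python
from typing import List, Sequence


def best_split_gain(values: Sequence[float], labels: Sequence[int]) -> float:
    """Largest Gini impurity reduction from one threshold split on a feature.

    Assumption: binary labels (0 = majority class, 1 = minority class).
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_pos = sum(label for _, label in pairs)
    p_parent = total_pos / n
    parent = 2.0 * p_parent * (1.0 - p_parent)

    best = 0.0
    left_pos = 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between two equal values
        left_n, right_n = i, n - i
        p_left = left_pos / left_n
        p_right = (total_pos - left_pos) / right_n
        # Size-weighted Gini impurity of the two child nodes.
        child = (left_n * 2.0 * p_left * (1.0 - p_left)
                 + right_n * 2.0 * p_right * (1.0 - p_right)) / n
        best = max(best, parent - child)
    return best


def rank_features(X: Sequence[Sequence[float]], y: Sequence[int], k: int) -> List[int]:
    """Indices of the k features with the highest single-split gain."""
    gains = [best_split_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    order = sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)
    return order[:k]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced toy data: 8 negatives, 2 positives.
# Feature 0 separates the classes, feature 1 is noise, feature 2 is constant.
X = [
    [0.1, 5.0, 1.0], [0.2, 1.0, 1.0], [0.3, 4.0, 1.0], [0.4, 2.0, 1.0],
    [0.5, 3.0, 1.0], [0.6, 5.5, 1.0], [0.7, 0.5, 1.0], [0.8, 2.5, 1.0],
    [2.0, 3.5, 1.0], [2.1, 1.5, 1.0],
]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

selected = rank_features(X, y, k=2)
print("selected feature indices:", selected)
print("AUC using feature 0 as the score:", auc([row[0] for row in X], y))
```

AUC is used here rather than accuracy because, on imbalanced data, a classifier that always predicts the majority class can still look accurate. A filter method such as Chi-Square or F-statistic would replace `best_split_gain` with a different per-feature score while leaving the ranking step unchanged.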

Identifier

85021442647 (Scopus)

ISBN

9781509049097

Publication Title

2017 26th Wireless and Optical Communication Conference, WOCC 2017

External Full Text Location

https://doi.org/10.1109/WOCC.2017.7928973
