Faculty Publications

A Spammer Identification Method for Class Imbalanced Weibo Datasets

Wenbing Tang, Zhejiang Sci-Tech University
Zuohua Ding, Zhejiang Sci-Tech University
Mengchu Zhou, Newark College of Engineering

Document Type

Article

Publication Date

1-1-2019

Abstract

Weibo has become a major information-sharing platform in China, and spammer identification remains a significant challenge for it. To mitigate the damage caused by spammers, machine learning classification algorithms have been applied to distinguish spammers from non-spammers. However, most previous studies overlook the class imbalance problem of real-world data. In this paper, by analyzing the characteristics of spammers on Weibo, we select microblog content similarity, the average number of links, and 12 other features to construct a comprehensive feature vector that has not been used before. To address the imbalance problem in spammer identification, an ensemble learning method combines multiple base classifiers to improve learning performance. During the training stage of the base learners, fuzzy-logic-based oversampling and a cost-sensitive support vector machine are used to tackle imbalanced data at both the data and algorithmic levels. Experimental results on the real-world Weibo datasets we collected demonstrate that, compared with existing state-of-the-art methods, the proposed approach increases the recall rate by 6.5% and reaches a precision of 87.53%.
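
The class imbalance issue raised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the toy labels, the function names `balanced_class_weights` and `precision_recall`, and the inverse-frequency weighting heuristic are all assumptions chosen for illustration. The sketch shows why accuracy is misleading on imbalanced data and how per-class misclassification costs of the kind a cost-sensitive classifier might use can be derived from class frequencies.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    An assumed, commonly used heuristic; rarer classes get larger weights.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive (here: spammer) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy labels: 1 = spammer (minority), 0 = non-spammer.
y_true = [1] * 10 + [0] * 90

# A trivial classifier that always predicts "non-spammer".
y_all_negative = [0] * 100

# High accuracy, yet no spammer is ever found.
accuracy = sum(t == p for t, p in zip(y_true, y_all_negative)) / len(y_true)
precision, recall = precision_recall(y_true, y_all_negative)

# Errors on the minority class are weighted more heavily.
weights = balanced_class_weights(y_true)
```

On these toy labels the always-negative classifier scores high accuracy while its recall on spammers is zero, which is why the abstract reports recall and precision rather than accuracy alone.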

Identifier

85064705420 (Scopus)

Publication Title

IEEE Access

External Full Text Location

https://doi.org/10.1109/ACCESS.2019.2901756

e-ISSN

21693536

First Page

29193

Last Page

29201

Recommended Citation

Tang, Wenbing; Ding, Zuohua; and Zhou, Mengchu, "A Spammer Identification Method for Class Imbalanced Weibo Datasets" (2019). Faculty Publications. 7983.
https://digitalcommons.njit.edu/fac_pubs/7983

DOI

10.1109/ACCESS.2019.2901756
