Drifted Twitter Spam Classification Using Multiscale Detection Test on K-L Divergence
Document Type
Article
Publication Date
1-1-2019
Abstract
Twitter spam classification is a tough challenge for social media platforms and cyber security companies. Twitter spam with illegal links may evolve over time in order to deceive filtering models, causing serious losses to both users and the whole network. We define this distributional evolution as a concept drift scenario. To build an effective model, we adopt K-L divergence to represent the spam distribution and use a multiscale drift detection test (MDDT) to localize possible drifts therein. A base classifier is then retrained based on the detection result to improve performance. Comprehensive experiments show that K-L divergence exhibits highly consistent change patterns across features when a drift occurs. The MDDT is also shown to be effective in improving the final classification results in terms of accuracy, recall, and F-measure.
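As a rough illustration of the quantity the abstract relies on, the sketch below computes the K-L divergence between the histogram of one feature in a reference window and in a later window. It is not the authors' implementation and does not reproduce the MDDT: the function names `kl_divergence` and `window_divergence`, the bin count, and the smoothing constant `eps` are all assumptions made for this example.

```python
import numpy as np

def kl_divergence(p_counts, q_counts, eps=1e-9):
    """K-L divergence D(P || Q) between two histograms given as counts.

    eps is an assumed smoothing constant that avoids division by zero
    and log(0) for empty bins.
    """
    p = np.asarray(p_counts, dtype=float) + eps
    q = np.asarray(q_counts, dtype=float) + eps
    p /= p.sum()  # normalize counts into probabilities
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def window_divergence(reference, current, bins=10):
    """Divergence of one feature between a reference and a current window.

    Both windows are binned on shared edges so the histograms are comparable.
    """
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    lo = min(reference.min(), current.min())
    hi = max(reference.max(), current.max())
    edges = np.linspace(lo, hi, bins + 1)
    ref_hist, _ = np.histogram(reference, bins=edges)
    cur_hist, _ = np.histogram(current, bins=edges)
    return kl_divergence(cur_hist, ref_hist)
```

Under these assumptions, a window drawn from the same distribution as the reference should yield a value near zero, while a window whose feature distribution has shifted should yield a noticeably larger value. A drift detection test would then operate on a sequence of such values.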
Identifier
85071155136 (Scopus)
Publication Title
IEEE Access
External Full Text Location
https://doi.org/10.1109/ACCESS.2019.2932018
e-ISSN
21693536
First Page
108384
Last Page
108394
Volume
7
Grant
51775385
Fund Ref
National Natural Science Foundation of China
Recommended Citation
Wang, Xuesong; Kang, Qi; An, Jing; and Zhou, Mengchu, "Drifted Twitter Spam Classification Using Multiscale Detection Test on K-L Divergence" (2019). Faculty Publications. 8038.
https://digitalcommons.njit.edu/fac_pubs/8038
