Document Type
Dissertation
Date of Award
Summer 8-31-2019
Degree Name
Doctor of Philosophy in Electrical Engineering - (Ph.D.)
Department
Electrical and Computer Engineering
First Advisor
MengChu Zhou
Second Advisor
Nirwan Ansari
Third Advisor
John D. Carpinelli
Fourth Advisor
Qing Gary Liu
Fifth Advisor
Qiong Shen
Abstract
Recently, researchers have sought a relationship between the evolution of rare events and the temporal-spatial patterns of social media activity. Their studies verify that the relationship exists in both the temporal and spatial domains. However, few of those studies can accurately deduce the time point at which social media activity is most strongly affected by a rare event, because producing an accurate temporal pattern of social media during the evolution of a rare event is very difficult. This work extends the current studies in three directions. First, we focus on the intensity of information volume and propose an innovative clustering algorithm-based data processing method that characterizes the evolution of a rare event by analyzing social media data. Second, we propose novel feature extraction and fuzzy logic-based classification methods to distinguish event-related messages from unrelated ones. Third, since many messages lack ground truth, we run four existing ground-truth inference algorithms to deduce it and compare their performance. We then propose an Adaptive Majority Voting (Adaptive MV) method and compare it with two of the existing algorithms on a set of manually labeled social media data. Our case studies focus on Hurricane Sandy in 2012 and Hurricane Maria in 2017, and Twitter data collected around them are used to verify the effectiveness of the proposed methods. The results support three findings. First, the proposed data processing method not only verifies that a rare event and social media activity are strongly correlated but also reveals a time difference between them; it is therefore useful for investigating the temporal pattern of social media activity. Second, the fuzzy logic-based feature extraction and classification methods are effective in identifying event-related and unrelated messages. Third, the Adaptive MV method deduces the ground truth well and performs better on datasets with noisy labels than the other two methods, Positive Label Frequency Threshold and Majority Voting.
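For readers unfamiliar with ground-truth inference, the two baseline ideas named in the abstract can be illustrated roughly as follows. This is a minimal sketch, not the dissertation's implementation: the function names, the tie-breaking rule, the fixed threshold of 0.3, and the toy votes are all assumptions made for illustration, and the threshold rule shown is a simplified fixed-threshold stand-in rather than the published Positive Label Frequency Threshold algorithm. The Adaptive MV method is not sketched because the abstract does not describe how it works.

```python
def majority_vote(labels):
    """Infer a binary label (1 = event-related) from several noisy labels.

    Assumption: ties are resolved toward the positive class.
    """
    if not labels:
        raise ValueError("at least one label is required")
    positives = sum(labels)
    return 1 if positives * 2 >= len(labels) else 0


def positive_frequency_threshold(labels, threshold):
    """Label a message positive when the share of positive votes reaches
    `threshold`. A hypothetical fixed threshold is used here for simplicity.
    """
    if not labels:
        raise ValueError("at least one label is required")
    return 1 if sum(labels) / len(labels) >= threshold else 0


# Toy data (invented): each message has labels from several annotators.
votes = {
    "msg_a": [1, 1, 0],
    "msg_b": [0, 0, 1],
    "msg_c": [1, 0, 0, 0, 1],
}

mv = {msg: majority_vote(v) for msg, v in votes.items()}
pft = {msg: positive_frequency_threshold(v, 0.3) for msg, v in votes.items()}

print(mv)
print(pft)
```

With a threshold below 0.5, the threshold rule accepts a message as event-related on fewer positive votes than majority voting requires, which is one way such rules can differ when labelers are biased toward the negative class.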
Recommended Citation
Lu, Xiaoyu, "Analyzing evolution of rare events through social media data" (2019). Dissertations. 1423.
https://digitalcommons.njit.edu/dissertations/1423
Included in
Chemical Engineering Commons, Computer Engineering Commons, Electrical and Electronics Commons