Date of Award
Doctor of Philosophy in Electrical Engineering - (Ph.D.)
Electrical and Computer Engineering
John D. Carpinelli
Recently, some researchers have attempted to find a relationship between the evolution of rare events and temporal-spatial patterns of social media activities. Their studies verify that the relationship exists in both time and spatial domains. However, few of those studies can accurately deduce a time point when social media activities are most highly affected by a rare event because producing an accurate temporal pattern of social media during the evolution of a rare event is very difficult. This work expands the current studies along three directions. Firstly, we focus on the intensity of information volume and propose an innovative clustering algorithm-based data processing method to characterize the evolution of a rare event by analyzing social media data. Secondly, novel feature extraction and fuzzy logic-based classification methods are proposed to distinguish and classify event-related and unrelated messages. Lastly, since many messages do not have ground truth, we execute four existing ground-truth inference algorithms to deduce the ground truth and compare their performances. Then, an Adaptive Majority Voting (Adaptive MV) method is proposed and compared with two of the existing algorithms based on a set containing manually-labeled social media data. Our case studies focus on Hurricane Sandy in 2012 and Hurricane Maria in 2017. Twitter data collected around them are used to verify the effectiveness of the proposed methods. Firstly, the results of the proposed data processing method not only verify that a rare event and social media activities have strong correlations, but also reveal that they have some time difference. Thus, it is conducive to investigate the temporal pattern of social media activities. Secondly, fuzzy logic-based feature extraction and classification methods are effective in identifying event-related and unrelated messages. Lastly, the Adaptive MV method deduces the ground truth well and performs better on datasets with noisy labels than other two methods, Positive Label Frequency Threshold and Majority Voting.
Lu, Xiaoyu, "Analyzing evolution of rare events through social media data" (2019). Dissertations. 1423.