Predicting lung cancer incidence from air pollution exposures using shapelet-based time series analysis
Document Type
Conference Proceeding
Publication Date
4-18-2016
Abstract
In this paper we investigated whether the geographical variation of lung cancer incidence can be predicted by examining the spatiotemporal trend of particulate matter air pollution levels. Regional trends of air pollution levels were analyzed with a novel shapelet-based time series analysis technique. First, we identified U.S. counties with reportedly high and low lung cancer incidence between 2008 and 2012 via the State Cancer Profiles provided by the National Cancer Institute. Then, we collected particulate matter exposure levels (PM2.5 and PM10) for those counties over the previous decade (1998-2007) via the AirData dataset provided by the Environmental Protection Agency. Using shapelet-based time series pattern mining, regional environmental exposure profiles were examined to identify frequently occurring sequential exposure patterns. Finally, a binary classifier was designed to predict whether a U.S. region is expected to experience high lung cancer incidence based on the region's PM2.5 and PM10 exposure during the preceding decade. The study confirmed the association between prolonged PM exposure and lung cancer risk. In addition, the study findings suggest that not only cumulative exposure levels but also the temporal variability of PM exposure influence lung cancer risk.
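For readers unfamiliar with shapelets, the following is a minimal sketch of the general idea, not the authors' implementation. The abstract does not specify the distance measure, how shapelets were discovered, or the classifier design, so the Euclidean sliding-window distance, the threshold rule, the function names, and all exposure values here are assumptions made purely for illustration.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    same-length window of the series (assumed distance measure)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        d = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, d)
    return best


def predict_high_incidence(series, shapelet, threshold):
    """Hypothetical decision rule: flag a region when its exposure
    series contains a sufficiently close match to the shapelet."""
    return shapelet_distance(series, shapelet) <= threshold


# Made-up annual mean PM2.5 values (micrograms per cubic meter) for
# ten years; these are illustrative numbers, not data from the study.
region_a = [14.0, 15.5, 17.0, 16.0, 14.5, 13.0, 12.5, 12.0, 11.5, 11.0]
region_b = [9.0, 9.2, 9.1, 8.9, 9.0, 8.8, 8.7, 8.9, 8.6, 8.5]

# Hypothetical shapelet: a short rise-and-fall in exposure.
spike = [15.5, 17.0, 16.0]

print(predict_high_incidence(region_a, spike, threshold=2.0))
print(predict_high_incidence(region_b, spike, threshold=2.0))
```

In this toy setup the first region contains the spike pattern and is flagged, while the second, with flat and low exposure, is not. A practical pipeline would learn the shapelets and the threshold from labeled county data rather than fixing them by hand.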
Identifier
84968610584 (Scopus)
ISBN
9781509024551
Publication Title
3rd IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2016
External Full Text Location
https://doi.org/10.1109/BHI.2016.7455960
First Page
565
Last Page
568
Recommended Citation
Yoon, Hong Jun; Xu, Songhua; and Tourassi, Georgia, "Predicting lung cancer incidence from air pollution exposures using shapelet-based time series analysis" (2016). Faculty Publications. 10578.
https://digitalcommons.njit.edu/fac_pubs/10578
