On Machine Learning-based Stage-aware Performance Prediction of Spark Applications
Document Type
Conference Proceeding
Publication Date
11-6-2020
Abstract
The data volume of large-scale applications in science, engineering, and business has grown explosively over the past decade and now far exceeds the computing capability and storage capacity of any single server. Such data is therefore often stored in distributed file systems and processed by parallel computing engines such as Spark, which has gained popularity over the traditional MapReduce framework due to its fast in-memory processing of streaming data. Spark engines are generally deployed in cloud environments such as Amazon EC2 and Alibaba Cloud. However, storage and computing resources in these environments are typically provisioned on a pay-as-you-go basis, so an accurate estimate of the execution time of Spark workloads is critical to fully utilizing cloud resources and meeting the performance requirements of end users. Our insight is that the execution patterns of many Spark workloads are qualitatively similar, which makes it possible to leverage historical performance data to predict the execution time of a given Spark application. We use execution information extracted from the Spark History Server as training data and develop a stage-aware hierarchical neural network model for performance prediction. Experimental results show that the proposed hierarchical model achieves higher end-to-end accuracy than a holistic prediction model and also outperforms other existing regression-based prediction methods.
Identifier
85104397132 (Scopus)
ISBN
9781728198293
Publication Title
2020 IEEE 39th International Performance Computing and Communications Conference, IPCCC 2020
External Full Text Location
https://doi.org/10.1109/IPCCC50635.2020.9391564
Grant
CNS-1828123
Fund Ref
National Science Foundation
Recommended Citation
Ye, Guangjun; Liu, Wuji; Wu, Chase Q.; Shen, Wei; and Lyu, Xukang, "On Machine Learning-based Stage-aware Performance Prediction of Spark Applications" (2020). Faculty Publications. 4847.
https://digitalcommons.njit.edu/fac_pubs/4847
