Optimizing the performance of big data workflows in multi-cloud environments under budget constraint
Document Type
Conference Proceeding
Publication Date
8-31-2016
Abstract
Workflow techniques have been widely used as a major computing solution in many science domains. With the rapid deployment of cloud infrastructures around the globe and the economic benefit of cloud-based computing and storage services, an increasing number of scientific workflows have been shifted or are in active transition to clouds. As the scale of scientific applications continues to grow, it is now common to deploy data- and network-intensive computing workflows in multi-cloud environments, where inter-cloud data transfer oftentimes plays a significant role in both workflow performance and financial cost. We construct rigorous mathematical models to analyze the intra- and inter-cloud execution process of scientific workflows and formulate a budget-constrained workflow mapping problem to optimize the network performance of scientific workflows in multi-cloud environments. We show the proposed problem to be NP-complete and design a heuristic solution that takes into consideration module execution, data transfer, and I/O operations. The performance superiority of the proposed solution over existing methods is illustrated through extensive simulations.
Identifier
84989960078 (Scopus)
ISBN
9781509026289
Publication Title
Proceedings - 2016 IEEE International Conference on Services Computing, SCC 2016
External Full Text Location
https://doi.org/10.1109/SCC.2016.25
First Page
138
Last Page
145
Recommended Citation
Wu, Chase Q. and Cao, Huiyan, "Optimizing the performance of big data workflows in multi-cloud environments under budget constraint" (2016). Faculty Publications. 10319.
https://digitalcommons.njit.edu/fac_pubs/10319
