Energy-efficient dynamic scheduling of deadline-constrained MapReduce workflows
Document Type
Conference Proceeding
Publication Date
11-14-2017
Abstract
Big data workflows composed of moldable parallel MapReduce programs running on a large number of processors have become a major consumer of energy at data centers. The degree of parallelism of each moldable job in such workflows has a significant impact on the energy efficiency of parallel computing systems, yet this impact remains largely unexplored. In this paper, we validate with experimental results the moldable parallel computing model, in which the dynamic energy consumption of a moldable job increases with the number of parallel tasks. Based on this validation, we construct rigorous cost models and formulate a dynamic scheduling problem of deadline-constrained MapReduce workflows to minimize energy consumption in Hadoop systems. We propose a semi-dynamic online scheduling algorithm based on adaptive task partitioning to reduce dynamic energy consumption while meeting performance requirements from a global perspective, and we design the corresponding system modules for implementing the algorithm in the Hadoop architecture. Extensive simulation results in Hadoop/YARN illustrate the superior performance of the proposed algorithm over existing algorithms in terms of dynamic energy saving and deadline violation. The core module of adaptive task partitioning is further validated through real-life workflow implementation and experimental results using the Oozie workflow engine in Hadoop/YARN systems.
Identifier
85043774328 (Scopus)
ISBN
9781538626863
Publication Title
Proceedings - 13th IEEE International Conference on eScience, eScience 2017
External Full Text Location
https://doi.org/10.1109/eScience.2017.18
First Page
393
Last Page
402
Recommended Citation
Shu, Tong and Wu, Chase Q., "Energy-efficient dynamic scheduling of deadline-constrained MapReduce workflows" (2017). Faculty Publications. 9193.
https://digitalcommons.njit.edu/fac_pubs/9193
