Dynamic Priority Job Scheduling on a Hadoop YARN Platform
Document Type
Conference Proceeding
Publication Date
1-1-2023
Abstract
In Hadoop's big data processing systems, YARN is responsible for resource management and job scheduling. The job scheduling algorithms built into YARN are simple to run but suffer from limitations such as job starvation, excessive server load, and load imbalance. In this paper, we propose a new Hybrid Dynamic Priority job Scheduling algorithm (HDPS) to address these limitations. To prevent job starvation, HDPS dynamically adjusts the priority of a job as its waiting time increases. It also features a task assignment strategy designed to improve data locality: it considers the available resources of each server and the distribution of data blocks stored on the servers, which reduces data transfer time and improves job execution efficiency. We implement HDPS, integrate it into YARN, and conduct experiments on a real Hadoop system using Hadoop's built-in benchmark test cases. Experimental results show that HDPS outperforms existing algorithms in terms of both execution efficiency and load balance.
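The two ideas named in the abstract, waiting-time-based priority adjustment and locality-aware task assignment, can be illustrated with a minimal sketch. This is not the HDPS algorithm itself, whose formulas are not given in this record. The linear aging rule, the AGING_RATE constant, the Job and Server classes, and the tie-breaking order are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    base_priority: float          # priority assigned at submission (hypothetical field)
    submit_time: float            # submission time in seconds
    blocks: set = field(default_factory=set)  # IDs of the job's input data blocks


@dataclass
class Server:
    name: str
    free_slots: int               # stand-in for "available resources" (assumption)
    blocks: set = field(default_factory=set)  # IDs of data blocks stored locally


AGING_RATE = 0.1  # hypothetical: priority gained per second of waiting


def effective_priority(job, now, aging_rate=AGING_RATE):
    """Base priority plus a bonus that grows with waiting time (assumed linear)."""
    waited = max(0.0, now - job.submit_time)
    return job.base_priority + aging_rate * waited


def pick_job(jobs, now):
    """Choose the job with the highest effective priority at time `now`."""
    return max(jobs, key=lambda j: effective_priority(j, now))


def pick_server(job, servers):
    """Among servers with free capacity, prefer the one holding the most of the
    job's data blocks; break ties by free capacity. Returns None if all are full."""
    candidates = [s for s in servers if s.free_slots > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda s: (len(job.blocks & s.blocks), s.free_slots))


if __name__ == "__main__":
    old = Job("old", base_priority=1, submit_time=0, blocks={"b1", "b2", "b3"})
    new = Job("new", base_priority=5, submit_time=90)
    servers = [
        Server("s1", free_slots=4),
        Server("s2", free_slots=1, blocks={"b1", "b2"}),
        Server("s3", free_slots=0, blocks={"b1", "b2", "b3"}),
    ]
    chosen = pick_job([old, new], now=100)
    print(chosen.name, pick_server(chosen, servers).name)
```

In this sketch a low-priority job that has waited long enough eventually overtakes a newer high-priority job, which is the starvation-avoidance effect the abstract describes. The server choice skips fully loaded servers even when they hold all of the job's blocks, a simplified stand-in for weighing available resources against data locality.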
Identifier
85190304537 (Scopus)
ISBN
9798350330717
Publication Title
Proceedings of the International Conference on Parallel and Distributed Systems (ICPADS)
External Full Text Location
https://doi.org/10.1109/ICPADS60453.2023.00069
ISSN
1521-9097
First Page
412
Last Page
419
Recommended Citation
Du, Nana; Ji, Yudong; Hou, Aiqin; Wu, Chase; and Nie, Weike, "Dynamic Priority Job Scheduling on a Hadoop YARN Platform" (2023). Faculty Publications. 2210.
https://digitalcommons.njit.edu/fac_pubs/2210