Document Type
Dissertation
Date of Award
5-31-2024
Degree Name
Doctor of Philosophy in Computing Sciences - (Ph.D.)
Department
Computer Science
First Advisor
Chase Qishi Wu
Second Advisor
Guiling Wang
Third Advisor
Senjuti Basu Roy
Fourth Advisor
Yi Chen
Fifth Advisor
Hui Wang
Abstract
In next-generation scientific applications, the exponential growth of big data necessitates advanced techniques for efficient data storage, processing, and analysis. This has led to the construction of intricate computing workflows, managed and orchestrated by powerful engines in big data systems such as Hadoop. As scientific applications increasingly shift towards simulation-centric approaches, traditional methodologies face new challenges in accommodating the complexity of extreme-scale numerical modeling with numerous tunable parameters. To address these challenges, this dissertation proposes a machine learning-assisted framework that enables autonomous computational steering of scientific simulations and optimized execution of big-data workflows on heterogeneous platforms. This framework integrates three main technical components. 1) A computational steering strategy employs reinforcement learning to realize dynamic parameter tuning for accurate modeling in complex and distributed environments. 2) A workflow mapping scheme determines job or task assignment, as well as on-node scheduling and resource allocation, to minimize end-to-end delay. 3) A class of novel algorithms based on dueling double deep Q-networks with Gaussian Process Regression optimizes data block distribution and recovery in Hadoop Distributed File System (HDFS) on heterogeneous clusters with diverse capacities of data nodes and disparate patterns of data access. Moreover, this dissertation formulates some of these problems within the framework as optimization problems, proves their NP-completeness, and designs approximation algorithms with robust performance guarantees. Experimental results from real-life scientific simulations demonstrate the efficacy of the proposed methods, showcasing their superiority over existing algorithms and affirming the validity of the theoretical analyses.
This dissertation research contributes to advancing big data computing in scientific disciplines and also highlights its potential applications to big data-driven industrial and business processes.
Recommended Citation
Zhang, Yijie, "A machine learning-assisted steering and scheduling framework for big-data scientific workflows on heterogeneous computing platforms" (2024). Dissertations. 1824.
https://digitalcommons.njit.edu/dissertations/1824
