Dissertations

A machine learning-assisted steering and scheduling framework for big-data scientific workflows on heterogeneous computing platforms

Yijie Zhang, New Jersey Institute of Technology

Document Type

Dissertation

Date of Award

5-31-2024

Degree Name

Doctor of Philosophy in Computing Sciences - (Ph.D.)

Department

Computer Science

First Advisor

Chase Qishi Wu

Second Advisor

Guiling Wang

Third Advisor

Senjuti Basu Roy

Fourth Advisor

Yi Chen

Fifth Advisor

Hui Wang

Abstract

In next-generation scientific applications, the exponential growth of big data necessitates advanced techniques for efficient data storage, processing, and analysis. This has led to the construction of intricate computing workflows, managed and orchestrated by powerful engines in big data systems, as exemplified by Hadoop. As scientific applications increasingly shift towards simulation-centric approaches, traditional methodologies face new challenges in accommodating the complexity of extreme-scale numerical modeling with numerous tunable parameters. To address these challenges, this dissertation proposes a machine learning-assisted framework that enables autonomous computational steering of scientific simulations and optimized execution of big-data workflows on heterogeneous platforms. The framework integrates three main technical components. 1) A computational steering strategy employs reinforcement learning to realize dynamic parameter tuning for accurate modeling in complex and distributed environments. 2) A workflow mapping scheme determines job or task assignment, on-node scheduling, and resource allocation to minimize end-to-end delay. 3) A class of novel algorithms based on dueling double deep Q-networks with Gaussian Process Regression optimizes data block distribution and recovery in the Hadoop Distributed File System (HDFS) on heterogeneous clusters with diverse capacities of data nodes and disparate patterns of data access. Moreover, this dissertation formulates some of these problems within the framework as optimization problems, proves their NP-completeness, and designs approximation algorithms with robust performance guarantees. Experimental results from real-life scientific simulations demonstrate the efficacy of the proposed methods, showing their superiority over existing algorithms and affirming the validity of the theoretical analyses.
This dissertation research contributes to advancing the big data computing process in scientific disciplines and also highlights its potential applications to big data-driven industrial and business processes.

Recommended Citation

Zhang, Yijie, "A machine learning-assisted steering and scheduling framework for big-data scientific workflows on heterogeneous computing platforms" (2024). Dissertations. 1824.
https://digitalcommons.njit.edu/dissertations/1824

Included in

Climate Commons, Computer Sciences Commons, Environmental Sciences Commons
