TGE: Machine Learning Based Task Graph Embedding for Large-Scale Topology Mapping
Document Type
Conference Proceeding
Publication Date
9-22-2017
Abstract
Task mapping is an important problem in parallel and distributed computing. The goal in task mapping is to find an optimal layout of the processes of an application (or a task) onto a given network topology. We target this problem in the context of staging applications. A staging application consists of two or more parallel applications (also referred to as staging tasks) which run concurrently and exchange data over the course of computation. Task mapping becomes a more challenging problem in staging applications, because data is exchanged not only between the staging tasks but also among the processes within each staging task. We propose a novel method, called Task Graph Embedding (TGE), that harnesses the observable graph structures of parallel applications and network topologies. TGE employs a machine learning based algorithm to find the best representation of a graph, called an embedding, in a space in which the task-to-processor mapping problem can be solved. We evaluate and demonstrate the effectiveness of TGE experimentally with the communication patterns extracted from runs of XGC, a large-scale fusion simulation code, on Titan.
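To make the mapping objective in the abstract concrete, the following is a minimal sketch of the underlying problem only, not of TGE itself. It assumes a common cost model (data volume multiplied by network hop count, summed over communicating process pairs), which the abstract does not specify. The 4-node ring topology, the process names `s0`, `s1`, `a0`, `a1`, and the traffic volumes are all hypothetical. The exhaustive search is feasible only at toy sizes, which is one way to see why a scalable method is needed for large topologies.

```python
from collections import deque
from itertools import permutations


def hop_distances(adjacency):
    """All-pairs hop counts on an unweighted topology, via BFS from each node."""
    dist = {}
    for src in adjacency:
        seen = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in seen:
                    seen[nbr] = seen[node] + 1
                    queue.append(nbr)
        dist[src] = seen
    return dist


def mapping_cost(traffic, mapping, dist):
    """Assumed cost model: sum of (data volume x hops) over process pairs."""
    return sum(vol * dist[mapping[a]][mapping[b]]
               for (a, b), vol in traffic.items())


def best_mapping(traffic, processes, adjacency):
    """Brute-force search over all placements; toy sizes only."""
    dist = hop_distances(adjacency)
    best_cost, best_map = None, None
    for perm in permutations(list(adjacency), len(processes)):
        candidate = dict(zip(processes, perm))
        cost = mapping_cost(traffic, candidate, dist)
        if best_cost is None or cost < best_cost:
            best_cost, best_map = cost, candidate
    return best_cost, best_map


# Hypothetical 4-node ring network: 0 - 1 - 2 - 3 - 0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

# Hypothetical staging application: a simulation task (s0, s1) and an
# analysis task (a0, a1). Traffic flows within each task and between them.
processes = ["s0", "s1", "a0", "a1"]
traffic = {
    ("s0", "s1"): 10,  # intra-task exchange
    ("a0", "a1"): 10,  # intra-task exchange
    ("s0", "a0"): 5,   # inter-task (staging) exchange
    ("s1", "a1"): 5,   # inter-task (staging) exchange
}

# A placement that puts each task's processes on opposite sides of the ring.
naive = dict(zip(processes, [0, 2, 1, 3]))
naive_cost = mapping_cost(traffic, naive, hop_distances(ring))

best_cost, best_map = best_mapping(traffic, processes, ring)
print(naive_cost, best_cost)
```

In this toy case the naive placement costs 50 and the best placement costs 30, because the best placement puts every communicating pair on adjacent nodes. The search examines every permutation, so its running time grows factorially with the number of processes.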
Identifier
85032621616 (Scopus)
ISBN
9781538623268
Publication Title
Proceedings - IEEE International Conference on Cluster Computing, ICCC
External Full Text Location
https://doi.org/10.1109/CLUSTER.2017.67
ISSN
1552-5244
First Page
587
Last Page
591
Volume
2017-September
Fund Ref
U.S. Department of Energy
Recommended Citation
Choi, Jong Youl; Logan, Jeremy; Wolf, Matthew; Ostrouchov, George; Kurc, Tahsin; Liu, Qing; Podhorszki, Norbert; Klasky, Scott; Romanus, Melissa; Sun, Qian; Parashar, Manish; Churchill, Randy Michael; and Chang, C. S., "TGE: Machine Learning Based Task Graph Embedding for Large-Scale Topology Mapping" (2017). Faculty Publications. 9301.
https://digitalcommons.njit.edu/fac_pubs/9301
