Predicting Large-scale Protein-protein Interactions by Extracting Coevolutionary Patterns with MapReduce Paradigm
Document Type
Conference Proceeding
Publication Date
1-1-2021
Abstract
Protein-protein interactions are of great significance for understanding the functional mechanisms of proteins. With the rapid development of high-throughput genomic technology, the amount of protein-protein interaction data has become so large that most existing prediction algorithms are no longer applicable. To address this problem, we develop a distributed framework by reimplementing one of the state-of-the-art algorithms, i.e., CoFex, using MapReduce. In particular, we adopt a novel tree-based data structure to reduce the heavy memory consumption caused by the huge amount of protein sequence information. After that, the procedure of CoFex is modified to follow the MapReduce paradigm so that the prediction task can be completed in a distributed manner, thus fulfilling the demanding requirements of large-scale protein-protein interaction prediction. A series of experiments has been conducted to evaluate the performance of the proposed distributed framework in terms of both efficiency and effectiveness. Experimental results demonstrate that the proposed framework considerably improves on CoFex, achieving an improvement of more than two orders of magnitude in computational efficiency while retaining a comparable level of accuracy.
Identifier
85124317098 (Scopus)
ISBN
9781665442077
Publication Title
Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
External Full Text Location
https://doi.org/10.1109/SMC52423.2021.9658839
ISSN
1062-922X
First Page
939
Last Page
944
Recommended Citation
Hu, Lun; Zhao, Bo Wei; Yang, Shicheng; Luo, Xin; and Zhou, Mengchu, "Predicting Large-scale Protein-protein Interactions by Extracting Coevolutionary Patterns with MapReduce Paradigm" (2021). Faculty Publications. 4696.
https://digitalcommons.njit.edu/fac_pubs/4696