Predicting Large-scale Protein-protein Interactions by Extracting Coevolutionary Patterns with MapReduce Paradigm

Document Type

Conference Proceeding

Publication Date

1-1-2021

Abstract

Protein-protein interactions are of great significance for understanding the functional mechanisms of proteins. With the rapid development of high-throughput genomic technology, the amount of protein-protein interaction data has grown so large that most existing prediction algorithms are no longer applicable. To address this problem, we develop a distributed framework by reimplementing one of the state-of-the-art algorithms, CoFex, using MapReduce. In particular, we adopt a novel tree-based data structure to reduce the heavy memory consumption caused by the huge volume of protein sequence information. The procedure of CoFex is then modified to follow the MapReduce paradigm so that the prediction task can be completed in a distributed manner, thus meeting the demanding requirements of large-scale protein-protein interaction prediction. A series of experiments has been conducted to evaluate the proposed distributed framework in terms of both efficiency and effectiveness. Experimental results demonstrate that the proposed framework improves the computational efficiency of CoFex by more than two orders of magnitude while retaining a comparable level of accuracy.

Identifier

85124317098 (Scopus)

ISBN

9781665442077

Publication Title

Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics

External Full Text Location

https://doi.org/10.1109/SMC52423.2021.9658839

ISSN

1062-922X

First Page

939

Last Page

944
