Parallel Suffix Sorting for Large String Analytics

Document Type

Conference Proceeding

Publication Date

1-1-2023

Abstract

The suffix array is a fundamental data structure for efficient string analysis. It took about 26 years for sequential suffix array construction to reach O(n) time complexity with in-place sorting. In this paper, we develop the D-Limited Parallel Induce (DLPI) algorithm, the first O(n/p) time parallel suffix array construction algorithm. DLPI rests on two ideas: dividing the O(n)-size problem into p reduced sub-problems of size O(n/p), so that they can be handled on p processors in parallel; and developing an efficient parallel induce sorting method that produces the correct order across all the reduced sub-problems. A complete description of the algorithm shows how the proposed idea is implemented. Time and space complexity analyses, with proofs, establish the correctness and efficiency of the proposed algorithm. The proposed DLPI algorithm can handle large strings with scalable performance.
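For readers unfamiliar with the terms, the following is a minimal illustrative sketch, not the DLPI algorithm. It shows what a suffix array is (using a naive comparison sort, far slower than O(n)) and how n positions can be split into p ranges of size about n/p. The function names `naive_suffix_array` and `split_ranges` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def naive_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes in lexicographic order.

    Illustration only: sorting whole suffixes is much slower than the
    linear-time constructions the abstract refers to.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def split_ranges(n: int, p: int) -> List[Tuple[int, int]]:
    """Split positions 0..n-1 into at most p half-open ranges of size ceil(n/p).

    Assumption: this only shows the O(n/p) sub-problem size; it says
    nothing about how DLPI actually chooses or reduces its sub-problems.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if n == 0:
        return []
    size = -(-n // p)  # ceiling division
    return [(start, min(start + size, n)) for start in range(0, n, size)]


if __name__ == "__main__":
    print(naive_suffix_array("banana"))  # [5, 3, 1, 0, 4, 2]
    print(split_ranges(10, 3))           # [(0, 4), (4, 8), (8, 10)]
```

Sorting each range independently would not by itself give a globally correct order; per the abstract, that is the role of the parallel induce sorting step.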

Identifier

85161428178 (Scopus)

ISBN

9783031304415

Publication Title

Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics

External Full Text Location

https://doi.org/10.1007/978-3-031-30442-2_6

e-ISSN

16113349

ISSN

03029743

First Page

71

Last Page

82

Volume

13826 LNCS

Fund Ref

National Science Foundation
