Parallel Suffix Sorting for Large String Analytics
Document Type
Conference Proceeding
Publication Date
1-1-2023
Abstract
The suffix array is a fundamental data structure for efficient string analysis. It took about 26 years for sequential suffix array construction algorithms to achieve O(n) time complexity and in-place sorting. In this paper, we develop the D-Limited Parallel Induce (DLPI) algorithm, the first O(n/p) time parallel suffix array construction algorithm. The basic idea of DLPI has two aspects: dividing the O(n) size problem into p reduced sub-problems of size O(n/p) so that they can be handled on p processors in parallel, and developing an efficient parallel induced sorting method to achieve the correct order across all the reduced sub-problems. A complete algorithm description shows how the proposed idea is implemented. Time and space complexity analysis and proofs establish the correctness and efficiency of the proposed algorithm. The proposed DLPI algorithm can handle large strings with scalable performance.
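For readers unfamiliar with the data structure, the following is a minimal sketch of what a suffix array is and of the general "split into p blocks of about n/p suffixes" idea mentioned in the abstract. It is not the DLPI algorithm: it uses plain comparison sorting plus a merge rather than induced sorting, it is not linear time, and the function names `naive_suffix_array` and `blocked_suffix_array` are hypothetical names chosen for illustration.

```python
from heapq import merge


def naive_suffix_array(text):
    """Return suffix start positions ordered by the suffix they begin.

    Illustration only: comparison sorting of whole suffixes, not O(n).
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def blocked_suffix_array(text, p):
    """Hypothetical sketch: sort p blocks of about n/p suffixes, then merge.

    Each block could be sorted on its own processor; here the blocks are
    sorted one after another for simplicity. This is NOT induced sorting.
    """
    n = len(text)
    size = max(1, -(-n // p))  # ceil(n / p), at least 1
    blocks = [
        sorted(range(start, min(start + size, n)), key=lambda i: text[i:])
        for start in range(0, n, size)
    ]
    # Merge the independently sorted blocks into one global order.
    return list(merge(*blocks, key=lambda i: text[i:]))


if __name__ == "__main__":
    print(naive_suffix_array("banana"))       # [5, 3, 1, 0, 4, 2]
    print(blocked_suffix_array("banana", 3))  # [5, 3, 1, 0, 4, 2]
```

Both functions return the same ordering; the blocked version only shows where the independent sub-problems come from, while the hard part the paper addresses is making the sub-problem results globally consistent efficiently.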
Identifier
85161428178 (Scopus)
ISBN
[9783031304415]
Publication Title
Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics
External Full Text Location
https://doi.org/10.1007/978-3-031-30442-2_6
e-ISSN
16113349
ISSN
03029743
First Page
71
Last Page
82
Volume
13826 LNCS
Fund Ref
National Science Foundation
Recommended Citation
Du, Zhihui; Zhang, Sen; and Bader, David A., "Parallel Suffix Sorting for Large String Analytics" (2023). Faculty Publications. 2078.
https://digitalcommons.njit.edu/fac_pubs/2078