Document Type
Thesis
Date of Award
Fall 1-31-2006
Degree Name
Master of Science in Computer Science - (M.S.)
Department
Computer Science
First Advisor
Jason T. L. Wang
Second Advisor
Chengjun Liu
Third Advisor
Qun Ma
Abstract
In recent years, RNA structural comparison has become a crucial problem in bioinformatics research. A popular approach is to represent RNA secondary structures with arc-annotation sets. Several methods can be used to compare two RNA structures, such as tree edit distance, longest arc-preserving common subsequence (LAPCS), and stem-based alignment. However, because of their high time complexity, these methods are practical only for small RNA structures. In this thesis, we propose a simplified method that compares two RNA structures in O(mn) time, where m and n are the lengths of the two RNA sequences. The method transforms the RNA structures into specific sequences called object sequences, then compares these object sequences to find their common substructures. The comparison method is tested with 118 RNA structures obtained from the RNase P Database. For any two structures, it is important to identify whether they belong to the same family, using both structure comparison and sequence comparison. In the experiments, the structure-comparison method yields better hit rates and runs faster than the traditional method of comparing the RNA sequences. The approach of extracting and comparing RNA secondary structures is therefore more biologically sensitive and more efficient in running time.
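As a rough illustration of how two symbol sequences can be compared in O(mn) time, the sketch below uses a standard longest-common-subsequence dynamic program. It is not the method of the thesis: the abstract does not define object sequences, so the encoding here (collapsing a dot-bracket string into runs of identical symbols) and the function names `to_object_sequence` and `lcs_length` are assumptions made only for this example.

```python
from itertools import groupby


def to_object_sequence(dot_bracket):
    """Hypothetical encoding (an assumption, not the thesis's definition):
    collapse each run of identical dot-bracket symbols into (symbol, run_length)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(dot_bracket)]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    Standard dynamic program: O(len(a) * len(b)) time, O(len(b)) extra space.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching elements extend the best result for both prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop one element from either sequence.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Two small hairpins that differ only in loop size (illustrative input).
first = to_object_sequence("((..))")
second = to_object_sequence("((...))")
shared = lcs_length(first, second)
```

Under this assumed encoding, the two hairpins share their opening and closing stem runs but not their loops. A larger shared length relative to the sequence lengths would suggest more similar structures.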
Recommended Citation
Walawalkar, Girish Prakash, "A new approach to feature extraction for RNA structure comparision" (2006). Theses. 418.
https://digitalcommons.njit.edu/theses/418