Date of Award

Fall 2005

Document Type

Thesis

Degree Name

Master of Science in Computer Science - (M.S.)

Department

Computer Science

First Advisor

Jason T. L. Wang

Second Advisor

Chengjun Liu

Third Advisor

Qun Ma

Abstract

In recent years, RNA structural comparison has become a crucial problem in bioinformatics research. A popular approach is to represent RNA secondary structures with arc-annotation sets. Several methods can be used to compare two RNA structures, such as tree edit distance, longest arc-preserving common subsequence (LAPCS), and stem-based alignment. However, because of their high time complexity, these methods may be practical only for small RNA structures. In this thesis, we propose a simplified method that compares two RNA structures in O(mn) time, where m and n are the lengths of the two RNA sequences. The method transforms the RNA structures into specific sequences called object sequences, then compares these object sequences to find their common substructures. The comparison method is tested with 118 RNA structures obtained from the RNase P Database. For any two structures, it is important to identify whether they belong to the same family, using both structure comparison and sequence comparison. In the experiment, the structure comparison method yields better hit rates and runs faster than the traditional method of comparing the RNA sequences. Therefore, the approach of extracting and comparing RNA secondary structures is more biologically sensitive and more efficient in time complexity.
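The abstract does not define how object sequences are built, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes structures are given in dot-bracket notation, uses a hypothetical run-length encoding (`to_object_sequence`) as a stand-in for the thesis's object sequences, and compares two encoded structures with a standard longest-common-subsequence dynamic program (`lcs_length`), which runs in O(mn) time for token sequences of lengths m and n. Neither function name nor the encoding comes from the thesis.

```python
from itertools import groupby


def to_object_sequence(dot_bracket):
    """Hypothetical encoding (an assumption, not the thesis's definition).

    Each run of identical dot-bracket symbols becomes one token made of
    the symbol followed by the run length, e.g. "((..))" -> ["(2", ".2", ")2"].
    """
    return [f"{sym}{len(list(run))}" for sym, run in groupby(dot_bracket)]


def lcs_length(a, b):
    """Length of the longest common subsequence of token lists a and b.

    Standard dynamic program: O(len(a) * len(b)) time, O(len(b)) space.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching tokens extend the best alignment of both prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop one token from either sequence.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Two small hairpins that differ only in loop size share their stem tokens.
s1 = to_object_sequence("((..))")
s2 = to_object_sequence("((...))")
shared = lcs_length(s1, s2)
```

In this toy encoding the two hairpins share the opening and closing stem tokens but not the loop token. A real object-sequence scheme would presumably carry more structural information per token; the point here is only that comparing two token sequences this way stays within the quadratic bound.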
