Document Type
Dissertation
Date of Award
Spring 5-31-1996
Degree Name
Doctor of Philosophy in Computing Sciences - (Ph.D.)
Department
Computer and Information Science
First Advisor
Jason T. L. Wang
Second Advisor
James A. McHugh
Third Advisor
David Nassimi
Fourth Advisor
Peter A. Ng
Fifth Advisor
Wen-Syan Li
Abstract
Sequence databases store sequence data: linear structural descriptions of many natural entities. Approximate pattern discovery in a sequence database can lead to important conclusions or to the prediction of new phenomena. Traditional database technology is not suited to this task, so new techniques need to be developed.
In this dissertation, we propose several new techniques for discovering patterns in sequence databases. Our techniques combine pattern matching algorithms with novel heuristics for discovery and optimization. Experimental results on both generated data and DNA and protein sequences show the effectiveness of the proposed techniques.
We then develop several classifiers using our pattern discovery algorithms and a previously published fingerprint technique. When applied to DNA and protein sequences, these classifiers give information that is complementary to that provided by the best classifiers available today.
Recommended Citation
Chirn, Gung-Wei, "Pattern discovery in sequence databases : algorithms and applications to DNA/protein classification" (1996). Dissertations. 1013.
https://digitalcommons.njit.edu/dissertations/1013