Document Type


Date of Award

Fall 10-31-1997

Degree Name

Master of Science in Computer Science - (M.S.)


Computer and Information Science

First Advisor

Jason T. L. Wang

Second Advisor

James A. McHugh

Third Advisor

Peter A. Ng


Multiple sequence alignment has proven to be a successful method of representing and organizing protein sequence data. It is crucial to medical research on the structure and function of proteins.

Numerous tools have been published for abstracting meaningful relationships between an unknown sequence and a set of known sequences. One study used a method for discovering active motifs in a set of related protein sequences. Such motifs are meaningful knowledge abstracted from the known protein database, since most protein families are characterized by multiple local motifs. Another study abstracts knowledge about the input sequence using an algorithm preconstructed from a set of sequences.

Most of these studies of classification processes use statistically optimized heuristics to enhance their accompanying algorithms. Therefore, these algorithms can be analyzed for statistical significance using Bayesian theorems.


