Document Type
Thesis
Date of Award
Fall 10-31-1997
Degree Name
Master of Science in Computer Science - (M.S.)
Department
Computer and Information Science
First Advisor
Jason T. L. Wang
Second Advisor
James A. McHugh
Third Advisor
Peter A. Ng
Abstract
Multiple sequence alignment has proven to be a successful method of representing and organizing protein sequence data. It is crucial to medical research on the structure and function of proteins.
Numerous tools have been published for abstracting meaningful relationships between an unknown sequence and a set of known sequences. One study used a method for discovering active motifs in a set of related protein sequences. These motifs are meaningful knowledge abstracted from the known protein database, since most protein families are characterized by multiple local motifs. Another study abstracts knowledge about the input sequence using an algorithm preconstructed from a set of sequences.
Most of these studies of classification processes use statistically optimized heuristics to enhance their accompanying algorithms. Therefore, these algorithms can be analyzed for statistical significance using Bayesian theorems.
Recommended Citation
Shih, Tom Tien-Hua, "Testing statistical significance in sequence classification algorithms" (1997). Theses. 1034.
https://digitalcommons.njit.edu/theses/1034