Document Type
Dissertation
Date of Award
Summer 8-31-2000
Degree Name
Doctor of Philosophy in Computing Sciences - (Ph.D.)
Department
Computer and Information Science
First Advisor
Jason T. L. Wang
Second Advisor
James A. McHugh
Third Advisor
Frank Y. Shih
Fourth Advisor
Daochuan Hung
Fifth Advisor
Michael Halper
Abstract
Knowledge discovery in databases, also known as data mining, aims to find significant information in a set of data. The knowledge to be mined from a dataset may take the form of patterns, association rules, classification and clustering rules, and so forth. In this dissertation, we present a neural network approach to finding knowledge in biological databases. Specifically, we propose new methods to process biological sequences in two case studies: the classification of protein sequences and the prediction of E. coli promoters in DNA sequences. Our proposed methods, based on neural network architectures, combine techniques ranging from Bayesian inference, coding theory, feature selection, and dimensionality reduction to dynamic programming and machine learning algorithms. Empirical studies show that the proposed methods outperform previously published methods and have excellent performance on the latest dataset. We have implemented the proposed algorithms in an infrastructure, called Genome Mining, developed for biosequence classification and recognition.
Recommended Citation
Ma, Qicheng, "Knowledge discovery in biological databases : a neural network approach" (2000). Dissertations. 423.
https://digitalcommons.njit.edu/dissertations/423