Scientific data mining: A case study
Document Type
Article
Publication Date
1-1-1998
Abstract
Scientific data mining is the activity of finding significant information in scientific data. This paper presents an example of scientific data mining: the discovery of approximately common patterns in RNA secondary structures. We represent an RNA secondary structure by an ordered labeled tree based on a previously proposed scheme. The patterns in the trees are substructures that may differ by substitutions, deletions, and insertions of tree nodes. Our techniques incorporate approximate tree matching algorithms and novel heuristics for discovery and optimization. Experimental results obtained by running these algorithms on both generated data and RNA secondary structures show that the algorithms perform well. The optimization heuristics speed up the discovery algorithm by a factor of 10. Moreover, our optimized approach is 100,000 times faster than the brute force method.
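The notion of approximate matching between ordered labeled trees can be illustrated with a small sketch. The code below is not the paper's discovery algorithm or its optimization heuristics; it is a plain unit-cost edit distance between two ordered labeled trees, computed with the standard rightmost-root forest recursion and memoization. The tuple encoding of trees, the function names (tree_size, forest_distance, tree_distance), the node labels, and the unit costs are all assumptions made for illustration.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), where children is a
# tuple of trees in left-to-right order. A forest is a tuple of trees.

def tree_size(tree):
    """Number of nodes in a tree."""
    _, children = tree
    return 1 + sum(tree_size(child) for child in children)

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f and not g:
        return 0
    if not f:
        # Every node of g must be inserted.
        return sum(tree_size(t) for t in g)
    if not g:
        # Every node of f must be deleted.
        return sum(tree_size(t) for t in f)

    (v_label, v_children) = f[-1]
    (w_label, w_children) = g[-1]

    # Delete the rightmost root of f; its children take its place.
    delete_v = forest_distance(f[:-1] + v_children, g) + 1
    # Insert the rightmost root of g; its children take its place.
    insert_w = forest_distance(f, g[:-1] + w_children) + 1
    # Map the two rightmost roots onto each other (substitute if labels differ).
    substitute = (
        forest_distance(f[:-1], g[:-1])
        + forest_distance(v_children, w_children)
        + (0 if v_label == w_label else 1)
    )
    return min(delete_v, insert_w, substitute)

def tree_distance(t1, t2):
    """Unit-cost edit distance between two ordered labeled trees."""
    return forest_distance((t1,), (t2,))

# Hypothetical trees with arbitrary labels (not the paper's RNA encoding).
base = ("A", (("B", ()), ("C", ())))
relabeled = ("A", (("B", ()), ("D", ())))           # C substituted by D
pruned = ("A", (("B", ()),))                        # C deleted
wrapped = ("A", (("X", (("B", ()), ("C", ()))),))   # X inserted above B and C

print(tree_distance(base, relabeled))
print(tree_distance(base, pruned))
print(tree_distance(base, wrapped))
```

This recursion is exponential in the worst case without memoization and is only meant for small trees; a pattern discovery method of the kind the abstract describes would need a more efficient matching algorithm and would match substructures rather than whole trees.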
Identifier
11544375735 (Scopus)
Publication Title
International Journal of Software Engineering and Knowledge Engineering
External Full Text Location
https://doi.org/10.1142/S0218194098000078
ISSN
0218-1940
First Page
77
Last Page
96
Issue
1
Volume
8
Recommended Citation
Chang, Chia Yo; Wang, Jason T.L.; and Chang, Roger K., "Scientific data mining: A case study" (1998). Faculty Publications. 16482.
https://digitalcommons.njit.edu/fac_pubs/16482
