Scientific data mining: A case study

Document Type

Article

Publication Date

1-1-1998

Abstract

Scientific data mining is the activity of finding significant information in scientific data. This paper presents an example of scientific data mining: the discovery of approximately common patterns in RNA secondary structures. We represent an RNA secondary structure by an ordered labeled tree based on a previously proposed scheme. The patterns in the trees are substructures that can differ in both substitutions and deletions/insertions of nodes of the trees. Our techniques incorporate approximate tree matching algorithms and novel heuristics for discovery and optimization. Experimental results obtained by running these algorithms on both generated data and RNA secondary structures show that the algorithms perform well. The optimization heuristics speed up the discovery algorithm by a factor of 10. Moreover, our optimized approach is 100,000 times faster than the brute force method.

Identifier

11544375735 (Scopus)

Publication Title

International Journal of Software Engineering and Knowledge Engineering

External Full Text Location

https://doi.org/10.1142/S0218194098000078

ISSN

02181940

First Page

77

Last Page

96

Issue

1

Volume

8
