TreeRank: A similarity measure for nearest neighbor searching in phylogenetic databases

Document Type

Conference Proceeding

Publication Date

1-1-2003

Abstract

Phylogenetic trees are unordered labeled trees in which each leaf node has a label and the order among siblings is unimportant. In this paper we propose a new similarity measure, called TreeRank, for phylogenetic trees and present an algorithm for computing TreeRank scores. Given a query or pattern tree P and a data tree D, the TreeRank score from P to D is a measure of the topological relationships in P that are found to be the same or similar in D. The proposed algorithm calculates the TreeRank score in O(M² + N) time where M is the number of nodes appearing in both P and D, and N is the number of nodes in D. We then develop a search engine that, given a query or pattern tree P and a database of trees D, finds and ranks the nearest neighbors of P in D where the "nearness" is measured by the proposed similarity function. This structure-based search engine is fully operational and is available on the World Wide Web.
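
The abstract does not spell out how a TreeRank score is computed, so the following is only a rough illustration of the general idea of scoring a pattern tree against a data tree and ranking a database by that score. It is not the TreeRank definition or the paper's algorithm. As a stand-in, it assumes a hypothetical score, `toy_score`, equal to the fraction of leaf-label pairs in P whose path length (in edges) is the same in D. Trees are assumed to be given as child-to-parent dictionaries with arbitrary internal node names; all function names are invented for this sketch, and no attempt is made to match the O(M² + N) bound.

```python
from itertools import combinations


def ancestors(parent, node):
    """Return the path from node up to the root, starting with node itself."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def path_length(parent, a, b):
    """Number of edges between a and b in a tree stored as child -> parent."""
    depth_from_a = {n: i for i, n in enumerate(ancestors(parent, a))}
    for j, n in enumerate(ancestors(parent, b)):
        if n in depth_from_a:  # first common ancestor reached from b
            return depth_from_a[n] + j
    raise ValueError("nodes are not in the same tree")


def leaf_labels(parent):
    """Leaves are the nodes that never occur as a parent."""
    return set(parent) - set(parent.values())


def toy_score(p, d):
    """Hypothetical stand-in score (NOT TreeRank): fraction of leaf pairs
    of P whose path length is identical in D."""
    p_leaves = leaf_labels(p)
    total_pairs = len(p_leaves) * (len(p_leaves) - 1) // 2
    if total_pairs == 0:
        return 0.0
    shared = sorted(p_leaves & leaf_labels(d))
    same = sum(
        1
        for a, b in combinations(shared, 2)
        if path_length(p, a, b) == path_length(d, a, b)
    )
    return same / total_pairs


def rank(p, database):
    """Return database keys ordered from most to least similar to P."""
    return sorted(database, key=lambda k: toy_score(p, database[k]), reverse=True)


# Pattern tree ((a,b),c); sibling order is irrelevant in this representation.
P = {"a": "x", "b": "x", "x": "r", "c": "r"}
DATABASE = {
    "D2": {"a": "u", "c": "u", "u": "v", "b": "v"},            # ((a,c),b)
    "D1": {"a": "u", "b": "u", "u": "v", "c": "v", "d": "v"},  # ((a,b),c,d)
}
RANKED = rank(P, DATABASE)
```

Because the child-to-parent mapping carries no sibling order, the representation is naturally unordered, which fits the kind of tree the abstract describes. The score is asymmetric (it is normalized by the pairs in P), loosely mirroring the abstract's phrasing of a score "from P to D".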

Identifier

43249085884 (Scopus)

ISBN

0769519644

Publication Title

Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM

External Full Text Location

https://doi.org/10.1109/SSDM.2003.1214978

ISSN

1099-3371

First Page

171

Last Page

180

Volume

2003-January

Grant

IIS-9988345
