TreeRank: A similarity measure for nearest neighbor searching in phylogenetic databases
Document Type
Conference Proceeding
Publication Date
1-1-2003
Abstract
Phylogenetic trees are unordered labeled trees in which each leaf node has a label and the order among siblings is unimportant. In this paper we propose a new similarity measure, called TreeRank, for phylogenetic trees and present an algorithm for computing TreeRank scores. Given a query or pattern tree P and a data tree D, the TreeRank score from P to D is a measure of the topological relationships in P that are found to be the same or similar in D. The proposed algorithm calculates the TreeRank score in O(M² + N) time where M is the number of nodes appearing in both P and D, and N is the number of nodes in D. We then develop a search engine that, given a query or pattern tree P and a database of trees D, finds and ranks the nearest neighbors of P in D where the "nearness" is measured by the proposed similarity function. This structure-based search engine is fully operational and is available on the World Wide Web.
Identifier
43249085884 (Scopus)
ISBN
[0769519644]
Publication Title
Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM
External Full Text Location
https://doi.org/10.1109/SSDM.2003.1214978
ISSN
10993371
First Page
171
Last Page
180
Volume
2003-January
Grant
IIS-9988345
Recommended Citation
Wang, J. T.L.; Shan, Huiyuan; Shasha, D.; and Piel, W. H., "TreeRank: A similarity measure for nearest neighbor searching in phylogenetic databases" (2003). Faculty Publications. 14464.
https://digitalcommons.njit.edu/fac_pubs/14464
