Persistent clustered main memory index for accelerating k-NN queries on high dimensional datasets

Document Type

Conference Proceeding

Publication Date

12-1-2005

Abstract

Similarity search implemented via k-Nearest-Neighbor (k-NN) queries is an extremely useful paradigm in content-based image retrieval (CBIR), but such queries are costly on high-dimensional indices due to the curse of dimensionality. We improve k-NN query processing by utilizing the double filtering effect of clustering and indexing on a persistent version of the Ordered-Partition tree (OP-tree) index, which is highly efficient in processing k-NN queries. The OP-tree is made persistent by serializing it, i.e., arranging its nodes into contiguous memory locations, and then writing it to disk, so that the high transfer rate of modern disk drives is exploited. We first report experimental results to optimize OP-tree parameters. We then compare OP-trees and sequential scans with options for the Karhunen-Loève transform and Euclidean distance calculation. Comparisons against OMNI-based sequential scan are also reported. We finally compare a clustered and persistent version of the OP-tree against a clustered version of the SR-tree and the VA-File method. We observe that the OP-tree index outperforms the other two methods and that the improvement increases with the number of dimensions. Copyright 2005 ACM.
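
The serialization step mentioned in the abstract (arranging tree nodes into contiguous memory locations so the index can be transferred in one sequential read or write) can be illustrated with a minimal sketch. This is not the paper's OP-tree layout: the `Node` class, the fixed-size record format, and the first-child/next-sibling encoding are all assumptions chosen for illustration only.

```python
import struct
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical pointer-based tree node (not the paper's OP-tree node)."""
    key: float
    children: List["Node"] = field(default_factory=list)


# Assumed record layout: key, index of first child, index of next sibling.
RECORD = struct.Struct("<dii")
NONE = -1  # marks "no child" / "no sibling"


def serialize(root: Node) -> bytes:
    """Lay nodes out in preorder in one contiguous buffer.

    Object references are replaced by integer record indices, so the
    buffer stays valid after being written to disk and read back.
    """
    records = []  # each entry: [key, first_child, next_sibling]

    def visit(node: Node) -> int:
        index = len(records)
        records.append([node.key, NONE, NONE])
        previous = NONE
        for child in node.children:
            child_index = visit(child)
            if previous == NONE:
                records[index][1] = child_index      # first child of node
            else:
                records[previous][2] = child_index   # sibling link
            previous = child_index
        return index

    visit(root)
    return b"".join(RECORD.pack(*r) for r in records)


def deserialize(buffer: bytes) -> Node:
    """Rebuild the pointer-based tree from the flat buffer."""
    count = len(buffer) // RECORD.size
    records = [RECORD.unpack_from(buffer, i * RECORD.size) for i in range(count)]

    def build(index: int) -> Node:
        key, child, _ = records[index]
        node = Node(key)
        while child != NONE:
            node.children.append(build(child))
            child = records[child][2]  # follow the sibling chain
        return node

    return build(0)
```

Under these assumptions the whole index is a single `bytes` object, so persisting it amounts to one sequential file write and loading it to one sequential read, which is the access pattern that benefits from a high disk transfer rate.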

Identifier

77953525148 (Scopus)

ISBN

[1595931511, 9781595931511]

Publication Title

ACM International Conference Proceeding Series

External Full Text Location

https://doi.org/10.1145/1160939.1160955

First Page

65

Last Page

70

Volume

160

Grant

0105485

Fund Ref

National Science Foundation

