Persistent clustered main memory index for accelerating k-NN queries on high dimensional datasets
Document Type
Conference Proceeding
Publication Date
12-1-2005
Abstract
Similarity search implemented via k-Nearest-Neighbor (k-NN) queries is an extremely useful paradigm in content-based image retrieval (CBIR), but it is costly on high-dimensional indices due to the curse of dimensionality. We improve k-NN query processing by utilizing the double filtering effect of clustering and indexing on a persistent version of the Ordered-Partition tree (OP-tree) index, which is highly efficient in processing k-NN queries. The OP-tree is made persistent by serializing it, i.e., arranging its nodes into contiguous memory locations, and then writing it to disk, so that the high transfer rate of modern disk drives is exploited. We first report experimental results to optimize OP-tree parameters. We then compare OP-trees and sequential scans with options for the Karhunen-Loève transform and Euclidean distance calculation. Comparisons against OMNI-based sequential scan are also reported. We finally compare a clustered and persistent version of the OP-tree against a clustered version of the SR-tree and the VA-File method. It is observed that the OP-tree index outperforms the other two methods and that the improvement increases with the number of dimensions. Copyright 2005 ACM.
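As a rough illustration of the serialization step the abstract describes, the Python sketch below flattens a small in-memory tree into one contiguous byte buffer that could be written to disk and read back with a single sequential transfer. The Node class, its split field, and the 16-byte record layout are assumptions made for this example; they are not the OP-tree structure or the on-disk format used in the paper.

```python
import struct
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical node: one partition boundary plus child nodes.
    split: float
    children: List["Node"] = field(default_factory=list)


# Fixed-size record: split value, index of first child, number of children.
RECORD = struct.Struct("<dii")


def serialize(root: Node) -> bytes:
    """Lay nodes out breadth-first so that siblings occupy adjacent records."""
    order = [root]
    records = []
    i = 0
    while i < len(order):
        node = order[i]
        # Children of this node are appended next, so they start at len(order).
        first_child = len(order) if node.children else -1
        order.extend(node.children)
        records.append(RECORD.pack(node.split, first_child, len(node.children)))
        i += 1
    return b"".join(records)


def deserialize(buf: bytes) -> Node:
    """Rebuild the tree from the contiguous buffer using record indices."""
    rows = [RECORD.unpack_from(buf, off) for off in range(0, len(buf), RECORD.size)]
    nodes = [Node(split) for split, _, _ in rows]
    for node, (_, first, count) in zip(nodes, rows):
        if count:
            node.children = nodes[first:first + count]
    return nodes[0]
```

Because child links are stored as record indices rather than pointers, the buffer can be saved with one write call and loaded with one read call, which is the property that lets a persistent index benefit from sequential disk throughput.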
Identifier
77953525148 (Scopus)
ISBN
[1595931511, 9781595931511]
Publication Title
ACM International Conference Proceeding Series
External Full Text Location
https://doi.org/10.1145/1160939.1160955
First Page
65
Last Page
70
Volume
160
Grant
0105485
Fund Ref
National Science Foundation
Recommended Citation
Zhang, Lijuan and Thomasian, Alexander, "Persistent clustered main memory index for accelerating k-NN queries on high dimensional datasets" (2005). Faculty Publications. 19339.
https://digitalcommons.njit.edu/fac_pubs/19339
