Abstract
In this paper, we present a novel index structure, called the SA-tree, to speed up the processing of high-dimensional K-nearest neighbor (KNN) queries. The SA-tree combines data clustering with compression: it uses the characteristics of each cluster to adaptively compress feature vectors into bit-strings. The proposed mechanism can therefore significantly reduce disk I/O and computational cost, and it adapts to different data distributions. We also develop efficient KNN search algorithms based on the MinMax Pruning and Partial MinDist Pruning methods. We conducted extensive experiments to evaluate the SA-tree, and the results show that our approaches provide superior performance.
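The abstract does not spell out the SA-tree's encoding or its two pruning methods, so the following is only a generic sketch of the underlying idea, not the authors' algorithm. It assumes each cluster is described by its per-dimension bounding box, that every coordinate is quantized into a fixed number of bits relative to that box, and that a KNN search can skip a vector when a lower bound on its distance, computed from the bit-string alone, already exceeds the current k-th best distance. All function names and the uniform-grid encoding are hypothetical.

```python
import math

def cluster_bounds(points):
    """Per-dimension (lo, hi) of a cluster's bounding box."""
    dims = range(len(points[0]))
    return [(min(p[d] for p in points), max(p[d] for p in points)) for d in dims]

def quantize(point, bounds, bits):
    """Map each coordinate to a cell index in [0, 2**bits - 1] inside the box.

    Assumption: a uniform grid per dimension; the paper's adaptive scheme
    may allocate bits differently.
    """
    cells = 1 << bits
    code = []
    for x, (lo, hi) in zip(point, bounds):
        width = (hi - lo) / cells
        idx = int((x - lo) / width) if width > 0 else 0
        code.append(min(max(idx, 0), cells - 1))  # clamp x == hi into the last cell
    return code

def dist_bounds(query, code, bounds, bits):
    """Lower and upper bounds on the Euclidean distance from query to any
    point whose quantized code is `code`, using only the code and the box."""
    cells = 1 << bits
    lo_sq = hi_sq = 0.0
    for q, idx, (lo, hi) in zip(query, code, bounds):
        width = (hi - lo) / cells
        a, b = lo + idx * width, lo + (idx + 1) * width  # cell interval
        nearest = 0.0 if a <= q <= b else min(abs(q - a), abs(q - b))
        farthest = max(abs(q - a), abs(q - b))
        lo_sq += nearest ** 2
        hi_sq += farthest ** 2
    return math.sqrt(lo_sq), math.sqrt(hi_sq)
```

In such a scheme, a search would typically read only the compact codes first, discard every vector whose lower bound is larger than the k-th smallest upper bound seen so far, and fetch full feature vectors only for the remaining candidates.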
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cui, B., Hu, J., Shen, H., Yu, C. (2004). Adaptive Quantization of the High-Dimensional Data for Efficient KNN Processing. In: Lee, Y., Li, J., Whang, KY., Lee, D. (eds) Database Systems for Advanced Applications. DASFAA 2004. Lecture Notes in Computer Science, vol 2973. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24571-1_27
DOI: https://doi.org/10.1007/978-3-540-24571-1_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21047-4
Online ISBN: 978-3-540-24571-1
eBook Packages: Springer Book Archive