Abstract
The use of spatial index structures on secondary memory for nearest neighbor search in high-dimensional data spaces has been the subject of much research. With the potential to host larger indexes in main memory, applications demanding a high query throughput stand to benefit from index structures tailored for that environment. “Index once, query at very high frequency” scenarios on semi-static data require particularly fast responses while allowing for more extensive precalculations. One such precalculation consists of indexing the solution space for nearest neighbor queries, as done by the approximate Voronoi cell-based method. A major deficiency of this promising approach is the lack of a way to incorporate effective dimensionality reduction techniques. We propose methods to overcome the difficulties faced for normalized data and present a second reduction step that improves response times by limiting the dimensionality of the Voronoi cell approximations. In addition, we evaluate the suitability of our approach for main memory indexing, where speedup factors of up to five can be observed for real-world data sets.
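The idea of indexing the solution space can be illustrated with a minimal two-dimensional sketch. It is not the implementation evaluated in the paper and covers none of the proposed dimensionality reduction steps. It assumes a data space of [0, 1]², computes each Voronoi cell exactly by clipping the unit square against bisector half-planes, and approximates the cell by its minimum bounding rectangle. The function names (`clip`, `voronoi_mbr`, `build_index`, `nearest`) and the padding constant `EPS` are hypothetical, and a linear scan stands in for the spatial index that would hold the rectangles in practice.

```python
EPS = 1e-9  # assumed padding so rounding errors cannot shrink a rectangle


def clip(polygon, a, b, c):
    """Keep the part of a convex polygon where a*x + b*y <= c."""
    result = []
    for i, (x1, y1) in enumerate(polygon):
        x2, y2 = polygon[(i + 1) % len(polygon)]
        d1 = a * x1 + b * y1 - c
        d2 = a * x2 + b * y2 - c
        if d1 <= 0:
            result.append((x1, y1))  # vertex lies inside the half-plane
        if (d1 < 0 < d2) or (d2 < 0 < d1):
            t = d1 / (d1 - d2)  # edge crosses the boundary line
            result.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return result


def voronoi_mbr(p, points):
    """Bounding rectangle of the Voronoi cell of p within [0, 1]^2."""
    cell = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    for q in points:
        if q == p:
            continue
        # |x - p|^2 <= |x - q|^2  <=>  2(q - p) . x <= |q|^2 - |p|^2
        a = 2 * (q[0] - p[0])
        b = 2 * (q[1] - p[1])
        c = q[0] ** 2 + q[1] ** 2 - p[0] ** 2 - p[1] ** 2
        cell = clip(cell, a, b, c)
    xs = [x for x, _ in cell]
    ys = [y for _, y in cell]
    return (min(xs) - EPS, min(ys) - EPS, max(xs) + EPS, max(ys) + EPS)


def build_index(points):
    """Precalculation step: one cell approximation per data point."""
    return [(voronoi_mbr(p, points), p) for p in points]


def nearest(index, q):
    """Point query on the rectangles, then exact distances on candidates."""
    candidates = [
        p for (x0, y0, x1, y1), p in index
        if x0 <= q[0] <= x1 and y0 <= q[1] <= y1
    ]
    return min(candidates, key=lambda p: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


points = [(0.2, 0.2), (0.8, 0.3), (0.5, 0.9)]
index = build_index(points)
print(nearest(index, (0.7, 0.4)))  # (0.8, 0.3)
```

Because every query point inside the data space lies in the Voronoi cell of its nearest neighbor, and each rectangle encloses its cell, the true nearest neighbor is always among the candidates. The rectangles may overlap, so the exact distance computation on the candidate set is still required.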
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brochhaus, C., Wichterich, M., Seidl, T. (2006). Approximation Techniques to Enable Dimensionality Reduction for Voronoi-Based Nearest Neighbor Search. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_15
DOI: https://doi.org/10.1007/11687238_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32960-2
Online ISBN: 978-3-540-32961-9
eBook Packages: Computer Science, Computer Science (R0)