Abstract
An efficient, tunable high-dimensional indexing scheme called iMinMax(θ) was proposed to map high-dimensional data points to single-dimensional values based on the minimum or maximum value among all dimensions [7]. Unfortunately, the number of leaf nodes that need to be scanned remains large. To reduce the number of leaf nodes, we propose to use the compression technique of the Vector Approximation File (VA-file) [10] to represent vectors. We call the hybrid method iMinMax(θ)*. While the marriage is straightforward, the gain in performance is significant. In our extensive performance study, the results clearly indicate that iMinMax(θ)* outperforms both the original iMinMax(θ) index scheme and the VA-file. iMinMax(θ)* is also attractive from a practical viewpoint, for its implementation cost is only slightly higher than that of the original iMinMax(θ). The approximation concept incorporated in iMinMax(θ)* can be integrated into other high-dimensional index structures without much difficulty.
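The two ingredients named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so the code below makes several assumptions: points lie in the unit hypercube, the key follows the commonly cited form of the iMinMax(θ) transformation from [7], and the approximation is a uniform per-dimension grid in the spirit of the VA-file [10]. The function names `iminmax_key` and `va_approximation` and the `bits` parameter are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def iminmax_key(point: Sequence[float], theta: float = 0.0) -> float:
    """Map a point in [0, 1]^d to a single-dimensional key.

    Assumed form of the iMinMax(theta) transformation: use the minimum
    edge when x_min + theta < 1 - x_max, otherwise the maximum edge.
    """
    dims = range(len(point))
    d_min = min(dims, key=lambda i: point[i])  # dimension holding the smallest value
    d_max = max(dims, key=lambda i: point[i])  # dimension holding the largest value
    x_min, x_max = point[d_min], point[d_max]
    if x_min + theta < 1.0 - x_max:
        return d_min + x_min
    return d_max + x_max


def va_approximation(point: Sequence[float], bits: int = 4) -> List[int]:
    """Quantize each coordinate into one of 2**bits equal-width cells.

    Assumption: a uniform grid; each coordinate is replaced by its cell
    number, which needs only `bits` bits of storage.
    """
    cells = 1 << bits
    # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
    return [min(int(x * cells), cells - 1) for x in point]


if __name__ == "__main__":
    p = [0.2, 0.5, 0.9]
    print(iminmax_key(p))          # key under which the point would be indexed
    print(va_approximation(p, 2))  # compact stand-in for the full vector
```

Under these assumptions, a leaf entry would hold the key together with the small integer cell numbers rather than the full vector, so more entries fit in each leaf and fewer leaves need to be scanned. Candidates that survive filtering on the approximations would still have to be checked against the full vectors.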
References
1. S. Berchtold, D.A. Keim, H.P. Kriegel: The X-tree: An index structure for high-dimensional data. Proc. Very Large Data Bases VLDB’96 (1996) 23–27.
2. N. Beckmann, H-P. Kriegel, R. Schneider, B. Seeger: The R*-tree: An efficient and robust access method for points and rectangles. Proc. ACM SIGMOD Int. Conf. on Management of Data SIGMOD’90 (1990) 322–331.
3. R. Finkel, J. Bentley: Quad-trees: A data structure for retrieval on composite keys. Acta Informatica (1974) 1–9.
4. A. Guttman: R-trees: A dynamic index structure for spatial searching. Proc. ACM SIGMOD Int. Conf. on Management of Data SIGMOD’84 (1984) 47–54.
5. J. Nievergelt, H. Hinterberger, K. Sevcik: The grid file: An adaptable symmetric multikey file structure. ACM Transactions on Database Systems (1984) 38–71.
6. B.C. Ooi: Efficient query processing in geographical information system. Lecture Notes in Computer Science 471, Springer-Verlag (1990).
7. B.C. Ooi, K.L. Tan, C. Yu, S. Bressan: Indexing the Edges - A simple and yet efficient approach to high-dimensional indexing. Proc. ACM SIGMOD-SIGACT-SIGART 19th Symposium on Principles of Database Systems PODS’2000 (2000) 166–174.
8. J. Robinson: The k-d-b tree: A search structure for large multidimensional dynamic indexes. Proc. ACM SIGMOD Int. Conf. on Management of Data (1981) 10–18.
9. R. Weber, S. Blott: An approximation based data structure for similarity search. Technical Report 24, ESPRIT project HERMES (no. 9141) (1997).
10. R. Weber, H.-J. Schek, S. Blott: A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. Proc. Int. Conf. Very Large Data Bases VLDB’98 (1998) 194–205.
11. C. Yu: High-dimensional indexing. PhD Thesis, National University of Singapore (2001).
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, S., Yu, C., Ooi, B.C. (2001). Compressing the Index - A Simple and yet Efficient Approximation Approach to High-Dimensional Indexing. In: Wang, X.S., Yu, G., Lu, H. (eds) Advances in Web-Age Information Management. WAIM 2001. Lecture Notes in Computer Science, vol 2118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47714-4_27
DOI: https://doi.org/10.1007/3-540-47714-4_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42298-3
Online ISBN: 978-3-540-47714-3
eBook Packages: Springer Book Archive