Abstract
Tree-structured indexes typically restrict the search domain level by level, so the search information can be encoded more and more compactly on the way down. This simple observation is formulated here as a general principle of index compression. Saving storage space is one advantage, but the reduction of disk accesses is more important: because more entries can be packed into a page, the index fan-out increases and the average height of the tree decreases. The applicability of compression is studied for several popular one- and multidimensional indexes. Experiments with the well-known spatial index, the R*-tree, show that with modest assumptions and simple coding, a 30-40% reduction of disk accesses is obtainable for intersection queries. Compression of index entries can be used together with other index compaction techniques, such as quantization and pointer list compression.
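The principle stated in the abstract can be illustrated with a minimal sketch. It assumes integer coordinates and a simple fixed-width offset code, in which each bound of a child rectangle is stored as an offset from the lower bound of its parent rectangle. The abstract does not specify the paper's actual coding scheme, so the function names `bits_needed`, `encode_child`, `decode_child`, and `total_bits` are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple

# A rectangle is one (lo, hi) pair of integer bounds per dimension.
Rect = Tuple[Tuple[int, int], ...]
Code = List[Tuple[int, int]]  # (offset, width_in_bits) per stored bound


def bits_needed(span: int) -> int:
    """Bits required to store any integer offset in the range 0..span."""
    return max(1, span.bit_length())


def encode_child(parent: Rect, child: Rect) -> Code:
    """Encode the child's bounds as offsets from the parent's lower bounds.

    Assumption: the child lies entirely inside the parent, as is the case
    for bounding rectangles in an R-tree-like index.
    """
    code: Code = []
    for (plo, phi), (clo, chi) in zip(parent, child):
        assert plo <= clo <= chi <= phi, "child must lie inside parent"
        width = bits_needed(phi - plo)  # narrower parent -> fewer bits
        code.append((clo - plo, width))
        code.append((chi - plo, width))
    return code


def decode_child(parent: Rect, code: Code) -> Rect:
    """Reconstruct the child's absolute bounds from the parent and the code."""
    rect = []
    for d, (plo, _phi) in enumerate(parent):
        lo_off, _ = code[2 * d]
        hi_off, _ = code[2 * d + 1]
        rect.append((plo + lo_off, plo + hi_off))
    return tuple(rect)


def total_bits(code: Code) -> int:
    """Total number of bits used by the encoded bounds."""
    return sum(width for _, width in code)


parent = ((1000, 2000), (500, 600))
child = ((1200, 1300), (510, 590))
code = encode_child(parent, child)
restored = decode_child(parent, code)
```

In this example the four bounds of `child` take 34 bits in total, compared with 128 bits if each were stored as an absolute 32-bit integer. Since the parent's extent shrinks at each level, the same offsets need fewer bits further down the tree, which is what allows more entries to fit in a page.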
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Teuhola, J. (2001). A General Approach to Compression of Hierarchical Indexes. In: Mayr, H.C., Lazansky, J., Quirchmayr, G., Vogel, P. (eds) Database and Expert Systems Applications. DEXA 2001. Lecture Notes in Computer Science, vol 2113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44759-8_75
DOI: https://doi.org/10.1007/3-540-44759-8_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42527-4
Online ISBN: 978-3-540-44759-7
eBook Packages: Springer Book Archive