
A General Approach to Compression of Hierarchical Indexes

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2113)

Abstract

Tree-structured indexes typically restrict the search domain level by level, so the search information can be encoded more and more compactly on the way down. This simple observation is formulated here as a general principle of index compression. Saving storage space is one advantage, but the reduction of disk accesses is more important: because more entries can be packed into a page, the index fan-out increases and the average height of the tree decreases. The applicability of compression is studied for several popular one- and multidimensional indexes. Experiments with the well-known spatial index, the R*-tree, show that with modest assumptions and simple coding, a 30-40% reduction of disk accesses is obtainable for intersection queries. Compression of index entries can be used together with other index compaction techniques, such as quantization and pointer list compression.
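The principle can be illustrated with a minimal sketch. It is not the coding scheme used in the paper; it only assumes integer coordinates, fixed-width offsets relative to the parent's lower bound, a 4 KiB page, and 32-bit child pointers. All function names are hypothetical. Because a child interval always lies inside its parent's interval, its bounds can be stored as offsets that need only as many bits as the parent's extent requires, and the decoder recovers the absolute values from the parent bounds it already knows from the traversal.

```python
def bits_for_range(lo: int, hi: int) -> int:
    """Bits needed to address any integer offset within the closed range [lo, hi]."""
    return max(1, (hi - lo).bit_length())


def encode_relative(child, parent):
    """Encode a child interval as offsets from the parent's lower bound.

    Returns (offset_lo, offset_hi, bits_per_coordinate).
    """
    plo, phi = parent
    clo, chi = child
    if not (plo <= clo <= chi <= phi):
        raise ValueError("child interval must lie inside the parent interval")
    return (clo - plo, chi - plo, bits_for_range(plo, phi))


def decode_relative(code, parent):
    """Recover the absolute child interval from its offsets and the parent bounds."""
    off_lo, off_hi, _ = code
    return (parent[0] + off_lo, parent[0] + off_hi)


def entries_per_page(page_bits: int, coord_bits: int,
                     pointer_bits: int = 32, dims: int = 2) -> int:
    """Fan-out estimate: each entry holds 2 coordinates per dimension plus a pointer."""
    return page_bits // (2 * dims * coord_bits + pointer_bits)


if __name__ == "__main__":
    root = (0, 2**32 - 1)      # assumed full coordinate domain
    parent = (1000, 1255)      # domain already restricted by an upper level
    child = (1010, 1200)

    code = encode_relative(child, parent)
    print("root coordinate bits:", bits_for_range(*root))
    print("relative code (lo, hi, bits):", code)
    print("decoded child:", decode_relative(code, parent))

    page_bits = 4096 * 8       # assumed 4 KiB page
    print("entries per page, absolute:", entries_per_page(page_bits, 32))
    print("entries per page, relative:", entries_per_page(page_bits, code[2]))
```

Under these assumptions a coordinate shrinks from 32 bits to 8 bits once the parent has narrowed the domain to 256 values, and a two-dimensional entry shrinks from 160 to 64 bits, which is what allows the larger fan-out mentioned above. A real index would also have to store or derive the bit width per node and handle variable-length entries, which this sketch ignores.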


© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Teuhola, J. (2001). A General Approach to Compression of Hierarchical Indexes. In: Mayr, H.C., Lazansky, J., Quirchmayr, G., Vogel, P. (eds) Database and Expert Systems Applications. DEXA 2001. Lecture Notes in Computer Science, vol 2113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44759-8_75

  • DOI: https://doi.org/10.1007/3-540-44759-8_75

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42527-4

  • Online ISBN: 978-3-540-44759-7

  • eBook Packages: Springer Book Archive
