Compressing Large Signature Trees

  • Conference paper
Advances in Databases and Information Systems (ADBIS 2003)

Abstract

In this paper we present a new compression scheme for signature tree structures. Beyond reducing storage space, compression yields significant savings in query processing. The latter is of critical importance for large collections of set-valued data, e.g., in object-relational databases, where signature tree structures find important applications. The proposed scheme works on a per-node basis, reorganizing node entries according to their similarity, which results in sparse bit vectors that can be drastically compressed. Experimental results illustrate the efficiency gains due to the proposed scheme, especially for interesting real-world cases, such as market-basket data or Web-server logs.
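The abstract does not give the details of the scheme, so the following is only a minimal sketch of the general idea, under stated assumptions: signatures are modelled as Python integers, the entries of one node are reordered greedily so that similar signatures become neighbours, and each entry after the first is stored as the XOR difference from its predecessor, encoded as a list of set-bit positions. The function names, the greedy ordering, and the XOR-delta encoding are all illustrative choices, not the authors' algorithm.

```python
from typing import List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def order_by_similarity(sigs: List[int]) -> List[int]:
    """Greedy ordering (an assumption): always append the remaining
    signature closest to the one placed last."""
    if not sigs:
        return []
    remaining = list(sigs)
    ordered = [remaining.pop(0)]
    while remaining:
        last = ordered[-1]
        idx = min(range(len(remaining)),
                  key=lambda i: hamming(last, remaining[i]))
        ordered.append(remaining.pop(idx))
    return ordered


def set_bits(x: int) -> List[int]:
    """Positions of the 1-bits of x, least significant bit first."""
    positions = []
    pos = 0
    while x:
        if x & 1:
            positions.append(pos)
        x >>= 1
        pos += 1
    return positions


def compress_node(sigs: List[int]) -> Tuple[Optional[int], List[List[int]]]:
    """Return the first signature verbatim plus sparse XOR deltas
    between consecutive signatures in the similarity order."""
    ordered = order_by_similarity(sigs)
    if not ordered:
        return None, []
    deltas = [set_bits(prev ^ cur)
              for prev, cur in zip(ordered, ordered[1:])]
    return ordered[0], deltas


def decompress_node(base: Optional[int],
                    deltas: List[List[int]]) -> List[int]:
    """Invert compress_node: re-apply each delta to the previous signature."""
    if base is None:
        return []
    sigs = [base]
    for delta in deltas:
        cur = sigs[-1]
        for p in delta:
            cur ^= 1 << p
        sigs.append(cur)
    return sigs


# Hypothetical node with four 8-bit signatures.
node = [0b10110000, 0b00001111, 0b10110001, 0b00001110]
base, deltas = compress_node(node)
restored = decompress_node(base, deltas)
```

In this sketch the entry order inside the node changes, so any child pointers stored alongside the signatures would have to be permuted in the same way. The deltas between similar neighbours contain few set bits, which is what makes a sparse encoding worthwhile.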

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kontaki, M., Manolopoulos, Y., Nanopoulos, A. (2003). Compressing Large Signature Trees. In: Kalinichenko, L., Manthey, R., Thalheim, B., Wloka, U. (eds) Advances in Databases and Information Systems. ADBIS 2003. Lecture Notes in Computer Science, vol 2798. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39403-7_14

  • DOI: https://doi.org/10.1007/978-3-540-39403-7_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20047-5

  • Online ISBN: 978-3-540-39403-7

  • eBook Packages: Springer Book Archive
