Abstract
In this paper we present a new compression scheme for signature tree structures. Beyond reducing storage space, compression yields significant savings in query processing cost. The latter is critically important for large collections of set-valued data, e.g., in object-relational databases, where signature tree structures find important applications. The proposed scheme works on a per-node basis, reorganizing node entries according to their similarity, which results in sparse bit vectors that can be drastically compressed. Experimental results illustrate the efficiency gains due to the proposed scheme, especially for interesting real-world cases, such as market-basket data or Web-server logs.
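The per-node idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes signatures are stored as Python integers, uses a greedy nearest-neighbour ordering by Hamming distance as a stand-in for "reorganizing node entries according to their similarity", and uses XOR differences between consecutive entries as the resulting sparse bit vectors, stored as lists of set-bit positions. All function names are hypothetical.

```python
from typing import List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def reorder_by_similarity(sigs: List[int]) -> List[int]:
    """Greedy ordering (an assumption): each entry is followed by the
    remaining entry closest to it in Hamming distance."""
    if not sigs:
        return []
    remaining = list(sigs)
    ordered = [remaining.pop(0)]
    while remaining:
        last = ordered[-1]
        i = min(range(len(remaining)), key=lambda k: hamming(last, remaining[k]))
        ordered.append(remaining.pop(i))
    return ordered


def compress_node(sigs: List[int]) -> Tuple[Optional[int], List[List[int]]]:
    """Keep the first signature verbatim; store every later one as the
    positions of the bits that differ from its predecessor."""
    ordered = reorder_by_similarity(sigs)
    if not ordered:
        return None, []
    deltas = []
    for prev, cur in zip(ordered, ordered[1:]):
        diff = prev ^ cur  # sparse when neighbours are similar
        deltas.append([i for i in range(diff.bit_length()) if (diff >> i) & 1])
    return ordered[0], deltas


def decompress_node(base: Optional[int], deltas: List[List[int]]) -> List[int]:
    """Rebuild the reordered signatures by re-applying each difference."""
    if base is None:
        return []
    out = [base]
    for positions in deltas:
        cur = out[-1]
        for p in positions:
            cur ^= 1 << p  # flip each differing bit back
        out.append(cur)
    return out


# Example node holding four 8-bit signatures (made-up values).
node = [0b10110000, 0b00001111, 0b10110001, 0b00001110]
base, deltas = compress_node(node)
restored = decompress_node(base, deltas)
```

In this sketch the entries come back in the reordered sequence, not the original one, which is acceptable only if entry order within a node carries no meaning. A real implementation would also need to keep each entry's child pointer attached to its signature.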
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kontaki, M., Manolopoulos, Y., Nanopoulos, A. (2003). Compressing Large Signature Trees. In: Kalinichenko, L., Manthey, R., Thalheim, B., Wloka, U. (eds) Advances in Databases and Information Systems. ADBIS 2003. Lecture Notes in Computer Science, vol 2798. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39403-7_14
DOI: https://doi.org/10.1007/978-3-540-39403-7_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20047-5
Online ISBN: 978-3-540-39403-7
eBook Packages: Springer Book Archive