Abstract
Set-valued attributes are a convenient way to model complex real-world objects. Currently available database systems can store set-valued attributes in relational tables but provide no primitives for querying them efficiently. Queries involving set-valued attributes either perform full scans of the source data or make multiple passes over single-value indexes to reduce the number of retrieved tuples. Existing techniques for indexing set-valued attributes (e.g., inverted files, signature indexes, or RD-trees) are not efficient enough to support fast access to set-valued data in very large databases.
In this paper we present the hierarchical bitmap index, a novel technique for indexing set-valued attributes. The index handles sets of arbitrary length, and its performance is not affected by the size of the indexed domain. It efficiently supports different classes of queries, including subset, superset, and similarity queries. Our experiments show that the hierarchical bitmap index significantly outperforms other set indexing techniques.
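The three query classes named above can be illustrated with a minimal sketch. It is not the hierarchical bitmap index itself, whose structure the abstract does not describe. It assumes that items are small non-negative integer ids, that each set is encoded as one flat bitmap held in a Python integer, and that Jaccard similarity stands in for the similarity measure. All function and variable names are hypothetical. Naming conventions for "subset" and "superset" queries differ between papers, so the functions are named by the containment they test.

```python
from typing import Dict, Iterable, List


def to_bitmap(items: Iterable[int]) -> int:
    """Encode a set of non-negative item ids as an integer bitmap."""
    bitmap = 0
    for item in items:
        bitmap |= 1 << item
    return bitmap


def contains_all(stored: int, query: int) -> bool:
    """True if every item of the query set is present in the stored set."""
    return stored & query == query


def is_contained_in(stored: int, query: int) -> bool:
    """True if every item of the stored set is present in the query set."""
    return stored & query == stored


def jaccard(a: int, b: int) -> float:
    """Jaccard similarity |a AND b| / |a OR b| of two bitmaps (assumed measure)."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # two empty sets are treated as identical
    return bin(a & b).count("1") / union


# Hypothetical data: three stored sets keyed by row id.
rows: Dict[int, set] = {1: {0, 3, 5}, 2: {3, 5}, 3: {0, 1, 3, 5, 7}}
index: Dict[int, int] = {row_id: to_bitmap(s) for row_id, s in rows.items()}

# Rows whose set contains every item of {3, 5}.
q1 = to_bitmap({3, 5})
containing: List[int] = [r for r, b in index.items() if contains_all(b, q1)]

# Rows whose set lies entirely inside {0, 3, 5, 9}.
q2 = to_bitmap({0, 3, 5, 9})
contained: List[int] = [r for r, b in index.items() if is_contained_in(b, q2)]

# Similarity between rows 1 and 2.
similarity = jaccard(index[1], index[2])
```

This sketch still scans every stored bitmap linearly, and each bitmap grows with the largest item id. These are the costs the abstract says the hierarchical organization is meant to avoid.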
Work supported by a Bilateral Greek-Polish Program
© 2003 Springer-Verlag Berlin Heidelberg
Morzy, M., Morzy, T., Nanopoulos, A., Manolopoulos, Y. (2003). Hierarchical Bitmap Index: An Efficient and Scalable Indexing Technique for Set-Valued Attributes. In: Kalinichenko, L., Manthey, R., Thalheim, B., Wloka, U. (eds) Advances in Databases and Information Systems. ADBIS 2003. Lecture Notes in Computer Science, vol 2798. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39403-7_19
DOI: https://doi.org/10.1007/978-3-540-39403-7_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20047-5
Online ISBN: 978-3-540-39403-7