Abstract
Bitmap indexes are known to be efficient for ad-hoc range queries that are common in data warehousing and scientific applications. However, they suffer from the curse of cardinality, that is, their efficiency deteriorates as attribute cardinalities increase. A number of strategies have been proposed, but none of them addresses the problem adequately. In this paper, we propose a novel binned bitmap index that greatly reduces the cost of answering queries, and therefore breaks the curse of cardinality. The key idea is to augment the binned index with an Order-preserving Bin-based Clustering (OrBiC) structure. This data structure significantly reduces the I/O operations needed to resolve records that cannot be resolved with the bitmaps. To further improve the proposed index structure, we also present a strategy to create single-valued bins for frequent values. This strategy reduces index sizes and improves query processing speed. Overall, the binned indexes with OrBiC greatly improve the query processing speed, and are 3 to 25 times faster than the best available indexes for high-cardinality data.
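To make the idea of resolving records "that cannot be resolved with the bitmaps" concrete, the following is a minimal in-memory sketch of a binned bitmap index with a candidate check on partially covered bins. It is an illustration under stated assumptions, not the paper's implementation: the function names `build_binned_index` and `range_query`, the half-open bin convention, and the use of Python integers as bitmaps are all hypothetical choices, and the per-bin value lists only stand in for the OrBiC structure, whose actual layout is not given in the abstract.

```python
from bisect import bisect_right


def build_binned_index(values, boundaries):
    """Build one bitmap per bin plus a bin-clustered copy of the values.

    Bin i holds values v with boundaries[i] <= v < boundaries[i + 1].
    Each bitmap is a Python int used as a bit set (bit r set = row r in bin).
    The per-bin lists are an assumed stand-in for an OrBiC-like structure:
    values are grouped by bin and keep their original row order.
    """
    nbins = len(boundaries) - 1
    bitmaps = [0] * nbins
    clustered = [[] for _ in range(nbins)]
    for row, v in enumerate(values):
        b = bisect_right(boundaries, v) - 1
        if not 0 <= b < nbins:
            raise ValueError(f"value {v!r} falls outside the bin boundaries")
        bitmaps[b] |= 1 << row
        clustered[b].append((row, v))
    return bitmaps, clustered


def range_query(index, boundaries, lo, hi):
    """Return the sorted row ids whose value satisfies lo <= value < hi."""
    bitmaps, clustered = index
    hits = 0
    for b, bitmap in enumerate(bitmaps):
        b_lo, b_hi = boundaries[b], boundaries[b + 1]
        if hi <= b_lo or lo >= b_hi:
            continue  # bin lies entirely outside the query range
        if lo <= b_lo and b_hi <= hi:
            hits |= bitmap  # bin fully covered: the bitmap alone resolves it
        else:
            # Edge bin: the bitmap only yields candidates, so check the
            # actual values, read here from the bin's contiguous list.
            for row, v in clustered[b]:
                if lo <= v < hi:
                    hits |= 1 << row
    return [row for row in range(hits.bit_length()) if (hits >> row) & 1]


values = [5, 12, 7, 25, 18, 3, 30, 14]
boundaries = [0, 10, 20, 40]
index = build_binned_index(values, boundaries)
result = range_query(index, boundaries, 7, 20)
```

In this sketch only the edge bin `[0, 10)` needs a candidate check for the query `7 <= value < 20`; the bin `[10, 20)` is answered from its bitmap alone. Keeping each bin's values together means the check touches one contiguous group instead of scattered rows of the base data, which is the intuition behind the reduced I/O described above.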
This work was supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, K., Stockinger, K., Shoshani, A. (2008). Breaking the Curse of Cardinality on Bitmap Indexes. In: Ludäscher, B., Mamoulis, N. (eds) Scientific and Statistical Database Management. SSDBM 2008. Lecture Notes in Computer Science, vol 5069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69497-7_23
DOI: https://doi.org/10.1007/978-3-540-69497-7_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69476-2
Online ISBN: 978-3-540-69497-7
eBook Packages: Computer Science, Computer Science (R0)