Abstract
Incomplete databases, that is, databases that are missing data, are present in many research domains. It is important to derive techniques to access these databases efficiently. We first show that the performance of known indexing techniques for multi-dimensional data search breaks down when indexed attributes contain missing data. We then use two widely employed indexing techniques, bitmaps and quantization, to answer queries correctly and efficiently in the presence of missing data. Query execution and interval evaluation are formalized for both indexing structures according to whether missing data is considered a query match or not. The performance of bitmap indexes and quantization-based indexes is evaluated and compared over a variety of analysis parameters for real and synthetic data sets. We also provide insight into the conditions under which each technique should be used.
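To make the two query semantics concrete, the following is a minimal sketch of a range query over a bitmap index that keeps one extra bitmap for missing values. It is an illustration under stated assumptions, not the encoding or algorithm used in the paper: the equality encoding, the use of Python integers as bit vectors, and the names `build_bitmaps`, `range_query`, `matching_rows`, and `missing_is_match` are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def build_bitmaps(column: Sequence[Optional[int]],
                  cardinality: int) -> Tuple[List[int], int]:
    """Build one bitmap per attribute value plus one bitmap for missing data.

    Assumption: values are integers in [0, cardinality) and None marks a
    missing value. Bit i of a bitmap corresponds to row i.
    """
    value_bitmaps = [0] * cardinality
    missing_bitmap = 0
    for row, value in enumerate(column):
        if value is None:
            missing_bitmap |= 1 << row
        else:
            value_bitmaps[value] |= 1 << row
    return value_bitmaps, missing_bitmap


def range_query(value_bitmaps: List[int], missing_bitmap: int,
                lo: int, hi: int, missing_is_match: bool) -> int:
    """Return the bitmap of rows whose value lies in the interval [lo, hi].

    When missing_is_match is True, rows with missing data are also
    reported as matches; otherwise they are excluded.
    """
    result = 0
    for value in range(lo, hi + 1):
        result |= value_bitmaps[value]
    if missing_is_match:
        result |= missing_bitmap
    return result


def matching_rows(bitmap: int, n_rows: int) -> List[int]:
    """Decode a result bitmap into a sorted list of row numbers."""
    return [row for row in range(n_rows) if (bitmap >> row) & 1]


# Hypothetical six-row attribute with two missing values.
column = [2, None, 0, 3, 1, None]
bitmaps, missing = build_bitmaps(column, cardinality=4)

strict = matching_rows(range_query(bitmaps, missing, 1, 2, False), len(column))
loose = matching_rows(range_query(bitmaps, missing, 1, 2, True), len(column))
```

For the interval [1, 2], `strict` holds only the rows that store a value in the interval, while `loose` also includes the two rows with missing data. In this sketch the only difference between the two semantics is one extra OR with the missing-value bitmap.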
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Canahuate, G., Gibas, M., Ferhatosmanoglu, H. (2006). Indexing Incomplete Databases. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_52
DOI: https://doi.org/10.1007/11687238_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32960-2
Online ISBN: 978-3-540-32961-9
eBook Packages: Computer Science (R0)