Abstract

A join of two relations in real databases is usually much smaller than their Cartesian product. This means that most of the combinations of tuples in the cross product of the respective relations do not appear together in the join result. We characterize these combinations as ranges of attributes that do not appear together. We sketch an algorithm for finding such combinations and present experimental results from real data sets. We then explore two potential applications of this knowledge in query processing. In the first application, we model empty joins as materialized views and show how they can be used for query optimization. In the second application, we propose a strategy that uses information about empty joins to improve join selectivity estimation.
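
To make the notion of an empty join concrete, the following is a minimal sketch, not the algorithm from the article. It assumes two toy relations R(a, k) and S(k, b) joined on k, and treats an "empty rectangle" as a pair of ranges over a and b that no tuple of the join result falls into. All names here (`join_pairs`, `is_empty_rectangle`, `query_is_provably_empty`) and the data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def join_pairs(r, s):
    """Equi-join R(a, k) with S(k, b) on k and return the (a, b) pairs."""
    by_key = defaultdict(list)
    for k, b in s:
        by_key[k].append(b)
    return [(a, b) for a, k in r for b in by_key.get(k, [])]


def is_empty_rectangle(pairs, a_range, b_range):
    """True if no joined (a, b) pair lies inside both inclusive ranges."""
    (a_lo, a_hi), (b_lo, b_hi) = a_range, b_range
    return not any(a_lo <= a <= a_hi and b_lo <= b <= b_hi for a, b in pairs)


def query_is_provably_empty(query_a, query_b, empty_rects):
    """True if the query's ranges are contained in a known empty rectangle,
    so the join need not be executed at all (brute-force containment check)."""
    for (a_lo, a_hi), (b_lo, b_hi) in empty_rects:
        if (a_lo <= query_a[0] and query_a[1] <= a_hi
                and b_lo <= query_b[0] and query_b[1] <= b_hi):
            return True
    return False


# Toy data (assumed): small a values only join with large b values, and vice versa.
R = [(1, "x"), (2, "x"), (8, "y"), (9, "y")]
S = [("x", 10), ("x", 12), ("y", 1), ("y", 3)]
pairs = join_pairs(R, S)

# A rectangle that the join never populates, recorded as "known empty".
known_empty = [((1, 5), (1, 5))]
```

With this data, a query restricted to a in [2, 4] and b in [2, 3] falls inside the recorded rectangle, so `query_is_provably_empty((2, 4), (2, 3), known_empty)` holds and the join could be skipped. A query whose ranges extend beyond the rectangle would still have to be evaluated.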

Author information

Corresponding author

Correspondence to Jarek Gryz.

About this article

Cite this article

Gryz, J., Liang, D. Holes in joins. J Intell Inf Syst 26, 247–268 (2006). https://doi.org/10.1007/s10844-006-0368-2

  • DOI: https://doi.org/10.1007/s10844-006-0368-2
