Skip to main content
Log in

Exploiting maximal redundancy to optimize SQL queries

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

Detecting and dealing with redundancy is an ubiquitous problem in query optimization, which manifests itself in many areas of research such as materialized views, multi-query optimization, and query-containment algorithms. In this paper, we focus on the issue of intra-query redundancy, redundancy present within a query. We present a method to detect the maximal redundancy present between a main (outer) query block and a subquery block. We then use the method for query optimization, introducing query plans and a new operator that take full advantage of the redundancy discovered. Our approach can deal with redundancy in a wider spectrum of queries than existing techniques. We show experimental evidence that our approach works under certain conditions, and compares favorably to existing optimization techniques when applicable.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Akinde M, Bohlen M (2003) Efficient computation of subqueries in complex olap. In: Proceedings of the ICDE Conference, pp 163–174

  2. Chatziantoniou D, Ross KA (1996) Querying multiple features of groups in relational databases. In: Proceedings of the VLDB Conference, pp 295–306

  3. Chatziantoniou D, Ross KA (1997) Groupwise processing of relational queries. In: Proceedings of the VLDB Conference, pp 476–485

  4. Council TPP (n.d.) The tpc-h benchmark. http://www.tpc.org/tpch

  5. Dayal U (1987) Of nests and trees: a unified approach to processing queries that contain nested subqueries, aggregates, and quantifiers. In: Proceedings of the VLDB Conference, pp 197–208

  6. Galindo-Legaria CA, Joshi MM (2001) Orthogonal optimization of subqueries and aggregation. In: Proceedings of the SIGMOD Conference, pp 571–581

  7. Ganski RA, Wong HKT (1987) Optimization of nested sql queries revisited. In: Proceedings of the SIGMOD Conference, pp 23–33

  8. Goel P, Iyer B (1996) Sql query optimization: Reordering for a general class of queries. In: Proceedings of the SIGMOD Conference, pp 47–56

  9. Goldstein J, Larson P-A (2001) Optimizing queries using materialized views: a practical, scalable solution. In: Proceedings of the SIGMOD Conference, pp 331–342

  10. Gupta A, Harinarayan V, Quass D (1995) Aggregate-query processing in data warehousing environments. In: Proceedings of the VLDB Conference, pp 358–369

  11. Halevy A (2001) Answering queries using views: a survey. VLDB J 10(4): 270–294

    Article  MATH  Google Scholar 

  12. Hellerstein J (1998) Optimization techniques for queries with expensive methods. ACM Trans Database Syst 23(2): 113–157

    Article  MathSciNet  Google Scholar 

  13. Kim W (1982) On optimizing an sql-like nested query. ACM Trans Database Syst 7(3): 443–469

    Article  MATH  Google Scholar 

  14. Kossmann D (2000) The state of the art in distributed query processing. ACM Comput Surv 32(4): 422–469

    Article  Google Scholar 

  15. Li Z, Ng WK, Sun A (2005) Web data extraction based on structural similarity. Knowl Inf Syst 8(4): 438–461

    Article  Google Scholar 

  16. Lu H, Chan HC, Wei KK (1993) A survey on usage of sql. ACM SIGMOD Record 22(4): 60–65

    Article  Google Scholar 

  17. Millstein T, Halevy A, Friedman M (2003) Query containment for data integration systems. J Comp Syst Sci 66(1): 10–39

    Article  MathSciNet  Google Scholar 

  18. Muralikrishna M (1989) Optimization and dataflow algorithms for nested tree queries. In: Proceedings of the VLDB conference, pp 77–85

  19. Muralikrishna M (1992) Improved unnesting algorithms for join aggregate sql queries. In: Proceedings of the VLDB conference, pp 91–102

  20. Park J, Segev A (1988) Using common subexpressions to optimize multiple queries. In: Proceedings of the ICDE Conference, pp 311–319

  21. Rao J, Ross KA (1998) Reusing invariants: a new strategy for correlated queries. In: Proceedings of the SIGMOD conference, pp 37–48

  22. Roy P, Seshadri S, Sudarshan S, Bhobe S (2000) Efficient and extensible algorithms for multi query optimization. In: Proceedings of the SIGMOD conference, pp 249–260

  23. Sellis TK (1988) Multiple-query optimization. ACM Trans Database Syst 13(1): 23–52

    Article  Google Scholar 

  24. Seshadri P, Pirahesh H, Leung TYC (1996) Complex query decorrelation. In: Proceedings of the ICDE conference, pp 450–458

  25. Ullman J (2000) Information integration using logical views. Theor Comp Sci 239(2): 189–210

    Article  MATH  MathSciNet  Google Scholar 

  26. Wan X (2008) Beyond topical similarity: a structural similarity measure for retrieving highly similar documents. Knowl Inf Syst 15(1): 55–73

    Article  Google Scholar 

  27. Zaharioudakis M, Cochrane R, Lapis G, Pirahesh H, Urata M (2000) Answering complex sql queries using automatic summary tables. In: Proceedings of the SIGMOD conference, pp 105–116

  28. Zhou J, Larson P-A, Freytag J-C, Lehner W (2007) Efficient explointation of similar subexpressions for query processing. In: Proceedings of the SIGMOD conference, pp 533–544

  29. Zhu Q, Tao Y, Zuzarte C (2005) Optimizing complex queries based on similarities of subqueries. Knowl Inf Syst 8(3): 350–373

    Article  Google Scholar 

  30. Zuzarte C, Pirahesh H, Ma W, Cheng Q, Liu L, Wong K (2003) Winmagic: Subquery elimination using window aggregation. In: Proceedings of the SIGMOD conference, pp 652–656

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Bin Cao.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Cao, B., Badia, A. Exploiting maximal redundancy to optimize SQL queries. Knowl Inf Syst 20, 187–220 (2009). https://doi.org/10.1007/s10115-008-0156-0

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-008-0156-0

Keywords

Navigation