Skip to main content

SAM: A Sorting Approach for Optimizing Multijoin Queries

  • Conference paper
  • First Online:
Database and Expert Systems Applications (Globe 2015, DEXA 2015)

Abstract

Finding the optimal join order for a multijoin query is an old, yet very important topic for relational database systems. It has been studied for the last few decades and proven to be NP-hard. The mainstream techniques, first proposed in System R, are based on dynamic programming. These techniques are widely adopted by commercial database systems.

However, it is well known that such approaches suffer from exponential running time in finding the optimal join order for most queries, except simple ones like linear queries. Therefore, a query optimizer must resort to finding a suboptimal join order when the number of tables is large.

This paper proposes SAM, which departs from current practice in two ways: (1) SAM orders the joining attributes before ordering the tables; (2) SAM sorts the tables by comparing selectivities for “table blocks”. This approach reduces the exponential time complexity in the optimization; in particular, it can find, in polynomial time, the optimal ordering for clique queries that take exponential time to optimize by dynamic programming.

Experiments comparing SAM to the query optimizers in MySQL and PostgreSQL, using real data, show that its performance is similar for small queries, but much better for large queries.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    http://www.postgresql.org/.

  2. 2.

    http://www.postgresql.org/docs/9.4/static/geqo-intro2.html.

  3. 3.

    http://www.mysql.com/.

  4. 4.

    http://dev.mysql.com/doc/internals/en/optimizer-joins-access-methods.html.

  5. 5.

    Thanks to Craig Rodkin at ACM Headquarters for providing the ACMDL dataset.

References

  1. Bao, Z., Tay, Y.C., Zhou, J.: sonSchema: a conceptual schema for social networks. In: Ng, W., Storey, V.C., Trujillo, J.C. (eds.) ER 2013. LNCS, vol. 8217, pp. 197–211. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  2. DeHaan, D., Tompa, F.W.: Optimal top-down join enumeration. In: Proceedings of the SIGMOD, pp. 785–796 (2007)

    Google Scholar 

  3. Fegaras, L.: A new heuristic for optimizing large queries. In: Quirchmayr, G., Bench-Capon, T.J.M., Schweighofer, E. (eds.) DEXA 1998. LNCS, vol. 1460, pp. 726–735. Springer, Heidelberg (1998)

    Chapter  Google Scholar 

  4. Ibaraki, T., Kameda, T.: On the optimal nesting order for computing n-relational joins. ACM Trans. Database Syst. 9(3), 482–502 (1984)

    Article  MathSciNet  Google Scholar 

  5. Kossmann, D., Stocker, K.: Iterative dynamic programming: a new class of query optimization algorithms. ACM Trans. Database Syst. 25(1), 43–82 (2000)

    Article  Google Scholar 

  6. Lee, C., Shih, C.-S., Chen, Y.-H.: Optimizing large join queries using a graph-based approach. IEEE Trans. Knowl. Data Eng. 13(2), 298–315 (2001)

    Article  Google Scholar 

  7. Moerkotte, G., Neumann, T.: Analysis of two existing and one new dynamic programming algorithm for the generation of optimal bushy join trees without cross products. In: Proceedings of the VLDB, pp. 930–941 (2006)

    Google Scholar 

  8. Moerkotte, G., Neumann, T.: Dynamic programming strikes back. In: Proceedings of the SIGMOD, pp. 539–552 (2008)

    Google Scholar 

  9. Neumann, T.: Query simplification: graceful degradation for join-order optimization. In: Proceedings of the SIGMOD, pp. 403–414 (2009)

    Google Scholar 

  10. Ngo, H.Q., Porat, E., Ré, C., Rudra, A.: Worst-case optimal join algorithms: [extended abstract]. In: Proceedings of the PODS, pp. 37–48 (2012)

    Google Scholar 

  11. Sacco, G.M.: Truly adaptive optimization: the basic ideas. In: Bressan, S., Küng, J., Wagner, R. (eds.) DEXA 2006. LNCS, vol. 4080, pp. 751–760. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  12. Selinger, P.G., Astrahan, M.M., Chamberlin, D.D., et al.: Access path selection in a relational database management system. In: Proceedings of the SIGMOD, pp. 23–34 (1979)

    Google Scholar 

  13. Steinbrunn, M., Moerkotte, G., Kemper, A.: Heuristic and randomized optimization for the join ordering problem. VLDB J. 6(3), 191–208 (1997)

    Article  MATH  Google Scholar 

  14. Swami, A., Gupta, A.: Optimization of large join queries. In: Proceedings of the SIGMOD, pp. 8–17 (1988)

    Google Scholar 

  15. Vance, B., Maier, D.: Rapid bushy join-order optimization with cartesian products. In: Proceedings of the SIGMOD, pp. 35–46 (1996)

    Google Scholar 

Download references

Acknowledgement

We thank Zhifeng Bao for helpful discussions and comments.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yong Zeng .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zeng, Y., Lu, A.N., Xia, L., Tian, C.X., Tay, Y.C. (2015). SAM: A Sorting Approach for Optimizing Multijoin Queries. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe DEXA 2015 2015. Lecture Notes in Computer Science(), vol 9261. Springer, Cham. https://doi.org/10.1007/978-3-319-22849-5_25

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-22849-5_25

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22848-8

  • Online ISBN: 978-3-319-22849-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics