First Past the Post: Evaluating Query Optimization in MongoDB

  • Conference paper
Databases Theory and Applications (ADC 2024)

Abstract

Query optimization is crucial for every DBMS to enable fast execution of declarative queries. While most DBMS designs include cost-based query optimization, MongoDB chooses an execution plan by what we call “first past the post” (FPTP) query optimization. This partially executes the alternative plans in a round-robin race and observes the work done by each relative to the number of records returned. Through experiments, we analyze the effectiveness of MongoDB’s FPTP query optimizer, concluding that it chooses index scans even in many cases where collection scans would run faster. We identify the reasons for this.
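The race described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not MongoDB's implementation: the names `CandidatePlan`, `run_race`, and `every_nth`, the `target` of 3 results, and the rule that the race stops as soon as one plan reaches the target or runs out of input are all hypothetical choices made for illustration. Each plan is modelled as a stream of work units, where a unit either returns a record or does not.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class CandidatePlan:
    """One plan in the race (hypothetical structure, for illustration only)."""
    name: str
    steps: Iterator[bool]  # one item per unit of work; True means a record was returned
    work: int = 0
    returned: int = 0

    def productivity(self) -> float:
        # Records returned relative to work done during the trial.
        return self.returned / self.work if self.work else 0.0


def every_nth(n: int, total: int) -> Iterator[bool]:
    """Yield `total` work units; every n-th unit returns a record."""
    return iter((i % n == 0) for i in range(1, total + 1))


def run_race(plans: List[CandidatePlan], target: int = 3,
             max_rounds: int = 1000) -> CandidatePlan:
    """Advance each plan one work unit per round, then pick the most productive.

    Assumed stopping rule: the race ends when any plan has returned
    `target` records or has no more work to do.
    """
    finished = False
    for _ in range(max_rounds):
        for plan in plans:
            step = next(plan.steps, None)
            if step is None:  # this plan ran out of input
                finished = True
                break
            plan.work += 1
            plan.returned += int(step)
            if plan.returned >= target:  # first past the post
                finished = True
                break
        if finished:
            break
    return max(plans, key=CandidatePlan.productivity)


# Toy workload: the index scan returns a record on every work unit,
# the collection scan only on every fourth unit.
index_scan = CandidatePlan("index_scan", every_nth(1, 50))
collection_scan = CandidatePlan("collection_scan", every_nth(4, 50))
winner = run_race([index_scan, collection_scan])
print(winner.name, index_scan.work, collection_scan.work)
```

The sketch counts abstract work units only. It does not model how long a unit of work takes for each plan, so it says nothing about which plan would finish the full query sooner in wall-clock time.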

Funding: Australian Research Council Linkage Project grant LP160100883.

Notes

  1. https://mongodb.com/docs/manual/reference/glossary/#std-term-query-shape

Corresponding author

Correspondence to Alan Fekete.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Tao, D., Liu, E., Randeni Kadupitige, S., Cahill, M., Fekete, A., Röhm, U. (2025). First Past the Post: Evaluating Query Optimization in MongoDB. In: Chen, T., Cao, Y., Nguyen, Q.V.H., Nguyen, T.T. (eds) Databases Theory and Applications. ADC 2024. Lecture Notes in Computer Science, vol 15449. Springer, Singapore. https://doi.org/10.1007/978-981-96-1242-0_8

  • DOI: https://doi.org/10.1007/978-981-96-1242-0_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-96-1241-3

  • Online ISBN: 978-981-96-1242-0

  • eBook Packages: Computer Science (R0)
