Abstract
Flaky tests can pass or fail seemingly nondeterministically for the same code under test. They are detrimental to regression testing because a test that passes before a code change and fails after it does not reliably indicate a problem in that change. An important category of flaky tests is order-dependent tests, which pass or fail depending on the order in which the tests in the test suite are run. Prior work has considered the problem of counting the test orders that pass or fail, given the relationships among tests within a test suite, but it has not addressed the most general case of these relationships. This paper shows how to encode the problem of counting test orders in the Alloy modeling language and how to use propositional model counters to obtain the counts. We illustrate that Alloy makes it easy to handle even the most general case. The results show that this problem produces propositional formulas that are challenging for state-of-the-art model counters.
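To make the counting problem concrete, the following is a minimal brute-force sketch in Python; it is not the Alloy encoding described in the paper. It assumes one simple, hypothetical relationship model: a single "victim" test fails if some "polluter" test ran before it and no "cleaner" test ran in between. The role names and the functions `victim_passes` and `count_passing_orders` are illustrative assumptions and do not come from this abstract.

```python
from itertools import permutations


def victim_passes(order, victim, polluters, cleaners):
    """Return True if the victim passes in this order (assumed model).

    Assumption: shared state is "polluted" after any polluter runs and
    is reset by any cleaner; the victim passes only on unpolluted state.
    """
    polluted = False
    for test in order:
        if test == victim:
            return not polluted
        if test in polluters:
            polluted = True
        elif test in cleaners:
            polluted = False
    raise ValueError("victim does not appear in the order")


def count_passing_orders(tests, victim, polluters, cleaners):
    """Count the test orders in which the victim passes by enumerating
    every permutation of the suite (feasible only for tiny suites)."""
    return sum(
        victim_passes(order, victim, polluters, cleaners)
        for order in permutations(tests)
    )


if __name__ == "__main__":
    # Hypothetical suite: victim "v", polluter "p", cleaner "c", unrelated "n".
    suite = ["v", "p", "c", "n"]
    print(count_passing_orders(suite, "v", {"p"}, {"c"}))  # prints 16 (of 24 orders)
```

Enumeration grows factorially with the number of tests, which is one reason to hand the question to a model counter instead, as the abstract describes.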
Acknowledgment
We thank Wing Lam and Anjiang Wei for discussions on counting test orders. This work was partially supported by NSF grant CCF-1763788. We also acknowledge support for research on flaky tests from Facebook and Google.
Copyright information
© 2022 IFIP International Federation for Information Processing
About this paper
Cite this paper
Wang, W., Yi, P., Khurshid, S., Marinov, D. (2022). Initial Results on Counting Test Orders for Order-Dependent Flaky Tests Using Alloy. In: Clark, D., Menendez, H., Cavalli, A.R. (eds) Testing Software and Systems. ICTSS 2021. Lecture Notes in Computer Science, vol 13045. Springer, Cham. https://doi.org/10.1007/978-3-031-04673-5_9
DOI: https://doi.org/10.1007/978-3-031-04673-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04672-8
Online ISBN: 978-3-031-04673-5
eBook Packages: Computer Science, Computer Science (R0)