Initial Results on Counting Test Orders for Order-Dependent Flaky Tests Using Alloy

Wang, Wenxi; Yi, Pu; Khurshid, Sarfraz; Marinov, Darko

doi:10.1007/978-3-031-04673-5_9

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13045)

Included in the following conference series:

IFIP International Conference on Testing Software and Systems

Abstract

Flaky tests can seemingly nondeterministically pass or fail for the same code under test. Flaky tests are detrimental to regression testing because tests that pass before code changes and fail after code changes do not reliably indicate problems in code changes. An important category of flaky tests is order-dependent tests that pass or fail based on the order of tests in the test suite. Prior work has considered the problem of counting test orders that pass or fail, given relationships of tests within a test suite. However, prior work has not addressed the most general case of these relationships. This paper shows how to encode the problem of counting test orders in the Alloy modeling language and how to use propositional model counters to obtain the count for test orders. We illustrate that Alloy makes it easy to handle even the most general case. The results show that this problem produces challenging propositional formulas for the state-of-the-art model counters.
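For intuition about what "counting test orders" means, the following is a minimal brute-force sketch. It is not the Alloy encoding the paper describes and it does not use a model counter. It assumes a simplified rule borrowed from the order-dependent test literature: a "polluter" test sets shared state, a "cleaner" test resets it, and a "victim" test fails if the state is polluted when the victim runs. The function names, the test names, and the rule itself are illustrative assumptions.

```python
from itertools import permutations


def victim_fails(order, victim, polluters, cleaners):
    """Return True if `victim` fails in `order` under the assumed rule.

    Assumed rule (hypothetical, for illustration only): shared state becomes
    polluted when a polluter runs, is reset when a cleaner runs, and the
    victim fails if the state is polluted at the moment the victim runs.
    """
    polluted = False
    for test in order:
        if test == victim:
            return polluted
        if test in polluters:
            polluted = True
        elif test in cleaners:
            polluted = False
    return False  # the victim does not appear in this order


def count_failing_orders(tests, victim, polluters, cleaners):
    """Count the permutations of `tests` in which `victim` fails."""
    return sum(
        1
        for order in permutations(tests)
        if victim_fails(order, victim, polluters, cleaners)
    )


if __name__ == "__main__":
    # One victim "v", one polluter "p", one cleaner "c", one unrelated test "x".
    tests = ["v", "p", "c", "x"]
    failing = count_failing_orders(tests, "v", {"p"}, {"c"})
    print(f"{failing} of 24 orders fail")
```

Enumeration visits all n! orders, so it is only practical for very small suites; that growth is one reason to encode the problem symbolically and delegate the counting to a model counter, as the abstract proposes.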

Acknowledgment

We thank Wing Lam and Anjiang Wei for discussions on counting test orders. This work was partially supported by NSF grant CCF-1763788. We also acknowledge support for research on flaky tests from Facebook and Google.

Author information

Authors and Affiliations

The University of Texas at Austin, Austin, USA
Wenxi Wang & Sarfraz Khurshid
Peking University, Beijing, China
Pu Yi
University of Illinois Urbana-Champaign, Champaign, USA
Darko Marinov

Corresponding author

Correspondence to Wenxi Wang.

Editor information

Editors and Affiliations

University College London, London, UK
David Clark
Middlesex University, London, UK
Hector Menendez
Telecom SudParis, Evry Cedex, France
Ana Rosa Cavalli

About this paper

Cite this paper

Wang, W., Yi, P., Khurshid, S., Marinov, D. (2022). Initial Results on Counting Test Orders for Order-Dependent Flaky Tests Using Alloy. In: Clark, D., Menendez, H., Cavalli, A.R. (eds) Testing Software and Systems. ICTSS 2021. Lecture Notes in Computer Science, vol 13045. Springer, Cham. https://doi.org/10.1007/978-3-031-04673-5_9

DOI: https://doi.org/10.1007/978-3-031-04673-5_9
Published: 10 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04672-8
Online ISBN: 978-3-031-04673-5
eBook Packages: Computer Science, Computer Science (R0)
