ABSTRACT
Software testing is hard, and a testing problem is composed of many sub-problems with different, often conflicting, solutions. Like many real-world problems, it admits no single optimal solution, but requires dexterity and the opportunistic combination of many partial solutions. Exploration and experiment, even by practitioners, are important in real-world critical testing efforts. An important set of research results in the field endorses and codifies the value of diversity in test generation. However, our current approaches to evaluating research results arguably cut against this fundamental reality: while effective testing may need true diversity, combining many partial answers, the iron logic of the research results section often imposes a totalizing vision where authors must at least pretend to present a monolithic, unitary solution, a new “king of the hill.”
Index Terms
- Let a thousand flowers bloom: on the uses of diversity in software testing