Abstract
Without a complete formal specification, automatically generated software tests need to be manually checked in order to detect faults. This makes it desirable to produce the strongest possible test set while keeping the number of tests as small as possible. As commonly applied coverage criteria like branch coverage are potentially weak, mutation testing has been proposed as a stronger criterion. However, mutation-based test generation is hampered because there are usually too many mutants, and too many of these are either trivially killed or equivalent. On such mutants, any effort spent on test generation would by definition be wasted. To overcome this problem, our search-based EvoSuite test generation tool integrates two novel optimizations: first, we avoid redundant test executions on mutants by monitoring state infection conditions, and second, we use whole test suite generation to optimize test suites towards killing the highest number of mutants, rather than selecting individual mutants. These optimizations allowed us to apply EvoSuite to a random sample of 100 open source projects, consisting of a total of 8,963 classes and more than two million lines of code, leading to a total of 1,380,302 mutants. The experiment demonstrates that our approach scales well, making mutation testing a viable test criterion for automated test case generation tools, and allowing us to analyze the relationship of branch coverage and mutation testing in detail.
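The first optimization can be illustrated with a small toy model. The sketch below is not EvoSuite code; the names `original_expr`, `MUTANTS`, `unit_under_test`, `infected_mutants`, and `killed_by_suite` are hypothetical, and a mutant is modeled simply as a replacement for one expression. A mutant is executed on a test only if the mutated expression evaluates to a different value than the original at the mutation point (the state is infected), and the suite as a whole is scored by how many mutants it kills.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy model: each mutant replaces the single expression `a + b`.
Expr = Callable[[int, int], int]


def original_expr(a: int, b: int) -> int:
    return a + b


MUTANTS: Dict[str, Expr] = {
    "a - b": lambda a, b: a - b,
    "a * b": lambda a, b: a * b,
    "b + a": lambda a, b: b + a,  # equivalent mutant: can never infect the state
}


def unit_under_test(a: int, b: int, expr: Expr = original_expr) -> int:
    # The observable output; abs() can mask an infected intermediate state.
    return abs(expr(a, b))


def infected_mutants(test_input: Tuple[int, int]) -> Set[str]:
    """Mutants whose expression value differs from the original on this input."""
    a, b = test_input
    expected = original_expr(a, b)
    return {name for name, mutant in MUTANTS.items() if mutant(a, b) != expected}


def killed_by_suite(suite: List[Tuple[int, int]]) -> Tuple[Set[str], int]:
    """Return the mutants killed by the whole suite and the number of
    mutant executions that were actually needed."""
    killed: Set[str] = set()
    executions = 0
    for test_input in suite:
        expected_output = unit_under_test(*test_input)
        # Skip mutants that are already killed or that do not infect the state.
        for name in sorted(infected_mutants(test_input) - killed):
            executions += 1
            if unit_under_test(*test_input, expr=MUTANTS[name]) != expected_output:
                killed.add(name)
    return killed, executions


if __name__ == "__main__":
    killed, executions = killed_by_suite([(0, 0), (2, 3)])
    print(sorted(killed), executions)
```

In this toy setting, the input `(0, 0)` infects no mutant, so no mutant is executed for it, and the equivalent mutant `b + a` is never executed at all. The size of the returned set plays the role of a suite-level fitness value, under the assumption that fitness is simply the number of killed mutants.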
Notes
We used the 1.02 version of SF100. The original version in Fraser and Arcuri (2012b) had 8,784 classes, but more classes became available once we fixed some classpath issues (e.g., missing jars) in some of the projects.
Note that the base case of branch coverage is produced using whole test suite generation; targeting individual branches would lead to lower branch coverage (Fraser and Arcuri 2013c).
References
Acree AT (1980) On mutation. PhD thesis, Georgia Institute of Technology, Atlanta, Georgia
Adamopoulos K, Harman M, Hierons RM (2004) How to overcome the equivalent mutant problem and achieve tailored selective mutation using co-evolution. In: Genetic and evolutionary computation conference (GECCO). Seattle, Washington, pp 1338–1349
Andrews JH, Briand LC, Labiche Y (2005) Is mutation an appropriate tool for testing experiments? In: Proceedings of the 27th international conference on software engineering, ICSE ’05. ACM, pp 402–411
Arcuri A (2013) It really does matter how you normalize the branch distance in search-based software testing. Softw Test Verification Reliab (STVR) 23(2):119–147
Arcuri A, Briand L (2012) A hitchhiker’s guide to statistical tests for assessing randomized algorithms in software engineering. In: Software testing verification and reliability (STVR). doi:10.1002/stvr.1486
Arcuri A, Fraser G (2013) Parameter tuning or default values? An empirical investigation in search-based software engineering. In: Empirical software engineering (EMSE). pp 1–30. doi:10.1007/s10664-013-9249-9
Ayari K, Bouktif S, Antoniol G (2007) Automatic mutation test input data generation via ant colony. In: Genetic and evolutionary computation conference (GECCO). ACM, New York, pp 1074–1081
Baker R, Habli I (2012) An empirical evaluation of mutation testing for improving the test quality of safety-critical software. In: IEEE transactions on software engineering (TSE)
Baldwin D, Sayward FG (1979) Heuristics for determining equivalence of program mutations. Technical Report 276, Yale University, New Haven, Connecticut
Baudry B, Fleurey F, Jézéquel JM, Le Traon Y (2005) Automatic test cases optimization: a bacteriologic algorithm. IEEE Softw 22(2):76–82
Bauersfeld S, Vos T, Lakhotia K, Poulding S, Condori N (2013) Unit testing tool competition. In: International workshop on search-based software testing (SBST). pp 414–420
Bottaci L (2001) A genetic algorithm fitness function for mutation testing. In: International workshop on software engineering using metaheuristic innovative algorithms, a workshop at 23rd International conference on software engineering, SEMINAL 2001. pp 3–7
Budd TA (1980) Mutation analysis of program test data. PhD thesis, Yale University, New Haven, Connecticut
DeMillo RA, Offutt AJ (1991) Constraint-based automatic test data generation. IEEE Trans Softw Eng 17(9):900–910
DeMillo RA, Lipton RJ, Sayward FG (1978) Hints on test data selection: help for the practicing programmer. Computer 11(4):34–41
Deng L, Offutt J, Li N (2013) Empirical evaluation of the statement deletion mutation operator. In: IEEE International conference on software testing, verification and validation (ICST)
Fleyshgakker VN, Weiss SN (1994) Efficient mutation analysis: a new approach. In: Proceedings of the international symposium on software testing and analysis, ISSTA ’94. Seattle, Washington, pp 185–195
Frankl PG, Weiss SN, Hu C (1997) All-uses vs mutation testing: an experimental comparison of effectiveness. J Syst Softw 38(3):235–253
Fraser G, Arcuri A (2011a) EvoSuite: Automatic test suite generation for object-oriented software. In: ACM symposium on the foundations of software engineering (FSE). pp 416–419
Fraser G, Arcuri A (2011b) It is not the length that matters, it is how you control it. In: IEEE international conference on software testing, verification and validation (ICST). pp 150–159
Fraser G, Arcuri A (2012a) The seed is strong: Seeding strategies in search-based software testing. In: IEEE international conference on software testing, verification and validation (ICST). pp 121–130
Fraser G, Arcuri A (2012b) Sound empirical evidence in software testing. In: ACM/IEEE international conference on software engineering (ICSE). pp 178–188
Fraser G, Arcuri A (2013a) Evosuite at the SBST 2013 tool competition. In: International workshop on search-based software testing (SBST). pp 406–409
Fraser G, Arcuri A (2013b) EvoSuite: On the challenges of test case generation in the real world (tool paper). In: IEEE international conference on software testing, verification and validation (ICST)
Fraser G, Arcuri A (2013c) Whole test suite generation, vol 39
Fraser G, Zeller A (2012) Mutation-driven generation of unit tests and oracles. IEEE Trans Softw Eng (TSE) 38(2):278–292
Fraser G, Arcuri A, McMinn P (2013) Test suite generation with memetic algorithms. In: Genetic and evolutionary computation conference (GECCO)
Godefroid P, Klarlund N, Sen K (2005) DART: directed automated random testing. In: Proceedings of the 2005 ACM SIGPLAN conference on programming language design and implementation, PLDI ’05. ACM, pp 213–223
Hamlet RG (1977) Testing programs with the aid of a compiler. IEEE Trans Softw Eng 3(4):279–290
Harman M, Jia Y, Langdon WB (2011) Strong higher order mutation-based test data generation. In: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering, ESEC/FSE ’11. ACM, pp 212–222
Hierons RM, Harman M, Danicic S (1999) Using program slicing to assist in the detection of equivalent mutants. Softw Test Verification Reliab 9(4):233–262
Howden WE (1982) Weak mutation testing and completeness of test sets. IEEE Trans Softw Eng 8(4):371–379
Jia Y, Harman M (2009) Higher order mutation testing. J Informat Softw Technol 51(10):1379–1393
Jia Y, Harman M (2011) An analysis and survey of the development of mutation testing. IEEE Trans Softw Eng (TSE) 37(5):649–678
Just R, Kapfhammer GM, Schweiggert F (2012) Using non-redundant mutation operators and test suite prioritization to achieve efficient and scalable mutation analysis. In: Proceedings of the 2012 IEEE 23rd international symposium on software reliability engineering, ISSRE ’12. IEEE Computer Society, pp 11–20
Just R, Ernst MD, Fraser G (2013) Using state infection conditions to detect equivalent mutants and speed up mutation analysis. arXiv preprint arXiv:1303.2784
Korel B (1990) Automated software test data generation. In: IEEE Transactions on software engineering, pp 870–879
Mateo PR, Usaola MP, Aleman JLF (2012) Validating 2nd-order mutation at system level. In: IEEE Transactions on software engineering (TSE)
McMinn P (2004) Search-based software test data generation: a survey. Softw Test Verification Reliab 14(2):105–156
Offutt AJ (1992) Investigations of the software testing coupling effect. ACM Trans Softw Eng Methodol 1(1):5–20
Offutt AJ, Craft WM (1994) Using compiler optimization techniques to detect equivalent mutants. Softw Test Verification Reliab 4(3):131–154
Offutt AJ, Lee SD (1991) How strong is weak mutation? In: Proceedings of the symposium on testing, analysis, and verification. TAV4. ACM, pp 200–213
Offutt AJ, Lee SD (1994) An empirical evaluation of weak mutation. IEEE Trans Softw Eng 20(5):337–344
Offutt AJ, Pan J (1997) Automatically detecting equivalent mutants and infeasible paths. Softw Test Verification Reliab 7(3):165–192
Offutt AJ, Untch RH (2001) Mutation testing for the new century. Chap mutation 2000: uniting the orthogonal. Kluwer Academic Publishers, Norwell, MA, pp 34–44
Offutt AJ, Rothermel G, Zapf C (1993) An experimental evaluation of selective mutation. In: Proceedings of the 15th international conference on software engineering, ICSE ’93. Baltimore, Maryland, pp 100–107
Offutt AJ, Ma YS, Kwon YR (2004) An experimental mutation system for Java. ACM SIGSOFT Softw Eng Notes 29(5):1–4
Pacheco C, Ernst MD (2007) Randoop: feedback-directed random testing for Java. In: Companion to the 22nd ACM SIGPLAN conference on object-oriented programming systems and application, OOPSLA ’07. ACM, pp 815–816
Papadakis M, Malevris N (2010) Automatic mutation test case generation via dynamic symbolic execution. In: IEEE 21st International symposium on software reliability engineering (ISSRE). pp 121–130
Patrick M, Alexander R, Oriol M, Clark JA (2013) Using mutation analysis to evolve subdomains for random testing. In: International workshop on mutation analysis
Schuler D, Zeller A (2010) (Un-)Covering equivalent mutants. In: Proceedings of the 3rd international conference on software testing, verification, and validation, ICST ’10. IEEE Computer Society, pp 45–54
Staats M, Whalen MW, Heimdahl MP (2011) Programs, tests, and oracles: the foundations of testing revisited. In: ACM/IEEE international conference on software engineering (ICSE). pp 391–400
Untch RH (1992) Mutation-based software testing using program schemata. In: Proceedings of the 30th annual southeast regional conference (ACM-SE’92). Raleigh, North Carolina, pp 285–291
Walsh PJ (1985) A measure of test case completeness (software, engineering). PhD thesis, State University of New York at Binghamton, Binghamton, NY
Wong WE, Mathur AP, Maldonado JC (1995) Mutation versus all-uses: an empirical evaluation of cost, strength and effectiveness. In: Software quality and productivity: theory, practice and training. Chapman & Hall, London, pp 258–265
Zhang L, Xie T, Zhang L, Tillmann N, de Halleux J, Mei H (2010) Test generation via dynamic symbolic execution for mutation testing. In: Proceedings of the 2010 IEEE international conference on software maintenance, ICSM ’10. IEEE Computer Society, pp 1–10
Acknowledgments
This project has been funded by a Google Focused Research Award on “Test Amplification” and the Norwegian Research Council.
Additional information
Communicated by: Antonia Bertolino
About this article
Cite this article
Fraser, G., Arcuri, A. Achieving scalable mutation-based generation of whole test suites. Empir Software Eng 20, 783–812 (2015). https://doi.org/10.1007/s10664-013-9299-z
DOI: https://doi.org/10.1007/s10664-013-9299-z