ABSTRACT
Software defects cost time and money to diagnose and fix. Consequently, developers use a variety of techniques to avoid introducing defects into their systems. However, these techniques have costs of their own; the benefit of using a technique must outweigh the cost of applying it.
In this paper we investigate the costs and benefits of automated regression testing in practice. Specifically, we studied 61 projects that use Travis CI, a cloud-based continuous integration tool, in order to examine real test failures that were encountered by the developers of those projects. We determined how the developers resolved the failures they encountered and used this information to classify the failures as being caused by a flaky test, by a bug in the system under test, or by a broken or obsolete test. We consider that test failures caused by bugs represent a benefit of the test suite, while failures caused by broken or obsolete tests represent a test suite maintenance cost.
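The classification described above can be sketched as a small decision procedure. This is purely illustrative (not the authors' actual tooling), and all function and parameter names here are hypothetical; it encodes the paper's three categories based on how a failure was ultimately resolved:

```python
# Illustrative sketch (not the study's tooling): classify a test failure
# by how developers ultimately resolved it, mirroring the paper's three
# categories. All names are hypothetical.

def classify_failure(passes_on_rerun: bool,
                     fix_touched_production_code: bool,
                     fix_touched_test_code: bool) -> str:
    """Classify a failure as flaky, a bug, or a broken/obsolete test."""
    if passes_on_rerun:
        # Same code, different outcome: a non-deterministic (flaky) test.
        return "flaky"
    if fix_touched_production_code:
        # Developers changed the system under test, so the failure
        # revealed a real bug: a benefit of the test suite.
        return "bug"
    if fix_touched_test_code:
        # Only the test changed, so the test was incorrect or obsolete:
        # a maintenance cost of the test suite.
        return "broken_or_obsolete_test"
    return "unclassified"

print(classify_failure(False, True, False))  # -> bug
```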
We found that 18% of test suite executions failed and that 13% of these failures were flaky. Of the non-flaky failures, only 74% were caused by a bug in the system under test; the remaining 26% were due to incorrect or obsolete tests. In addition, we found that, in the failed builds, only 0.38% of the test case executions failed and 64% of failed builds contained more than one failed test.
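Combining the reported percentages gives a back-of-the-envelope estimate of how often a suite execution pays off; the derived figures below are our illustration, not numbers stated in the study:

```python
# Back-of-the-envelope combination of the reported percentages
# (derived here for illustration; not figures stated in the study).

fail_rate = 0.18      # fraction of test suite executions that fail
flaky_share = 0.13    # fraction of those failures that are flaky
bug_share = 0.74      # fraction of non-flaky failures caused by real bugs

non_flaky = fail_rate * (1 - flaky_share)
bug_revealing = non_flaky * bug_share          # ~11.6% of executions
maintenance_cost = non_flaky * (1 - bug_share) # ~4.1% of executions

print(f"{bug_revealing:.1%} of suite executions reveal a real bug")
print(f"{maintenance_cost:.1%} fail only due to broken/obsolete tests")
```

Under these numbers, roughly one suite execution in nine surfaces a genuine defect, while about one in twenty-five incurs pure test maintenance cost.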
Our findings contribute to a wider understanding of the unforeseen costs that can impact the overall cost effectiveness of regression testing in practice. They can also inform research into test case selection techniques, as we have provided an approximate empirical bound on the practical value that could be extracted from such techniques. This value appears to be large, as the 61 systems under study contained nearly 3 million lines of test code and yet over 99% of test case executions could have been eliminated with a perfect oracle.
Measuring the cost of regression testing in practice: a study of Java projects using continuous integration