DOI: 10.1145/3106237.3106288

Measuring the cost of regression testing in practice: a study of Java projects using continuous integration

Published: 21 August 2017

ABSTRACT

Software defects cost time and money to diagnose and fix. Consequently, developers use a variety of techniques to avoid introducing defects into their systems. However, these techniques have costs of their own; the benefit of using a technique must outweigh the cost of applying it.

In this paper we investigate the costs and benefits of automated regression testing in practice. Specifically, we studied 61 projects that use Travis CI, a cloud-based continuous integration tool, to examine the real test failures encountered by the developers of those projects. We determined how the developers resolved each failure and used this information to classify it as being caused by a flaky test, by a bug in the system under test, or by a broken or obsolete test. We treat failures caused by bugs as a benefit of the test suite, and failures caused by broken or obsolete tests as a test suite maintenance cost.
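
As a rough illustration of this classification logic (a minimal sketch only: the paper derived these labels from the developers' actual fixes, and the BuildRecord type and its fields below are hypothetical, not from the paper's artifacts), the decision can be expressed in Java, the language of the studied projects:

    // Hypothetical sketch of the failure classification described above;
    // none of these names come from the paper's artifacts.
    enum FailureCause { FLAKY, BUG_IN_SUT, BROKEN_OR_OBSOLETE_TEST }

    final class BuildRecord {
        boolean passesOnRerunOfSameCommit; // same code, different outcome
        boolean fixTouchedTestCodeOnly;    // resolving commit changed only test code

        FailureCause classify() {
            if (passesOnRerunOfSameCommit) {
                return FailureCause.FLAKY; // non-deterministic failure
            }
            if (fixTouchedTestCodeOnly) {
                // the test had to be repaired or updated: a maintenance cost
                return FailureCause.BROKEN_OR_OBSOLETE_TEST;
            }
            return FailureCause.BUG_IN_SUT; // the suite caught a real defect
        }
    }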

We found that 18% of test suite executions fail and that 13% of these failures are flaky. Of the non-flaky failures, only 74% were caused by a bug in the system under test; the remaining 26% were due to incorrect or obsolete tests. In addition, we found that, in the failed builds, only 0.38% of the test case executions failed and 64% of failed builds contained more than one failed test.
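
To make the composition of these percentages concrete, consider a hypothetical batch of 100 failed test-suite executions (the counts are invented for illustration; only the rates come from the study):

    // Hypothetical breakdown of 100 failed suite executions using the
    // percentages reported above.
    public class FailureBreakdown {
        public static void main(String[] args) {
            double failed = 100.0;
            double flaky = failed * 0.13;         // ~13 flaky failures
            double nonFlaky = failed - flaky;     // ~87 genuine failures
            double bugs = nonFlaky * 0.74;        // ~64 expose a real bug
            double brokenTests = nonFlaky * 0.26; // ~23 require fixing the test
            System.out.printf("flaky=%.0f bugs=%.0f brokenTests=%.0f%n",
                              flaky, bugs, brokenTests);
        }
    }

In other words, about 64 of every 100 failures point at a real defect, while about 23 are pure test maintenance.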

Our findings contribute to a wider understanding of the unforeseen costs that can impact the overall cost effectiveness of regression testing in practice. They can also inform research into test case selection techniques, as we have provided an approximate empirical bound on the practical value that could be extracted from such techniques. This value appears to be large, as the 61 systems under study contained nearly 3 million lines of test code and yet over 99% of test case executions could have been eliminated with a perfect oracle.
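
The "over 99%" bound follows from the figures above. A back-of-the-envelope reconstruction, assuming roughly uniform suite sizes across builds (our simplification, not a claim made by the paper):

    // Rough reconstruction of the perfect-oracle bound from the reported
    // numbers; the uniform-suite-size assumption is ours.
    public class OracleBound {
        public static void main(String[] args) {
            double failedBuildShare = 0.18;   // 18% of suite executions fail
            double failingExecShare = 0.0038; // 0.38% of test executions fail
                                              // within those failed builds

            // A perfect oracle would run only the executions that fail;
            // passing builds contribute no failing executions at all.
            double mustRun = failedBuildShare * failingExecShare;
            System.out.printf("still needed: %.4f%% of executions%n", 100 * mustRun);
            System.out.printf("eliminable:   %.2f%% of executions%n", 100 * (1 - mustRun));
            // Prints roughly 0.07% needed, 99.93% eliminable: "over 99%".
        }
    }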

Published in

ESEC/FSE 2017: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
August 2017, 1073 pages
ISBN: 9781450351058
DOI: 10.1145/3106237

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article

        Acceptance Rates

Overall Acceptance Rate: 112 of 543 submissions, 21%
