research-article
DOI: 10.1145/2483760.2483769

Comparing non-adequate test suites using coverage criteria

Published: 15 July 2013

Abstract

A fundamental question in software testing research is how to compare test suites, often as a means for comparing test-generation techniques. Researchers frequently compare test suites by measuring their coverage. A coverage criterion C provides a set of test requirements and measures how many requirements a given suite satisfies. A suite that satisfies 100% of the (feasible) requirements is C-adequate.
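As a concrete illustration (a minimal sketch only, assuming requirements and the satisfied subset are represented as plain sets; the paper does not prescribe any particular implementation), coverage under a criterion is the fraction of feasible requirements a suite satisfies:

```python
def coverage(satisfied: set, feasible: set) -> float:
    """Fraction of the feasible test requirements that a suite satisfies."""
    if not feasible:
        return 1.0  # vacuously adequate: nothing to cover
    return len(satisfied & feasible) / len(feasible)

def is_adequate(satisfied: set, feasible: set) -> bool:
    """A suite is C-adequate if it satisfies 100% of C's feasible requirements."""
    return coverage(satisfied, feasible) == 1.0

# Hypothetical example: branch coverage with four feasible branches,
# of which the suite exercises three.
feasible = {"b1", "b2", "b3", "b4"}
satisfied = {"b1", "b2", "b3"}
print(coverage(satisfied, feasible))     # 0.75
print(is_adequate(satisfied, feasible))  # False
```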
Previous rigorous evaluations of coverage criteria mostly focused on such adequate test suites: given criteria C and C′, are C-adequate suites (on average) more effective than C′-adequate suites? However, in many realistic cases producing adequate suites is impractical or even impossible. We present the first extensive study that evaluates coverage criteria for the common case of non-adequate test suites: given criteria C and C′, which one is better to use to compare test suites? Namely, if suites T1, T2, ..., Tn have coverage values c1, c2, ..., cn for C and c′1, c′2, ..., c′n for C′, is it better to compare suites based on c1, c2, ..., cn or based on c′1, c′2, ..., c′n?
We evaluate a large set of plausible criteria, including statement and branch coverage, as well as stronger criteria used in recent studies. Two criteria perform best: branch coverage and an intra-procedural acyclic path coverage.
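One plausible way to operationalize this comparison (a sketch only; the abstract does not state the study's statistical machinery, and all scores below are made up) is to ask which criterion's coverage values rank suites more consistently with an independent measure of suite quality, such as a fault-detection score:

```python
from scipy.stats import kendalltau  # rank correlation; assumes SciPy is installed

# Hypothetical data for suites T1..T5: coverage under criteria C and C',
# plus an independent quality measure (e.g., fraction of seeded faults found).
cov_c   = [0.40, 0.55, 0.60, 0.72, 0.90]  # c1, ..., cn
cov_cp  = [0.35, 0.65, 0.50, 0.70, 0.85]  # c'1, ..., c'n
quality = [0.30, 0.45, 0.50, 0.65, 0.80]

tau_c, _  = kendalltau(cov_c, quality)
tau_cp, _ = kendalltau(cov_cp, quality)

# The criterion whose coverage values rank suites more like the quality
# measure is the better yardstick for comparing non-adequate suites.
print(f"tau(C) = {tau_c:.2f}, tau(C') = {tau_cp:.2f}")
print("prefer", "C" if tau_c >= tau_cp else "C'")
```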


Published In
ISSTA 2013: Proceedings of the 2013 International Symposium on Software Testing and Analysis
July 2013, 381 pages
ISBN: 9781450321594
Proceedings DOI: 10.1145/2483760

Publisher
Association for Computing Machinery, New York, NY, United States


Author Tags

  1. coverage criteria
  2. non-adequate test suites

Conference
ISSTA '13
Overall Acceptance Rate: 58 of 213 submissions, 27%


Cited By

  • (2024) ExLi: An Inline-Test Generation Tool for Java. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, pages 652–656. DOI: 10.1145/3663529.3663817
  • (2024) Verifying consistency of software product line architectures with product architectures. Software and Systems Modeling (SoSyM), 23(1):195–221. DOI: 10.1007/s10270-023-01114-4
  • (2023) Input and Output Coverage Needed in File System Testing. Proceedings of the 15th ACM Workshop on Hot Topics in Storage and File Systems, pages 93–101. DOI: 10.1145/3599691.3603405
  • (2023) Green Fuzzing: A Saturation-Based Stopping Criterion using Vulnerability Prediction. Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 127–139. DOI: 10.1145/3597926.3598043
  • (2023) FuzzyCAT: A Framework for Network Configuration Verification Based on Fuzzing. 2023 IEEE International Performance, Computing, and Communications Conference (IPCCC), pages 123–131. DOI: 10.1109/IPCCC59175.2023.10253841
  • (2023) On factors that impact the relationship between code coverage and test suite effectiveness: a survey. 2023 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pages 381–388. DOI: 10.1109/ICSTW58534.2023.00071
  • (2023) Reachable Coverage: Estimating Saturation in Fuzzing. Proceedings of the 45th International Conference on Software Engineering, pages 371–383. DOI: 10.1109/ICSE48619.2023.00042
  • (2023) Verifying contracts among software components. Information and Software Technology, 163(C). DOI: 10.1016/j.infsof.2023.107282
  • (2023) Functional suitability assessment of smart contracts: A survey and first proposal. Journal of Software: Evolution and Process. DOI: 10.1002/smr.2636
  • (2023) Revisiting deep neural network test coverage from the test effectiveness perspective. Journal of Software: Evolution and Process. DOI: 10.1002/smr.2561
