ABSTRACT
A fundamental question in software testing research is how to compare test suites, often as a means for comparing test-generation techniques. Researchers frequently compare test suites by measuring their coverage. A coverage criterion C provides a set of test requirements and measures how many requirements a given suite satisfies. A suite that satisfies 100% of the (feasible) requirements is C-adequate.
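As an illustration only (not the paper's tooling), a coverage criterion can be modeled as a set of test requirements; a suite's coverage value is the fraction of the feasible requirements it satisfies, and C-adequacy means that fraction reaches 100%. The class and method names below are hypothetical.

```java
import java.util.Set;

// Minimal sketch, not the authors' implementation: a criterion C yields a set of
// test requirements; a suite's coverage is the fraction of feasible requirements it satisfies.
final class Coverage {
    /** Coverage value in [0, 1]: |feasible requirements satisfied| / |feasible requirements|. */
    static double of(Set<String> feasibleRequirements, Set<String> satisfied) {
        if (feasibleRequirements.isEmpty()) {
            return 1.0; // vacuously adequate when there is nothing to satisfy
        }
        long met = feasibleRequirements.stream().filter(satisfied::contains).count();
        return (double) met / feasibleRequirements.size();
    }

    /** A suite is C-adequate when it satisfies all feasible requirements of C. */
    static boolean isAdequate(Set<String> feasibleRequirements, Set<String> satisfied) {
        return of(feasibleRequirements, satisfied) == 1.0;
    }
}
```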
Previous rigorous evaluations of coverage criteria mostly focused on such adequate test suites: given criteria C and C′, are C-adequate suites (on average) more effective than C′-adequate suites? However, in many realistic cases producing adequate suites is impractical or even impossible. We present the first extensive study that evaluates coverage criteria for the common case of non-adequate test suites: given criteria C and C′, which one is better to use to compare test suites? Namely, if suites T1, T2, ..., Tn have coverage values c1, c2, ..., cn for C and c′1, c′2, ..., c′n for C′, is it better to compare suites based on c1, c2, ..., cn or based on c′1, c′2, ..., c′n?
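One plausible way to operationalize this question, sketched below under the assumption that each suite Ti also has an independently measured effectiveness score (for example, a mutation score), is to ask which criterion's coverage values correlate better with effectiveness across the suites, e.g., via Kendall's τ. The class and method names are hypothetical, and the τ computation is the simple O(n²) tau-a form.

```java
// Sketch of the kind of comparison described above: the criterion whose coverage
// values track an independent effectiveness measure more closely is the better
// one to use when ranking non-adequate suites.
final class CriteriaComparison {
    /** Kendall's tau-a over paired samples, computed in O(n^2). */
    static double kendallTau(double[] x, double[] y) {
        int concordant = 0, discordant = 0, n = x.length;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double s = Math.signum(x[i] - x[j]) * Math.signum(y[i] - y[j]);
                if (s > 0) concordant++;
                else if (s < 0) discordant++;
            }
        }
        return (concordant - discordant) / (n * (n - 1) / 2.0);
    }

    /** True if criterion C (coverage values c) predicts effectiveness better than C′ (values cPrime). */
    static boolean cIsBetterPredictor(double[] c, double[] cPrime, double[] effectiveness) {
        return kendallTau(c, effectiveness) > kendallTau(cPrime, effectiveness);
    }
}
```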
We evaluate a large set of plausible criteria, including statement and branch coverage, as well as stronger criteria used in recent studies. Two criteria perform best: branch coverage and an intra-procedural acyclic path coverage.
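To make the second criterion concrete: intra-procedural acyclic path coverage takes each acyclic path through a single method's control-flow graph (CFG) as one test requirement, in the spirit of Ball-Larus path profiling. The sketch below, with hypothetical names, enumerates those paths for a small CFG given as an adjacency map; it is illustrative only and not how profiling tools actually number paths.

```java
import java.util.*;

// Illustrative only: enumerate the acyclic paths from a method's entry node to
// its exit node; each such path is one requirement of the path-coverage criterion.
final class AcyclicPaths {
    static List<List<String>> enumerate(Map<String, List<String>> cfg, String entry, String exit) {
        List<List<String>> paths = new ArrayList<>();
        dfs(cfg, entry, exit, new ArrayDeque<>(), paths);
        return paths;
    }

    private static void dfs(Map<String, List<String>> cfg, String node, String exit,
                            Deque<String> stack, List<List<String>> paths) {
        if (stack.contains(node)) return;           // ignore back edges: acyclic paths only
        stack.addLast(node);
        if (node.equals(exit)) {
            paths.add(new ArrayList<>(stack));      // one acyclic path = one test requirement
        } else {
            for (String succ : cfg.getOrDefault(node, List.of())) {
                dfs(cfg, succ, exit, stack, paths);
            }
        }
        stack.removeLast();
    }
}
```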