DOI: 10.1145/2483760.2483769

Comparing non-adequate test suites using coverage criteria

Published: 15 July 2013

ABSTRACT

A fundamental question in software testing research is how to compare test suites, often as a means for comparing test-generation techniques. Researchers frequently compare test suites by measuring their coverage. A coverage criterion C provides a set of test requirements and measures how many requirements a given suite satisfies. A suite that satisfies 100% of the (feasible) requirements is C-adequate.
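
To make the abstract's definition concrete, the sketch below (not taken from the paper; the function names and data are hypothetical) computes coverage under a criterion C as the fraction of C's feasible test requirements that a suite satisfies, and flags C-adequacy at 100%.

```python
def coverage(satisfied, feasible_requirements):
    """Coverage of a suite under criterion C: fraction of C's feasible
    requirements (e.g., statements or branches) that the suite satisfies."""
    if not feasible_requirements:
        return 1.0  # vacuously adequate when C yields no requirements
    return len(satisfied & feasible_requirements) / len(feasible_requirements)

def is_adequate(satisfied, feasible_requirements):
    """A suite is C-adequate iff it satisfies 100% of the feasible requirements."""
    return coverage(satisfied, feasible_requirements) == 1.0

# Hypothetical example: branch coverage with four feasible branches.
feasible = {"b1", "b2", "b3", "b4"}
print(coverage({"b1", "b3"}, feasible))                 # 0.5 (non-adequate)
print(is_adequate({"b1", "b2", "b3", "b4"}, feasible))  # True
```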

Previous rigorous evaluations of coverage criteria mostly focused on such adequate test suites: given criteria C and C′, are C-adequate suites (on average) more effective than C′-adequate suites? However, in many realistic cases producing adequate suites is impractical or even impossible. We present the first extensive study that evaluates coverage criteria for the common case of non-adequate test suites: given criteria C and C′, which one is better to use to compare test suites? Namely, if suites T1, T2, ..., Tn have coverage values c1, c2, ..., cn for C and c′1, c′2, ..., c′n for C′, is it better to compare suites based on c1, c2, ..., cn or based on c′1, c′2, ..., c′n?
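
One plausible way to frame this comparison (an illustration only; the abstract does not spell out the paper's protocol) is to check how well each criterion's coverage values track an independently measured suite effectiveness, such as a mutation score, using a rank correlation like Kendall's τ. The sketch below uses made-up numbers for five hypothetical suites T1..T5.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall's tau-a rank correlation between two equally long sequences."""
    assert len(xs) == len(ys) and len(xs) > 1
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (len(xs) * (len(xs) - 1) / 2)

# Hypothetical data: coverage of suites T1..T5 under C and C', and an
# independently measured effectiveness (e.g., mutation score) per suite.
cov_C  = [0.42, 0.55, 0.61, 0.70, 0.88]
cov_Cp = [0.30, 0.58, 0.40, 0.75, 0.80]
effect = [0.35, 0.50, 0.52, 0.66, 0.81]

# The criterion whose coverage values rank suites more consistently with
# their effectiveness is the better one to use for comparing suites.
print(kendall_tau(cov_C, effect))   # 1.0
print(kendall_tau(cov_Cp, effect))  # 0.8
```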

We evaluate a large set of plausible criteria, including statement and branch coverage, as well as stronger criteria used in recent studies. Two criteria perform best: branch coverage and an intra-procedural acyclic path coverage.
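
As a rough illustration of the stronger of the two winners: branch coverage requires each branch outcome to be exercised, whereas intra-procedural acyclic path coverage requires each loop-free path through a method's control-flow graph. The sketch below captures the general spirit of acyclic (Ball-Larus-style) paths as an assumption, not the paper's exact definition; the CFG is hypothetical.

```python
def acyclic_paths(cfg, entry, exit_node):
    """Enumerate all acyclic (loop-free) paths from entry to exit in a
    control-flow graph given as {node: [successor nodes]}."""
    paths = []
    def walk(node, path):
        if node == exit_node:
            paths.append(path)
            return
        for succ in cfg.get(node, []):
            if succ not in path:  # skip back edges to keep paths acyclic
                walk(succ, path + [succ])
    walk(entry, [entry])
    return paths

# Hypothetical CFG of a method with two sequential if-else statements:
# 2 outcomes per decision, but 2 x 2 = 4 acyclic paths to cover.
cfg = {
    "entry": ["then1", "else1"],
    "then1": ["join1"], "else1": ["join1"],
    "join1": ["then2", "else2"],
    "then2": ["exit"], "else2": ["exit"],
}
for p in acyclic_paths(cfg, "entry", "exit"):
    print(" -> ".join(p))
```

On this CFG, two tests suffice for branch coverage (each decision outcome taken once), while acyclic path coverage demands all four paths, which conveys why the path-based criterion is intuitively the more demanding of the two.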


Published in

ISSTA 2013: Proceedings of the 2013 International Symposium on Software Testing and Analysis
July 2013, 381 pages
ISBN: 9781450321594
DOI: 10.1145/2483760
Copyright © 2013 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
