Testing Relative to Usage Scope: Revisiting Software Coverage Criteria

Abstract
Coverage criteria provide a useful and widely used means of guiding software testing; however, indiscriminately pursuing full coverage is not always practical or meaningful, because not every entity is of interest in every usage context. We aim to introduce a more meaningful notion of coverage that takes into account how the software is going to be used: entities that will never be exercised by the user should not contribute to the coverage ratio. We revisit the definition of coverage measures and introduce a notion of relative coverage, providing a definition and a theoretical framework within which we discuss its implications for testing theory and practice. Evaluating three different instances of relative coverage, we observed that relative coverage measures provide a more effective strategy than traditional ones: they reach higher coverage values, and test cases selected by relative coverage achieve higher reliability. We also hint at several other useful implications of the relative coverage notion for different aspects of software testing.
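The core idea above can be illustrated with a minimal sketch. This is not the paper's formal definition, only a hedged set-based reading of it: coverage is computed relative to a hypothetical "usage scope" (the entities the usage context can actually exercise), so out-of-scope entities no longer dilute the ratio. All names (`traditional_coverage`, `relative_coverage`, the sample entity sets) are illustrative assumptions.

```python
# Sketch of traditional vs. relative coverage under a set-based
# formulation (an assumption, not the paper's formal framework):
# only in-scope entities contribute to the relative ratio.

def traditional_coverage(covered: set, all_entities: set) -> float:
    """Covered entities over all entities in the program."""
    return len(covered & all_entities) / len(all_entities)

def relative_coverage(covered: set, all_entities: set, usage_scope: set) -> float:
    """Covered in-scope entities over all in-scope entities."""
    in_scope = all_entities & usage_scope
    return len(covered & in_scope) / len(in_scope)

# Hypothetical example: five branches, three of which lie in the usage scope.
all_entities = {"b1", "b2", "b3", "b4", "b5"}
usage_scope = {"b1", "b2", "b3"}      # what this usage context can exercise
covered = {"b1", "b2", "b4"}          # what the test suite actually hit

print(traditional_coverage(covered, all_entities))          # 0.6
print(relative_coverage(covered, all_entities, usage_scope))  # 0.666...
```

Note how covering `b4`, an out-of-scope branch, raises the traditional measure but not the relative one, matching the intuition that entities never exercised in the usage context should not count.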