Abstract
Mature knowledge allows engineering disciplines to achieve predictable results. Unfortunately, the knowledge used in software engineering can be considered relatively immature, and developers are guided by intuition, fashion, or market-speak rather than by facts or the undisputed statements proper to an engineering discipline. Testing techniques define different criteria for selecting the test cases that will be used as input to the system under examination; an effective and efficient selection of test cases therefore conditions the success of the tests. The knowledge for selecting testing techniques should come from studies that empirically justify the benefits and application conditions of the different techniques. This paper analyzes the maturity level of the knowledge about testing techniques by examining existing empirical studies of these techniques. We have analyzed their results and derived a classification of testing technique knowledge based on its factuality and objectivity, according to four parameters.
Juristo, N., Moreno, A.M. & Vegas, S. Reviewing 25 Years of Testing Technique Experiments. Empirical Software Engineering 9, 7–44 (2004). https://doi.org/10.1023/B:EMSE.0000013513.48963.1b