Skip to main content
Log in

Using program data-state scarcity to guide automatic test data generation

  • Published:
Software Quality Journal Aims and scope Submit manuscript

Abstract

Finding test data to cover structural test coverage criteria such as branch coverage is largely a manual and hence expensive activity. A potential low cost alternative is to generate the required test data automatically. Search-based test data generation is one approach that has attracted recent interest. This approach is based on the definition of an evaluation or cost function that is able to discriminate between candidate test cases with respect to achieving a given test goal. The cost function is implemented by appropriate instrumentation of the program under test. The candidate test is then executed on the instrumented program. This provides an evaluation of the candidate test in terms of the “distance” between the computation achieved by the candidate test and the computation required to achieve the test goal. Providing the cost function is able to discriminate reliably between candidate tests that are close or far from covering the test goal and the goal is feasible, a search process is able to converge to a solution, i.e., a test case that satisfies the coverage goal. For some programs, however, an informative cost function is difficult to define. The operations performed by these programs are such that the cost function returns a constant value for a very wide range of inputs. A typical example of this problem arises in the instrumentation of branch predicates that depend on the value of a Boolean-valued (flag) variable although the problem is not limited to programs that contain flag variables. Although methods are known for overcoming the problems of flag variables in particular cases, the more general problem of a near constant cost function has not been tackled. This paper presents a new heuristic for directing the search when the cost function at a test goal is not able to differentiate between candidate test inputs. The heuristic directs the search toward test cases that produce rare or scarce data states. Scarce inputs for the cost function are more likely to produce new cost values. The proposed method is evaluated empirically for a number of example programs for which existing methods are inadequate.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22
Fig. 23
Fig. 24

Similar content being viewed by others

References

  • Alshraideh, M., & Bottaci, L. (2006). Automatic software test data generation for string data using heuristic search with domain specific search operators. Software Testing, Verification and Reliability, 16(3), 175–203.

    Article  Google Scholar 

  • Baresel, A., Harman, M., Binkley, D., & Korel, B. (2004). Evolutionary testing in the presence of loop-assigned flags: A testability transformation approach. In G. S. Avrunin & G. Rothermel (Eds.), Proceedings of 2004 ACM SIGSOFT international symposium on software testing and analysis, ISSTA 2004, (pp. 108–118). New York: ACM.

  • Booker, L. B. (1982). Intelligent behavior as an adaptation to the task environment. PhD thesis, Ann Arbor, MI: The University of Michigan.

  • Bottaci, L. (2002). Instrumenting programs with flag variables for test data search by genetic algorithm. In Proceedings of genetic and evolutionary computation conference (GECCO 2002), (pp. 1337–1342). San Mateo, CA: Morgan Kaufmann.

  • Bottaci, L. (2003). Predicate expression cost functions to guide evolutionary search for test data. In Proceedings of genetic and evolutionary computation conference (GECCO 2003), (pp. 2455–2464). Chicago, IL: Springer Verlag.

  • Bottaci, L. (2005). Use of branch cost functions to diversify the search for test data. In Proceedings of the UK software testing workshop (UKTest 2005), (pp. 151–163). UK: University of Sheffield.

  • Bouchachia, A. (2007). An immune genetic algorithm for software test data generation. In Proceedings of the 7th international conference on hybrid intelligent systems (HIS 2007), (pp. 84–89). Washington, DC.

  • Bueno, P., Paulo, M. S., Wong, W. E., & Jino, M. (2007). Improving random test sets using the diversity oriented test data generation. In Proceedings of the 2nd international workshop on random testing: co-located with the 22nd IEEE/ACM international conference on automated software engineering (ASE 2007), (pp. 10–17). New York, NY: ACM.

  • Chen, T. Y., Kuo, F. C., Merkel, R. G., & Tse, T. (2009). Adaptive random testing: The art of test case diversity. Journal of Systems and Software (in press), Corrected Proof:–, doi:10.1016/j.jss.2009.02.022, http://www.sciencedirect.com/science/article/B6V0N-4VRX633-1/2/50b8286d83d4352947ae21e1c6930c00

  • Collins, R. J., & Jefferson, D. R. (1991). Selection in massively parallel genetic. In Belew, R. K., Booker, L. B. (Eds.), Proceedings of the fourth international on genetic algorithms, (pp. 249–256). San Mateo, CA: Morgan Kaufmann.

  • Coward, P. D. (1991). Symbolic execution and testing. Information and Software Technique, 33(1), 53–64.

    Article  Google Scholar 

  • Davis, L. (1991). Handbook of Genetic Algorithms. NY: International Thomson Computer Press.

    Google Scholar 

  • de Halleux, J., & Tillmann, N. (2008). Parameterized unit testing with pex. In B. Beckert & R. Hähnle (Eds.), Tests and proofs, proceedings of the second international conference, TAP 2008, (Vol. 4966, pp. 171–181). Springer, Lecture Notes in Computer Science.

  • De Jong, L. A. (1975). An analysis of the behavior of a class of genetic adaptive systems. PhD thesis, Ann Arbor, MI: The University of Michigan.

  • De Jong, E. D., Watson, R., & Pollack, J. (2001). Reducing bloat and promoting diversity using multi-objective methods. In L. Spector et al. (Eds.), Proceedings of the genetic and evolutionary computation conference, (pp. 11–18). San Francisco, CA: Morgan Kaufmann.

  • Deb, K., & Goldberg, D. E. (1989). An investigation of niche and species formation in genetic function optimization. In Proceedings of the third international conference on genetic algorithms, (pp. 42–50). San Mateo, CA: Morgan Kaufmann.

  • Duran, J. E., & Ntafos, S. C. (1984). An evaluation of random testing. IEEE Transactions on Software Engineering, 10(4), 438–444.

    Article  Google Scholar 

  • Feldt, R., Torkar, R., Gorschek, T., & Afzal, W. (2008). Searching for cognitively diverse tests: Towards universal test diversity metrics. In IEEE international conference on software testing verification and validation workshop, ICSTW ’08, (pp. 178–186).

  • Ferguson, R., & Korel, B. (1996). The chaining approach for software test data generation. ACM Transactions on Software Engineering and Methodology, 5(1), 63–86.

    Article  Google Scholar 

  • Goldberg, D., & Richardson, J. (1987). Genetic algorithms with sharing for multimodal function optimization. In Proceedings of the second international conference on genetic algorithms, (pp. 148–154). San Francisco, CA: Morgan Kaufmann.

  • Goldberg, D. E. (1989). Genetic Algorithms in Search Optimization and Machine Learning. Reading, MA: Addison Wesley.

    MATH  Google Scholar 

  • Harman, M., Hu, L., Hierons, R., Baresel, A., & Sthamer, H. (2002). Improving evolutionary testing by flag removal. In Proceedings of genetic and evolutionary computation conference GECCO 2002, New York, USA, pp. 1351–1358.

  • Harman, M., Hu, L., Hierons, R., Wegener, J., Baresel, A., Sthamer, H., & Roper, M. (2004). Testability transformation. IEEE Transactions on Software Engineering, 30(1), 3–16.

    Article  Google Scholar 

  • Jones, B.F., Sthamer, H., & Eyres, D. (1996). Automatic structural testing using genetic algorithms. Software Engineering Journal, 11(5), 299–306.

    Article  Google Scholar 

  • Keijzer, M. (1996). Efficiently representing populations in genetic programming. In Angeline, P., Kinnear, K. Jr. (Eds.), Advances in Genetic Programming, (Vol. 2, pp. 259–278), Cambridge: MIT Press.

    Google Scholar 

  • Korel, B. (1990). Automated software test data generation. IEEE Transactions on Software Engineering, 16(8), 870–879.

    Article  Google Scholar 

  • Korel, B., Harman, M., Chung, S., Apirukvorapinit, P., Gupta, R., & Zhang, Q. (2005). Data dependence based testability transformation in automated test generation. In 16th IEEE international symposium on software reliability engineering (pp. 245–254)

  • Koza, J. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press.

    MATH  Google Scholar 

  • Langdon, W. B. (1998). Genetic programming and data structures: Genetic programming + data structures = automatic programming!, genetic programming, (Vol. 1). Boston: Kluwer.

  • MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of fifth berkeley symposium on mathematical statistics and probability, (Vol. 1, pp. 281–297). Berkeley: University of California Press.

  • Malaiya, Y. (1995). Antirandom testing: Getting the most out of black-box testing. In Proceeding of international symposium on software reliability engineering, (pp. 86–95). citeseer.ist.psu.edu/article/malaiya95antirandom.html.

  • Mattiussi, C., Waibel, M., & Floreano, D. (2004). Measures of diversity for populations and distances between individuals with highly reorganizable genomes. Evolutionary Computation, 12(4), 495–515.

    Article  Google Scholar 

  • McMinn, P. (2004). Search-based software test data generation: A survey. Software Testing, Verification and Reliability, 14(2), 105–156.

    Article  Google Scholar 

  • McMinn, P., Binkley, D., Harman, M., & Tonella, P. (2006). The species per path approach to searchbased test data generation. In proceedings of the international symposium on software testing and analysis (ISSTA 2006), (pp. 13–24). Portland, ME.

  • O’Reilly, U. M. (1997). Using a distance metric on genetic programs to understand genetic operators. In IEEE international conference on systems, man, and cybernetics, computational cybernetics and simulation, 5, 4092–4097.

  • Roper, M. (1997). Computer aided software testing using genetic algorithms. In Proceeding of the 10th international software quality week conference, San Francisco, USA, pp. 27–30.

  • Rosca, J. (1995). Entropy-driven adaptive representation. In Rosca, J. (Ed.), Proceedings of the workshop on genetic programming: From theory to real-world applications, Tahoe City, CA, pp. 23–32.

  • Voas, J., & Miller, K. (1995). Software testability: The new verification. IEEE Software, 12(3), 17–28.

    Article  Google Scholar 

  • Wappler, S., Baresel, A., & Wegener, J. (2007). Improving evolutionary testing in the presence of function-assigned flags. In Proceedings of the testing: Academic and industrial conference practice and research techniques, (pp. 23–34). IEEE Digital Library.

  • Wegener, J., Baresel, A., & Sthamer, H. (2001). Evolutionary test environment for automatic structural testing. Information and Software Technology Special Issue on Software Engineering using Metaheuristic Innovative Algorithms, 43(14), 841–854.

    Google Scholar 

  • Wu, S. H., Jandhyala, S., Malaiya, Y. K., & Jayasumana, A. P. (2008). Antirandom testing: A distance-based approach. VLSI Design, 2008(2), 1–9.

    Article  MathSciNet  Google Scholar 

Download references

Acknowledgments

The authors would like to express their gratitude to the anonymous referees for their valuable comments and suggestions for improving the paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Mohammad Alshraideh.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Alshraideh, M., Bottaci, L. & Mahafzah, B.A. Using program data-state scarcity to guide automatic test data generation. Software Qual J 18, 109–144 (2010). https://doi.org/10.1007/s11219-009-9083-x

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11219-009-9083-x

Keywords

Navigation