A systematic review of search-based testing for non-functional system properties

https://doi.org/10.1016/j.infsof.2008.12.005

Abstract

Search-based software testing is the application of metaheuristic search techniques to generate software tests. The test adequacy criterion is transformed into a fitness function, and a set of solutions in the search space is evaluated with respect to that fitness function using a metaheuristic search technique. The application of metaheuristic search techniques to testing is promising because exhaustive testing is infeasible given the size and complexity of software under test. Search-based software testing has been applied across the spectrum of test case design methods; this includes white-box (structural), black-box (functional) and grey-box (combination of structural and functional) testing. In addition, metaheuristic search techniques have also been applied to test non-functional properties. The overall objective of this systematic review is to examine existing work on non-functional search-based software testing (NFSBST). We are interested in the types of non-functional testing targeted using metaheuristic search techniques, the different fitness functions used in different types of search-based non-functional testing, and the challenges in applying these techniques. The systematic review is based on a comprehensive set of 35 articles, published between 1996 and 2007 and obtained after a multi-stage selection process. The results of the review show that metaheuristic search techniques have been applied for non-functional testing of execution time, quality of service, security, usability and safety. A variety of metaheuristic search techniques are found to be applicable for non-functional testing, including simulated annealing, tabu search, genetic algorithms, ant colony methods, grammatical evolution, genetic programming (and its variants including linear genetic programming) and swarm intelligence methods.
The review reports on different fitness functions used to guide the search for each of the categories of execution time, safety, usability, quality of service and security; along with a discussion of possible challenges in the application of metaheuristic search techniques.

Introduction

Search-based software engineering (SBSE) is the application of optimization techniques in solving software engineering problems [1], [2]. Optimization techniques are well suited to software engineering problems because these problems frequently involve competing constraints and require near-optimal solutions. Search-based software testing (SBST) research has attracted much attention in recent years as part of a general interest in SBSE approaches. The growing interest in SBST can be attributed to the fact that the generation of software tests is generally considered an undecidable problem, primarily due to the many possible combinations of a program’s input [3]. All approaches to SBST are based on the satisfaction of a certain test adequacy criterion represented by a fitness function [2]. McMinn [3] has written a comprehensive survey on search-based software test data generation. The survey shows the application of metaheuristics in white-box, black-box and grey-box testing. Within the domain of non-functional testing, the survey indicates the application of metaheuristic search techniques for checking the best-case and worst-case execution times (BCET, WCET) of real-time systems. McMinn highlights possible directions of future research into non-functional testing, which include searching for input situations that break memory or storage requirements, automatic detection of memory leaks, stress testing and security testing. Our work extends the survey by McMinn [3] as it analyses actual evidence supporting McMinn’s ideas of future directions in search-based testing of non-functional properties. Moreover, we anticipated studies making use of search-based techniques to test non-functional properties not highlighted by McMinn. This work also supports McMinn’s survey by finding further evidence on search-based execution time testing.
Another review by Mantere and Alander [4] highlights work using evolutionary computation within software engineering, especially software testing. According to the review, genetic algorithms are highly applicable in testing coverage, timings and parameter values, and in finding calculation tolerances, bottlenecks, and problematic input combinations and sequences. This study also extends and supports Mantere and Alander’s review by finding evidence in support of the proposed future extensions.

Within non-functional search-based software testing (NFSBST) research, it is both important and interesting to know the extent to which metaheuristic search techniques have been applied to non-functional testing, a question not covered by previous studies. This allows us to identify potential non-functional properties suitable for applying these techniques and provides an overview of existing non-functional properties tested using metaheuristic search techniques. In this paper, after identifying existing non-functional properties, we review each of the properties to determine any constraints and limitations. We also identify the range of different fitness functions used within each non-functional property, since the fitness function is crucial in guiding the search into promising areas of the solution space and is what differentiates the quality of different solutions. The contribution of this review is therefore an exploration of non-functional properties tested using metaheuristic search techniques, an identification of the constraints and limitations encountered, and an analysis of the different fitness functions used to test each non-functional property.
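To make the role of the fitness function concrete, the following is a minimal sketch of a search for a worst-case execution time. Everything in it is an assumption made for illustration and is not taken from any of the reviewed studies: the toy `execution_cost` function stands in for a measured execution time of a hypothetical program, the input domain is fixed to 0..1000, and a plain first-improvement hill climber is used in place of the metaheuristics named in the review (simulated annealing, genetic algorithms and so on) purely for brevity.

```python
import random

LOW, HIGH = 0, 1000  # assumed input domain for both parameters


def execution_cost(x: int, y: int) -> int:
    """Toy stand-in for the measured execution time of a hypothetical program."""
    cost = 1000 - abs(x - 700)  # the longest path is taken when x == 700
    if y % 2 == 0:
        cost += 10              # an extra branch is executed for even y
    return cost


def fitness(candidate):
    # A WCET search maximises the cost; a BCET search would minimise it.
    return execution_cost(*candidate)


def neighbours(candidate):
    """Inputs one step away in either parameter, clipped to the domain."""
    x, y = candidate
    moves = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in moves if LOW <= a <= HIGH and LOW <= b <= HIGH]


def hill_climb(rng, max_evaluations=20000):
    """First-improvement hill climbing from a random starting input."""
    current = (rng.randint(LOW, HIGH), rng.randint(LOW, HIGH))
    best = fitness(current)
    evaluations = 1
    improved = True
    while improved and evaluations < max_evaluations:
        improved = False
        for candidate in neighbours(current):
            evaluations += 1
            score = fitness(candidate)
            if score > best:
                # Accept the first neighbour that runs longer and restart the scan.
                current, best = candidate, score
                improved = True
                break
    return current, best


best_input, best_cost = hill_climb(random.Random(0))
print(best_input, best_cost)
```

Because this toy landscape has no local optima other than the global one, the climber ends at x = 700 with an even y and a cost of 1010. Real timing landscapes are typically far less regular, which is one reason the literature turns to techniques that can escape local optima.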

Section 2 describes the method of our systematic review, which includes the research questions, search strategy, study selection criteria, study quality assessment and data extraction. Sections 3 (Results and synthesis of findings) and 4 (Discussion and areas for future research) discuss the results, synthesis of findings, areas of future research and validity threats. Conclusions are presented in Section 6.

Method

A systematic review is a process of assessment and interpretation of all available research related to a research question or subject of interest [5]. Kitchenham [5] also describes several reasons for undertaking a systematic review; the most common are to synthesize the available research concerning a treatment or technology, to identify topics for further investigation and to formulate a background for positioning new research activities.

This section describes our review protocol,

Results and synthesis of findings

In this section we present a descriptive evaluation of the assessed literature in relation to the research questions. The 35 primary studies relate to the application of metaheuristic search techniques for testing five non-functional properties: execution time, quality of service (QoS), security, usability, and safety. The number of primary studies describing each non-functional property is: 15 (execution time), 2 (quality of service), 7 (security), 7 (usability) and 4 (safety).

Discussion and areas for future research

The body of knowledge on the use of metaheuristic search techniques for verifying temporal correctness is geared towards real-time embedded systems. For these systems, temporal correctness must be verified along with logical correctness. The lack of support for dynamic testing of real-time systems for temporal correctness led the research community to take advantage of metaheuristic search techniques. It is possible to differentiate the temporal testing research

Validity threats

There can be different threats to the validity of study results.

Conclusion validity refers to the statistically significant relationship between the treatment and the outcome [63]. One possible threat to conclusion validity is bias in applying quality assessment and data extraction. In order to minimize this threat, we explicitly define the inclusion and exclusion criteria, which we believe are detailed enough to provide an assessment of how we reached the final set of papers for analysis.

Conclusions

This systematic review investigated the use of metaheuristic search techniques for testing non-functional properties of the SUT. The 35 primary studies are distributed among execution time (15 papers), quality of service (2 papers), safety (4 papers), security (7 papers) and usability (7 papers). While scanning references, we also found two papers relating to robustness testing of autonomous vehicle controllers [64], [65] but we do not include these two papers in our review as they were outside

Acknowledgement

The authors would like to thank Barbara Ann Kitchenham and the anonymous referees for their comments on the earlier drafts of the paper.

References (65)

  • IEEE, IEEE recommended practice for software requirements specifications, Tech. Rep. 830–1998, Institute of Electrical...
  • D.G. Firesmith, Common concepts underlying safety, security, and survivability engineering, Tech. Rep.,...
  • J. Wegener et al., Verifying timing constraints of real-time systems by means of evolutionary testing, Real-Time Systems (1998)
  • M. Tlili et al., Improving evolutionary real-time testing
  • L.C. Briand et al., Stress testing real-time systems with genetic algorithms
  • B.A. Kitchenham et al., Cross versus within-company cost estimation studies: A systematic review, IEEE Transactions on Software Engineering (2007)
  • A. Baresel et al., Structural and functional sequence test of dynamic and state-based software with evolutionary algorithms
  • Y. Zhan et al., Search based automatic test-data generation at an architectural level
  • K. Derderian et al., Input sequence generation for testing of communicating finite state machines (CFSMs)
  • E. Alba et al., Finding safety errors with ACO
  • K.R. Walcott et al., Time-aware test suite prioritization
  • S. Bouktif et al., Simulated annealing for improving software quality prediction
  • H.G. Kayacik et al., Automatically evading IDS using GP authored attacks

  • T. Dybå, T. Dingsøyr, G.K. Hanssen, Applying systematic reviews to diverse study types: An experience report, in: ESEM...
  • J. Wegener, K. Grimm, M. Grochtmann, H. Sthamer, B. Jones, Systematic testing of real-time systems, in: EuroSTAR’96:...
  • J.T. Alander, T. Mantere, G. Moghadampour, J. Matila, Searching protection relay response time extremes using genetic...
  • J. Wegener et al., Testing real-time systems using genetic algorithms, Software Quality Control (1997)
  • M. O’Sullivan, S. Vössner, J. Wegener, Testing temporal correctness of real-time systems, in: EuroSTAR’98: Proceedings...
  • N. Tracey, J. Clark, K. Mander, The way forward for unifying dynamic test case generation: The optimisation-based...
  • F. Mueller et al., A comparison of static analysis and evolutionary testing for the verification of timing constraints, Real-Time Systems (1998)
  • P. Puschner et al., Testing the results of static worst-case execution-time analysis
