Abstract
Test suite reduction strategies aim to produce a smaller, representative suite that achieves the same coverage as the original one but is more cost-effective. In the model-based testing (MBT) context, reduction is crucial since automatic generation algorithms may blindly produce several similar test cases. To define the degree of similarity between test cases, researchers have investigated a number of distance functions. However, little or nothing is known about whether and how these functions influence the performance of reduction strategies, particularly in MBT practice. This paper investigates the effectiveness of distance functions in the scope of an MBT reduction strategy based on the degree of similarity between test cases. We discuss six distance functions and apply them in three empirical studies. The first two studies are controlled experiments focusing on two real-world applications with real faults and on ten synthetic specifications, automatically generated from the configuration of each application, with randomly generated faults. In the third study, we apply the reduction strategy to two subsequent versions of an industrial application, considering the real faults detected. Results show that the choice of a distance function has little influence on the size of the reduced test suite. However, because the reduced suites differ depending on the distance function applied, the choice can significantly affect fault coverage. Moreover, it can also affect the stability of the reduction strategy, that is, whether different executions cover different sets of faults.
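To make the idea of similarity-based reduction concrete, the following is a minimal sketch, not the strategy evaluated in this paper. It assumes that a test case is a sequence of step labels, that the set of steps stands in for the test requirements to be covered, and that the Jaccard distance serves only as a familiar example of a distance function. The names `jaccard_distance`, `covered`, and `reduce_suite` are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two test cases viewed as sets of steps."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty test cases are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


def covered(suite):
    """All steps (assumed stand-ins for test requirements) hit by a suite."""
    return {step for test in suite for step in test}


def reduce_suite(suite, distance=jaccard_distance):
    """Greedy reduction: visit pairs from most to least similar and drop
    one test of a pair whenever the rest still covers every original step."""
    required = covered(suite)
    indices = range(len(suite))
    pairs = sorted(combinations(indices, 2),
                   key=lambda p: distance(suite[p[0]], suite[p[1]]))
    removed = set()
    for i, j in pairs:
        if i in removed or j in removed:
            continue
        # Assumption: try to drop the shorter test of the pair first.
        for candidate in sorted((i, j), key=lambda k: (len(suite[k]), -k)):
            remaining = [suite[k] for k in indices
                         if k not in removed and k != candidate]
            if covered(remaining) == required:
                removed.add(candidate)
                break
    return [suite[k] for k in indices if k not in removed]


# Hypothetical suite of three test cases over five distinct steps.
suite = [
    ("login", "browse", "logout"),
    ("login", "browse", "buy", "logout"),
    ("login", "search", "logout"),
]
reduced = reduce_suite(suite)
print(len(suite), "->", len(reduced))
```

Passing a different function as `distance` changes the order in which pairs are visited, and therefore which test cases survive. This is one way in which reduced suites of similar size can still differ in the faults they reveal.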









Notes
Note that equations written as \(a = b = c\) represent \(a = b \wedge b = c \wedge a = c\).
Acknowledgments
This work was supported by CNPq grants (Processes 484643/2011-8 and 560014/2010-4). It was also partially supported by the National Institute of Science and Technology for Software Engineering (www.ines.org.br), funded by CNPq/Brasil, Grant 573964/2008-4. This work was developed in the context of a cooperation between UFCG and Ingenico do Brasil Ltda (Ingenico/UFCG 01/2013), under the incentives of the Brazilian Informatics Law no. 8.248, 1991. The first author is supported by the Center of Human and Exact Sciences (State University of Paraíba).
Cite this article
Coutinho, A.E.V.B., Cartaxo, E.G. & Machado, P.D.L. Analysis of distance functions for similarity-based test suite reduction in the context of model-based testing. Software Qual J 24, 407–445 (2016). https://doi.org/10.1007/s11219-014-9265-z
DOI: https://doi.org/10.1007/s11219-014-9265-z