DOI: 10.1145/3611643.3616269
research-article

API-Knowledge Aware Search-Based Software Testing: Where, What, and How

Published: 30 November 2023

ABSTRACT

Search-based software testing (SBST) has proved effective at generating test cases that achieve predefined test goals, such as branch and data-dependency coverage. However, predefined goals adapt poorly to diverse projects, which limits how effectively they detect program faults. In this work, we propose KAT, a novel knowledge-aware SBST approach that generates on-demand assertions in the program under test (PUT) based on the APIs it uses. KAT constructs an API knowledge graph from the API documentation to derive the constraints that client code needs to satisfy. Each constraint is instrumented into the PUT as a program branch, serving as a test goal that guides SBST to detect faults. We evaluate KAT against two baselines (EvoSuite and Catcher) in a closed-world and an open-world experiment on detecting API bugs. The closed-world experiment shows that KAT outperforms the baselines in F1-score (0.55 vs. 0.24 and 0.30) at detecting API-related bugs. The open-world experiment shows that KAT detects 59.64% and 9.05% more bugs than the baselines in practice.
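To illustrate what instrumenting a constraint "as a program branch" could look like, the following is a minimal hand-written sketch, not KAT's actual instrumentation. It assumes a hypothetical client method `clip` that calls `String.substring`, whose Javadoc states that an `IndexOutOfBoundsException` is thrown if `beginIndex` is negative, `endIndex` exceeds the string length, or `beginIndex` is greater than `endIndex`. The class name, the `reachedGoals` list, and the goal label are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class ConstraintBranchSketch {
    // Hypothetical record of which constraint-violation branches were reached.
    static final List<String> reachedGoals = new ArrayList<>();

    // Hypothetical client code that passes caller-supplied indices to String.substring.
    static String clip(String text, int begin, int end) {
        // Illustrative instrumented branch: the condition is the negation of the
        // documented precondition of String.substring(int, int). Covering the
        // true side of this branch means a test input violates the constraint.
        if (begin < 0 || end > text.length() || begin > end) {
            reachedGoals.add("substring-index-constraint");
        }
        return text.substring(begin, end);
    }

    public static void main(String[] args) {
        // A conforming call does not reach the violation branch.
        System.out.println(clip("hello", 1, 3));

        // A violating call reaches the branch, then the API itself throws.
        try {
            clip("hello", 4, 2);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("violation reached: " + reachedGoals);
        }
    }
}
```

Because the violation is now an ordinary branch, a coverage-driven search tool can treat "reach the true side of this `if`" as one more goal, in the same way it treats any other uncovered branch.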


    • Published in

      ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
      November 2023
      2215 pages
      ISBN: 9798400703270
      DOI: 10.1145/3611643

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 112 of 543 submissions, 21%