research-article

API-Knowledge Aware Search-Based Software Testing: Where, What, and How

Authors:
Xiaoxue Ren (Zhejiang University, Hangzhou, China)
Xinyuan Ye (Australian National University, Canberra, Australia)
Yun Lin (Shanghai Jiao Tong University, Shanghai, China)
Zhenchang Xing (CSIRO's Data61, Sydney, Australia / Australian National University, Canberra, Australia)
Shuqing Li (Chinese University of Hong Kong, Hong Kong, China)
Michael R. Lyu (Chinese University of Hong Kong, Hong Kong, China)

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2023, Pages 1320–1332
https://doi.org/10.1145/3611643.3616269

Published: 30 November 2023

ABSTRACT

Search-based software testing (SBST) has proved effective at generating test cases that achieve predefined test goals, such as branch and data-dependency coverage. However, predefined goals adapt poorly to diverse projects, which limits how effectively they can detect more program faults. In this work, we propose KAT, a novel knowledge-aware SBST approach that generates on-demand assertions in the program under test (PUT) based on the APIs it uses. KAT constructs an API knowledge graph from the API documentation to derive the constraints that client code must satisfy. Each constraint is instrumented into the PUT as a program branch, which serves as a test goal that guides SBST toward faults. We evaluate KAT against two baselines (EvoSuite and Catcher) in a closed-world and an open-world experiment on detecting API bugs. The closed-world experiment shows that KAT outperforms the baselines in F1-score (0.55 vs. 0.24 and 0.30) for detecting API-related bugs. The open-world experiment shows that KAT detects 59.64% and 9.05% more bugs than the baselines in practice.
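The idea of turning an API constraint into a branch that guides the search can be illustrated with a small sketch. It is not KAT's implementation (KAT targets Java programs and derives constraints from a knowledge graph). As a stand-in, it assumes a single hand-written constraint, namely that the argument of Python's `math.sqrt` must be non-negative, and uses a plain deterministic hill climber in place of a full SBST engine. The names `put`, `violation_distance`, and `search` are hypothetical.

```python
import math


def put(a: int, b: int) -> float:
    """Hypothetical program under test: passes a derived value to math.sqrt."""
    return math.sqrt(a - 2 * b)


def violation_distance(a: int, b: int) -> int:
    """Branch distance for the instrumented check `if arg < 0`.

    Assumed constraint: the argument of math.sqrt must be >= 0.
    Returns 0 when the constraint-violating branch would be taken,
    otherwise a positive value that shrinks as the input gets closer.
    """
    arg = a - 2 * b
    return 0 if arg < 0 else arg + 1


def search(a: int, b: int, max_steps: int = 1000):
    """Deterministic hill climbing on the branch distance (an SBST stand-in)."""
    current = (a, b)
    for _ in range(max_steps):
        dist = violation_distance(*current)
        if dist == 0:
            return current  # test goal covered: the constraint is violated
        # Explore the four unit-step neighbours of the current input.
        neighbours = [
            (current[0] + da, current[1] + db)
            for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1))
        ]
        best = min(neighbours, key=lambda p: violation_distance(*p))
        if violation_distance(*best) >= dist:
            return None  # stuck in a local optimum
        current = best
    return None


if __name__ == "__main__":
    found = search(10, 0)
    print("violating input:", found)
    try:
        put(*found)
    except ValueError as err:
        print("fault exposed:", err)
```

The branch distance plays the role of the fitness function: covering the instrumented branch means the search has found an input on which the client code breaks the API's documented constraint.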

Index Terms

1. Security and privacy
  1. Software and application security

Published in
ESEC/FSE 2023, November 2023, 2215 pages
ISBN: 9798400703270
DOI: 10.1145/3611643
General Chair: Satish Chandra (Google, USA)
Program Chairs: Kelly Blincoe (University of Auckland, New Zealand), Paolo Tonella (USI Lugano, Switzerland)
Copyright © 2023 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
Knowledge Graph
Software Testing
Test Case Generation
Acceptance Rates
Overall Acceptance Rate: 112 of 543 submissions, 21%