
Call Me Maybe: Using NLP to Automatically Generate Unit Test Cases Respecting Temporal Constraints

Published: 05 January 2023

Abstract

A class may need to obey temporal constraints in order to function correctly. For example, the correct usage protocol for an iterator is to always check whether there is a next element before asking for it; iterating over a collection when no items are left leads to a NoSuchElementException. Automatic test case generation tools such as Randoop and EvoSuite have no notion of these temporal constraints. Generating test cases by randomly invoking methods on a new instance of the class under test may raise runtime exceptions that do not necessarily expose software faults, but are rather consequences of violating temporal properties.
This paper presents CallMeMaybe, a novel technique that uses natural language processing to analyze Javadoc comments and identify temporal constraints. This information can guide a test case generator toward executing sequences of method calls that respect the temporal constraints. Our evaluation on 73 subjects from seven popular Java systems shows that CallMeMaybe achieves a precision of 83% and a recall of 70% when translating temporal constraints into Java expressions. For the two biggest subjects, the integration with Randoop flags 11,818 false alarms and enriches 12,024 correctly failing test cases due to violations of temporal constraints with clear explanations that can help software developers.
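
The iterator example from the abstract can be sketched as follows. This is a minimal illustration of the usage protocol, not code from the paper: hasNext() must temporally precede each call to next(), and calling next() on an exhausted iterator raises NoSuchElementException, an exception that reflects a protocol violation by the caller rather than a fault in the iterator itself.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class IteratorProtocol {
    public static void main(String[] args) {
        List<String> items = List.of("a", "b");

        // Correct usage: hasNext() guards every call to next(),
        // respecting the iterator's temporal constraint.
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }

        // Violating the constraint: next() on an exhausted iterator
        // throws NoSuchElementException. A test generator unaware of
        // the protocol would report this as a failing test, even
        // though it exposes no fault in the class under test.
        try {
            it.next();
        } catch (NoSuchElementException e) {
            System.out.println("protocol violation, not a fault");
        }
    }
}
```

A protocol-aware generator would either insert the hasNext() guard before next() or flag the resulting exception as expected behavior rather than a failure.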




Published In

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering
October 2022
2006 pages
ISBN: 9781450394758
DOI: 10.1145/3551349

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Specification inference
  2. automatic test case generation
  3. natural language processing
  4. software testing
  5. test oracle generation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Swiss National Science Foundation

Conference

ASE '22

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%


Cited By

  • (2024) Practitioners' Expectations on Automated Test Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1618–1630. DOI: 10.1145/3650212.3680386. Online publication date: 11-Sep-2024.
  • (2024) Towards Generating Contracts for Scientific Data Analysis Workflows. SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2048–2055. DOI: 10.1109/SCW63240.2024.00256. Online publication date: 17-Nov-2024.
  • (2024) Robustness-Enhanced Assertion Generation Method Based on Code Mutation and Attack Defense. Collaborative Computing: Networking, Applications and Worksharing, 281–300. DOI: 10.1007/978-3-031-54528-3_16. Online publication date: 23-Feb-2024.
  • (2023) SmartCoCo: Checking Comment-Code Inconsistency in Smart Contracts via Constraint Propagation and Binding. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), 294–306. DOI: 10.1109/ASE56229.2023.00142. Online publication date: 11-Sep-2023.
