short-paper

Generating counterexamples in the form of unit tests from Hoare-style verification attempts

Authors:
Amirfarhad Nilizadeh (University of Central Florida)
Marlon Calvo (University of Central Florida)
Gary T. Leavens (University of Central Florida)
David R. Cok (Safer Software Consulting, LLC)

FormaliSE '22: Proceedings of the IEEE/ACM 10th International Conference on Formal Methods in Software Engineering, May 2022, Pages 124–128. https://doi.org/10.1145/3524482.3527656

Published: 21 July 2022

ABSTRACT

Unit tests that demonstrate why a program is incorrect have many potential uses, including localizing bugs (i.e., showing where code is wrong), improving test suites, and better code synthesis. However, counterexamples produced by failed attempts at Hoare-style verification (e.g., by SMT solvers) are difficult to translate into unit tests. We explain how to generate unit tests from counterexamples generated by an SMT solver and how this process could be embodied in a prototype tool. This process combines static verification techniques and runtime assertion checking.
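
As a rough illustration of the process the abstract describes, the sketch below turns a solver-style counterexample (a mapping from parameter names to concrete values) into the source text of a unit test, then runs that test with the contract checked at run time. Everything in it is hypothetical and is not the authors' prototype tool: the buggy `abs_value` function, its contract, the `render_unit_test` helper, and the hard-coded counterexample are assumptions made for illustration, and no SMT solver is actually invoked.

```python
# Hypothetical sketch: not the paper's tool. The counterexample is hard-coded
# rather than obtained from an SMT solver.

def abs_value(x):
    """Buggy implementation. Intended contract: result >= 0 and result is x or -x."""
    if x > 0:
        return x
    return x  # bug: should be -x


def postcondition(x, result):
    """Runtime check of the intended postcondition."""
    return result >= 0 and (result == x or result == -x)


def render_unit_test(func_name, post_name, model):
    """Render a counterexample model (parameter name -> value) as test source."""
    args = ", ".join(f"{name}={value!r}" for name, value in sorted(model.items()))
    return (
        f"def test_{func_name}_counterexample():\n"
        f"    result = {func_name}({args})\n"
        f"    assert {post_name}({args}, result=result)\n"
    )


# Assumed solver output: an input on which verification of the contract failed.
counterexample = {"x": -3}
test_source = render_unit_test("abs_value", "postcondition", counterexample)

# Define the generated test in a namespace that holds the code under test.
namespace = {"abs_value": abs_value, "postcondition": postcondition}
exec(test_source, namespace)

# Run the generated test and record whether the runtime assertion fails.
try:
    namespace["test_abs_value_counterexample"]()
    test_failed = True if False else False
except AssertionError:
    test_failed = True

print(test_source)
print("generated test fails on buggy code:", test_failed)
```

A failing test of this kind reproduces the verification failure at run time. If the second branch of `abs_value` is corrected to return `-x`, the same generated test passes.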

Published in
FormaliSE '22: Proceedings of the IEEE/ACM 10th International Conference on Formal Methods in Software Engineering
May 2022
137 pages
ISBN: 9781450392877
DOI: 10.1145/3524482
General Chairs: Stefania Gnesi (ISTI-CNR, Italy), Nico Plat (Thanos, The Netherlands)
Program Chairs: Arnd Hartmanns (University of Twente, The Netherlands), Ina Schaefer (Technische Universität Braunschweig, Germany)
Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States