Automatic equivalence checking of programs with uninterpreted functions and integer arithmetic

  • SPIN 2013
International Journal on Software Tools for Technology Transfer

Abstract

Proving equivalence of programs has several important applications, including algorithm recognition, regression checking, compiler optimization verification and validation, and information flow checking. Despite these applications, program equivalence checking has seen few advances over the past decades due to its inherent high complexity. In this paper, we propose, to the best of our knowledge, the first semi-algorithm for the automatic verification of partial equivalence of two programs over the combined theory of uninterpreted function symbols and integer arithmetic (UF+IA). The proposed algorithm supports, in particular, programs with nested loops. The crux of the technique is a transformation of uninterpreted function (UF) applications into integer polynomials, which enables the precise summarization of loops with UF applications using recurrences. The equivalence checking algorithm then proceeds on loop-free, integer-only programs. We implemented the proposed technique in CORK, a tool that automatically verifies the correctness of compiler optimizations, and we show that it can prove more optimizations correct than state-of-the-art techniques.

Notes

  1. As an anecdote, when developing this example, we forgot the \(\mathbf{if } \) command in the program on the right. Fortunately, our prototype quickly pointed out our mistake (of different values of \(i\) at the end of the programs when the loops do not execute).

  2. Prototype and benchmarks available from http://web.ist.utl.pt/nuno.lopes/cork/.

References

  1. Aho, A.V., Lam, M.S., Sethi, R., Ullman, J.D.: Compilers: principles, techniques, and tools, 2nd edn. Addison-Wesley (2006)

  2. Albarghouthi, A., Li, Y., Gurfinkel, A., Chechik, M.: UFO: a framework for abstraction- and interpolation-based software verification. In: Proceedings of the 24th International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg (2012)

  3. Alias, C., Barthou, D.: On the recognition of algorithm templates. In: Proceedings of the 2nd International Workshop on Compiler Optimization Meets Compiler Verification. Elsevier, Amsterdam (2003)

  4. Ball, T., Rajamani, S.K.: The SLAM project: debugging system software via static analysis. In: Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2002)

  5. Barthe, G., D’Argenio, P.R., Rezk, T.: Secure information flow by self-composition. In: Proceedings of the 17th IEEE Workshop on Computer Security Foundations. IEEE Computer Society, Washington (2004)

  6. Barthe, G., Crespo, J.M., Kunz, C.: Relational verification using product programs. In: Proceedings of the 17th International Conference on Formal Methods. Springer, Berlin, Heidelberg (2011)

  7. Barthou, D., Feautrier, P., Redon, X.: On the equivalence of two systems of affine recurrence equations. In: Proceedings of the 8th International Euro-Par Conference on Parallel Processing. Springer, Berlin, Heidelberg (2002)

  8. Benton, N.: Simple relational correctness proofs for static analyses and program transformations. In: Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2004)

  9. Bertot, Y., Castéran, P.: Interactive theorem proving and program development. Coq’Art: the calculus of inductive constructions. Springer–Verlag (2004)

  10. Beyer, D., Keremoglu, M.E.: CPAchecker: a tool for configurable software verification. In: Proceedings of the 23rd International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg (2011)

  11. Beyer, D., Henzinger, T.A., Majumdar, R., Rybalchenko, A.: Invariant synthesis for combined theories. In: Proceedings of the 8th International Conference on Verification, Model Checking, and Abstract Interpretation. Springer, Berlin, Heidelberg (2007)

  12. Blanc, R., Henzinger, T.A., Hottelier, T., Kovács, L.: ABC: algebraic bound computation for loops. In: Proceedings of the 16th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning. Springer, Berlin, Heidelberg (2010)

  13. Bozga, M., Iosif, R., Konečný, F.: Fast acceleration of ultimately periodic relations. In: Proceedings of the 22nd International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg (2010)

  14. Cachera, D., Jensen, T., Jobin, A., Kirchner, F.: Inference of polynomial invariants for imperative programs: a farewell to Gröbner bases. In: Proceedings of the 19th International Conference on Static Analysis. Springer, Berlin, Heidelberg (2012)

  15. Cadar, C., Dunbar, D., Engler, D.: KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs. In: Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation. USENIX Association, Berkeley (2008)

  16. Cahen, P.-J., Chabert, J.-L.: Integer-valued polynomials. In: Mathematical Surveys and Monographs. vol. 48. American Mathematical Society, Rhode Island (1997)

  17. Cahen, P.-J., Chabert, J.-L., Frisch, S.: Interpolation domains. J. Algebra 225(2), 794–803 (2000)

  18. Caniart, N., Fleury, E., Leroux, J., Zeitoun, M.: Accelerating interpolation-based model-checking. In: Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, Berlin, Heidelberg (2008)

  19. Chaki, S., Gurfinkel, A., Strichman, O.: Regression verification for multi-threaded programs. In: Proceedings of the 13th International Conference on Verification, Model Checking, and Abstract Interpretation. Springer, Berlin, Heidelberg (2012)

  20. Cousot, P., Halbwachs, N.: Automatic discovery of linear restraints among variables of a program. In: Proceedings of the 5th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages. ACM, New York (1978)

  21. Dai, L., Xia, B., Zhan, N.: Generating non-linear interpolants by semidefinite programming. In: Proceedings of the 25th International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg (2013)

  22. de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, Berlin, Heidelberg (2008)

  23. Dissegna, S., Logozzo, F., Ranzato, F.: Tracing compilation by abstract interpretation. In: Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2014)

  24. Frisch, S.: Interpolation by integer-valued polynomials. J. Algebra 211(2), 562–577 (1999)

  25. Gasca, M., Sauer, T.: On the history of multivariate polynomial interpolation. J. Comput. Appl. Math. 122(1–2), 23–35 (2000)

  26. Godlin, B., Strichman, O.: Inference rules for proving the equivalence of recursive procedures. Acta Inform. 45(6), 403–439 (2008)

  27. Godlin, B., Strichman, O.: Regression verification. In: Proceedings of the 46th Annual Design Automation Conference. IEEE Computer Society, Washington (2009)

  28. Goldberg, B., Zuck, L., Barrett, C.: Into the loops: practical issues in translation validation for optimizing compilers. Electron. Notes Theor. Comput. Sci. 132(1), 53–71 (2005)

  29. Gonnord, L., Schrammel, P.: Abstract acceleration in linear relation analysis. Sci. Comput. Program. 93, 125–153 (2014)

  30. Grebenshchikov, S., Lopes, N.P., Popeea, C., Rybalchenko, A.: Synthesizing software verifiers from proof rules. In: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, New York (2012)

  31. Gulwani, S., Tiwari, A.: Assertion checking over combined abstraction of linear arithmetic and uninterpreted functions. In: Proceedings of the 15th European Conference on Programming Languages and Systems. Springer, Berlin, Heidelberg (2006)

  32. Gulwani, S., Mehra, K.K., Chilimbi, T.: SPEED: precise and efficient static estimation of program computational complexity. In: Proceedings of the 36th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2009)

  33. Gulwani, S., Srivastava, S., Venkatesan, R.: Constraint-based invariant inference over predicate abstraction. In: Proceedings of the 10th International Conference on Predicate Abstraction. Springer, Berlin, Heidelberg (2009)

  34. Guo, S.-Y., Palsberg, J.: The essence of compiling with traces. In: Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2011)

  35. Gupta, A., Popeea, C., Rybalchenko, A.: Solving recursion-free Horn clauses over LI+UIF. In: Proceedings of the 9th Asian Conference on Programming Languages and Systems. Springer, Berlin, Heidelberg (2011)

  36. Hawblitzel, C., Kawaguchi, M., Lahiri, S.K., Rebêlo, H.: Towards modularly comparing programs using automated theorem provers. In: Proceedings of the 24th International Conference on Automated Deduction. Springer, Berlin, Heidelberg (2013)

  37. Henzinger, T.A., Jhala, R., Majumdar, R., McMillan, K.L.: Abstractions from proofs. In: Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2004)

  38. Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Lazy abstraction. In: Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2002)

  39. Hojjat, H., Iosif, R., Konečný, F., Kuncak, V., Rümmer, P.: Accelerating interpolants. In: Proceedings of the 10th International Conference on Automated Technology for Verification and Analysis. Springer, Berlin, Heidelberg (2012)

  40. Jhala, R., Majumdar, R.: Software model checking. ACM Comput. Surv. 41(4), 21:1–21:54 (2009)

  41. Kundu, S., Tatlock, Z., Lerner, S.: Proving optimizations correct using parameterized program equivalence. In: Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, New York (2009)

  42. Lahiri, S.K., Hawblitzel, C., Kawaguchi, M., Rebêlo, H.: SymDiff: a language-agnostic semantic diff tool for imperative programs. In: Proceedings of the 24th International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg (2012)

  43. Le, V., Afshari, M., Su, Z.: Compiler validation via equivalence modulo inputs. In: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, New York (2014)

  44. Leroy, X.: Formal verification of a realistic compiler. Commun. ACM 52(7), 107–115 (2009)

  45. Li, Y., Albarghouthi, A., Kincaid, Z., Gurfinkel, A., Chechik, M.: Symbolic optimization with SMT solvers. In: Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2014)

  46. Liang, H., Feng, X., Fu, M.: A rely-guarantee-based simulation for verifying concurrent program transformations. In: Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2012)

  47. Lopes, N.P., Monteiro, J.: Automatic equivalence checking of UF+IA programs. In: Proceedings of the 20th International SPIN Symposium on Model Checking of Software. Springer, Berlin, Heidelberg (2013)

  48. Lopes, N.P., Monteiro, J.: Weakest precondition synthesis for compiler optimizations. In: Proceedings of the 15th International Conference on Verification, Model Checking, and Abstract Interpretation. Springer, Berlin, Heidelberg (2014)

  49. Matsumoto, T., Saito, H., Fujita, M.: Equivalence checking of C programs by locally performing symbolic simulation on dependence graphs. In: Proceedings of the 7th International Symposium on Quality Electronic Design. IEEE Computer Society, Washington (2006)

  50. McMillan, K.L.: Interpolants from Z3 proofs. In: Proceedings of the International Conference on Formal Methods in Computer-Aided Design. Springer, Berlin, Heidelberg (2011)

  51. McMillan, K.L., Rybalchenko, A.: Computing relational fixed points using interpolation. Technical Report MSR-TR-2013-6 (2013)

  52. Muchnick, S.S.: Advanced compiler design and implementation. Morgan Kaufmann, San Francisco (1997)

  53. Müller-Olm, M., Seidl, H.: Computing polynomial program invariants. Inf. Process. Lett. 91(5), 233–244 (2004)

  54. Namjoshi, K.S., Zuck, L.D.: Witnessing program transformations. In: Proceedings of the 20th International Conference on Static Analysis. Springer, Berlin, Heidelberg (2013)

  55. Necula, G.C.: Translation validation for an optimizing compiler. In: Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation. ACM, New York (2000)

  56. Olver, P.J.: On multivariate interpolation. Stud. Appl. Math. 116(2), 201–240 (2006)

  57. Person, S., Dwyer, M.B., Elbaum, S., Pǎsǎreanu, C.S.: Differential symbolic execution. In: Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering. ACM, New York (2008)

  58. Pnueli, A., Siegel, M., Singerman, E.: Translation validation. In: Proceedings of the 4th International Conference on Tools and Algorithms for Construction and Analysis of Systems. Springer, Berlin, Heidelberg (1998)

  59. Ramos, D.A., Engler, D.R.: Practical, low-effort equivalence verification of real code. In: Proceedings of the 23rd International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg (2011)

  60. Rodríguez-Carbonell, E., Kapur, D.: Generating all polynomial invariants in simple loops. J. Symb. Comput. 42(4), 443–476 (2007)

  61. Rybalchenko, A., Sofronie-Stokkermans, V.: Constraint solving for interpolation. In: Proceedings of the 8th International Conference on Verification, Model Checking, and Abstract Interpretation. Springer, Berlin, Heidelberg (2007)

  62. Sangiorgi, D.: On the origins of bisimulation and coinduction. ACM Trans. Program. Lang. Syst. 31(4), 15:1–15:41 (2009)

  63. Sankaranarayanan, S., Sipma, H.B., Manna, Z.: Non-linear loop invariant generation using Gröbner bases. In: Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, New York (2004)

  64. Sharma, R., Dillig, I., Dillig, T., Aiken, A.: Simplifying loop invariant generation using splitter predicates. In: Proceedings of the 23rd International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg (2011)

  65. Shashidhar, K.C., Bruynooghe, M., Catthoor, F., Janssens, G.: Verification of source code transformations by program equivalence checking. In: Proceedings of the 14th International Conference on Compiler Construction. Springer, Berlin, Heidelberg (2005)

  66. Stepp, M., Tate, R., Lerner, S.: Equality-based translation validator for LLVM. In: Proceedings of the 23rd International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg (2011)

  67. Strang, G.: Linear algebra and its applications, 2nd edn. Academic Press, New York (1980)

  68. Tatlock, Z., Lerner, S.: Bringing extensibility to verified compilers. In: Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, New York (2010)

  69. Terauchi, T., Aiken, A.: Secure information flow as a safety problem. In: Proceedings of the 12th International Conference on Static Analysis. Springer, Berlin, Heidelberg (2005)

  70. Tristan, J.-B., Govereau, P., Morrisett, G.: Evaluating value-graph translation validation for LLVM. In: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, New York (2011)

  71. Verdoolaege, S., Janssens, G., Bruynooghe, M.: Equivalence checking of static affine programs using widening to handle recurrences. In: Proceedings of the 21st International Conference on Computer Aided Verification. Springer, Berlin, Heidelberg (2009)

  72. Yang, X., Chen, Y., Eide, E., Regehr, J.: Finding and understanding bugs in C compilers. In: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, New York (2011)

  73. Zaks, A., Pnueli, A.: CoVaC: Compiler validation by program analysis of the cross-product. In: Proceedings of the 15th International Symposium on Formal Methods. Springer, Berlin, Heidelberg (2008)

  74. Zhao, J., Nagarakatte, S., Martin, M.M.K., Zdancewic, S.: Formal verification of SSA-based optimizations for LLVM. In: Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, New York (2013)

  75. Zuck, L., Pnueli, A., Goldberg, B., Barrett, C., Fang, Y., Hu, Y.: Translation and run-time validation of loop transformations. Form. Methods Syst. Des. 27(3), 335–360 (2005)

Acknowledgments

The authors thank João Pedro Afonso, Ruslán Ledesma-Garza, and the anonymous reviewers for their comments and suggestions on earlier drafts of this paper, as well as João Pimentel Nunes for insightful discussions. This work was partially supported by the FCT grants SFRH/BD/63609/2009 and INESC-ID multiannual funding PEst-OE/EEI/LA0021/2013.

Author information

Corresponding author

Correspondence to Nuno P. Lopes.

Appendices

Appendix A: Proof of soundness and completeness

Let \({\mathsf {S}}^{P}(\sigma _{})\) be a copy of state \(\sigma _{}\) where the interpretation of every uninterpreted function (UF) symbol is replaced with values for variables \(f_i\) used in Sect. 4.2.2. Moreover, the values for variables \(f_i\) and the initial variable values \(v_0\) are chosen such that for every boolean expression \(b\) appearing in program \(P_{}\), it is guaranteed that \(\sigma _{}(b) = {\mathsf {S}}^{P}(\sigma _{})({\mathsf {T}}(b))\). The justification of the existence of such an assignment is given in Theorem 1.

In this section, we use the term free variable to denote logic or program variables (depending on the context) that are not constrained and, therefore, can take any value. In particular, a free program variable is never assigned to and cannot be constrained in any program path.

Let \({\mathbb {Q}}\), \({\mathbb {R}}\), and \({\mathbb {C}}\) be, respectively, the set of rational, real, and complex numbers.

Lemma 1

(Solution for nested \(f(x)\)). For a function \(f(x)=a_n\,x^n+\cdots +a_1\,x+a_0\), an arbitrary number of nested applications of \(f\) to \(x\) can take any value in the codomain (\({\mathbb {R}}\) or \({\mathbb {C}}\)) if \(x\) and \(a_n\) are free, i.e., \(f(f(\cdots f(x)\cdots ))=b\) always has a solution for fixed \(a_{n-1},\ldots ,a_0,b\) and free \(a_n\) and \(x\).

Proof

The maximum degree of the polynomial given by \(p=f(f(\cdots f(x)\cdots ))-b\) in \(x\) is \(n^k\), where \(k > 0\) is the number of applications of \(f\). If \(n\) (the degree of \(x\) in \(f(x)\)) is odd, then \(n^k\) will be odd as well. Therefore, if \(n\) is odd, it follows from the intermediate value theorem [67] that there always exists a value for \(x\) for arbitrary \(a_n,\ldots ,a_0,b\) such that \(p=0\).

The maximum degree of \(a_n\) in \(p\) is given by the following recurrence: \(d(k)=n\,d(k-1)+1\) and \(d(0)=0\). The closed-form solution for this recurrence is \(d(k)=\frac{n^k-1}{n-1}\) for \(n \ne 1\). Now assume that \(n\) is even, since we already proved the lemma for \(n\) odd. Since \(d(k)=1+n+\cdots +n^{k-1}\) is then one plus a sum of even terms, we can conclude that \(d(k)\) is odd for any positive \(k\) and, therefore, there exists \(a_n\) for arbitrary \(a_{n-1},\ldots ,a_0,b,x\) such that \(p=0\). \(\square \)
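The degree argument in this proof can be checked mechanically on small instances. The following is a minimal sketch, not part of the authors' tool: it composes \(f\) with itself symbolically, fixing the lower coefficients \(a_{n-1},\ldots ,a_0\) to 1 (an assumption made only so that no terms cancel), and compares the observed degrees with \(n^k\) and \(d(k)\). All function names are hypothetical.

```python
from collections import defaultdict


def d_recurrence(n, k):
    """d(k) = n*d(k-1) + 1 with d(0) = 0, as stated in the proof."""
    d = 0
    for _ in range(k):
        d = n * d + 1
    return d


def d_closed_form(n, k):
    """Closed form (n^k - 1)/(n - 1); only valid for n >= 2."""
    return (n**k - 1) // (n - 1)


def poly_mul(p, q):
    """Multiply polynomials in (a, x) stored as {(deg_a, deg_x): coeff}."""
    out = defaultdict(int)
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            out[(i1 + i2, j1 + j2)] += c1 * c2
    return dict(out)


def apply_f(p, n):
    """Return f(p) = a*p^n + p^(n-1) + ... + p + 1 (lower coefficients set to 1)."""
    result = {(0, 0): 1}
    power = {(0, 0): 1}
    for m in range(1, n + 1):
        power = poly_mul(power, p)
        lead = {(1, 0): 1} if m == n else {(0, 0): 1}
        for key, c in poly_mul(lead, power).items():
            result[key] = result.get(key, 0) + c
    return result


def nested_degrees(n, k):
    """Degrees in a and in x of f applied k times to x."""
    p = {(0, 1): 1}  # the polynomial "x"
    for _ in range(k):
        p = apply_f(p, n)
    return max(i for i, _ in p), max(j for _, j in p)


for n, k in [(2, 1), (2, 2), (2, 3), (4, 1), (4, 2)]:
    deg_a, deg_x = nested_degrees(n, k)
    assert deg_x == n**k
    assert deg_a == d_recurrence(n, k) == d_closed_form(n, k)
    assert deg_a % 2 == 1  # odd for even n and k > 0
    print(n, k, deg_a, deg_x)
```

All coefficients in this sketch are positive, so the computed degrees are exact upper bounds rather than artifacts of cancellation; with other fixed coefficients the degree could in principle be lower.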

Lemma 2

(Solution for conjunction of nested \(f(x)\)). For a function \(f(x)=a_n\,x^n+\cdots +a_1\,x+a_0\), a conjunction of nested applications of \(f\) of the form \(f(f(\cdots f(x_1)\cdots ))=b_1 \mathrel {\wedge }\cdots \mathrel {\wedge }f(f(\cdots f(x_q)\cdots ))=b_q\) is satisfiable if any of the following statements holds:

  1. Coefficients \(a_i\) range over \({\mathbb {C}}\) and at least \(q \le n+1\) of those are free.

  2. Variables \(x_i\) range over \({\mathbb {C}}\) and are free.

  3. \(n\) is odd and variables \(x_i\) range over \({\mathbb {R}}\) and are free.

  4. \(q=1\) and \(a_n\) and \(x_1\) range over \({\mathbb {R}}\) and are free.

Proof

For condition 1, we note that there is at least one free coefficient \(a_i\) for each polynomial in the conjunction. Then it follows from the fundamental theorem of algebra [67] that it is always possible to find values for the free \(a_i\) that satisfy the equalities. Similar reasoning applies for condition 2, where each equality can be solved for its respective \(x_i\).

Conditions 3 and 4 follow directly from Lemma 1. \(\square \)

We now state under which conditions the transformation \({\mathsf {T}}\) as given in Sect. 4.2.2 is sound and complete, which we will use later to prove Theorems 1 and 2.

Definition 1

\({\mathsf {T}}\) is sound and complete if one of the following statements holds:

  1. There are no nested applications of UFs in loops and program variables range over \({\mathbb {Q}}\), \({\mathbb {R}}\), or \({\mathbb {C}}\).

  2. Variables \(f_i\) range over \({\mathbb {C}}\).

  3. There is only one nested UF application produced by a loop, say \(f(f(\cdots f(x)\cdots ))\), with \(x\) being free, and variables \(f_i\) and \(x\) ranging over \({\mathbb {R}}\).

  4. \({\mathsf {u}}(f)\) is odd for all \(f\) appearing in nested applications in loops, and the inputs to these applications are variables that are free and range over \({\mathbb {R}}\).

We note that \({\mathsf {u}}(f)\) can always be arbitrarily increased (to, e.g., become odd) if need be. Also, to guarantee soundness, program variables and polynomial coefficients can be changed to take values in larger domains (say, convert from \({\mathbb {Z}}\) to \({\mathbb {R}}\)), by giving up on completeness. With such a change, the algorithm remains sound, i.e., if it proves that two programs are equivalent then they are. However, losing completeness means that the algorithm may fail to prove equivalence of two equivalent programs because a larger variable domain may increase the set of possible behaviors/outcomes of a program, which can lead to the loss of equivalence.

Definition 2

We define statically implied equalities of UF symbols as the set of all equalities involving applications of UFs that are implied by any static path in a given program (e.g., \(f(x) = 3\)). Nested applications of UFs arising from loops are not unfolded. For example, for a program “\(\mathbf{while }\ \ldots \ \mathbf{do }\ x :=f(x)\)” and a path that traverses the loop three times, we only consider the equality \(x=f(f(f(x_0)))\).

Theorem 1

(Existence of \({\mathsf {S}}^{P}(\sigma _{0})\)). For every program \(P_{}\) respecting Definition 1, \(\sigma _{0}\) is a possible initial state of \(P_{}\) iff \({\mathsf {S}}^{P}(\sigma _{0})\) also is.

Proof

If \(P\) does not contain any application of UF symbols, then the statement is trivially correct, since \(P = {\mathsf {T}}(P_{})\) and therefore \(\sigma _{0}={\mathsf {S}}^{P}(\sigma _{0})\).

Otherwise, we consider the set of statically implied equalities of UF symbols. Let \(c\) be the conjunction of the elements of said set that refer only to non-nested UF applications, and \(r\) the conjunction of the remaining elements (arbitrarily nested applications from loops). Moreover, we trivially have that \(\sigma _{0}(b) = \sigma _{0}(c \mathrel {\wedge }r)\).

We now assume that all UFs have only one input parameter and that there is only one UF symbol \(f\). Therefore, \({\mathsf {T}}(c)\) can be seen as a linear system \(Ax = b\), where \(A\) is a square matrix of size \(n \times n\) with the powers \(0\) to \((n-1)\) of the input parameters of the UF applications, and \(x\) is a vector with fresh variables \(f_i\). Moreover, \(A\) is a Vandermonde matrix [67].

For example, for \(c = f(x_1) = b_1 \mathrel {\wedge }\cdots \mathrel {\wedge }f(x_n) = b_n\), \({\mathsf {T}}(c)\) results in the following linear system:

$$\begin{aligned} \begin{bmatrix} 1&\quad x_1^1&\quad \cdots&\quad x_1^{n-1}\\ 1&\quad \vdots&\quad \ddots&\quad \vdots \\ 1&\quad x_n^1&\quad \cdots&\quad x_n^{n-1} \end{bmatrix} \begin{bmatrix} f_1\\ \vdots \\ f_n \end{bmatrix} = \begin{bmatrix} b_1\\ \vdots \\ b_n \end{bmatrix} \end{aligned}$$

If \(x_i \ne x_j\) for all \(i \ne j\), then \(c\) is satisfiable. Moreover, the rows and the columns of the coefficient matrix \(A\) are linearly independent, which guarantees that the system has a solution (by the unisolvence theorem [67]). Therefore, \({\mathsf {T}}(c)\) is also satisfiable.

If there are \(i,j\) with \(i \ne j\) such that \(x_i = x_j\), and \(c\) is satisfiable, then \(b_i=b_j\). In this case, the corresponding system of \({\mathsf {T}}(c)\) has infinitely many solutions, and therefore \({\mathsf {T}}(c)\) is satisfiable as well.

Finally, if \(c\) is unsatisfiable, then there are \(i,j\) with \(i \ne j\) such that \(x_i = x_j\) and \(b_i \ne b_j\). The linear system of \({\mathsf {T}}(c)\) has no solution, and therefore \({\mathsf {T}}(c)\) is unsatisfiable as well.
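The three cases of this argument can be reproduced on concrete numbers. The sketch below is an illustration under stated assumptions, not the paper's implementation: it builds the Vandermonde system for a single unary UF symbol over \({\mathbb {Q}}\) (modeled with Python's `fractions.Fraction`) and classifies it by Gaussian elimination. The names `vandermonde` and `classify` are hypothetical.

```python
from fractions import Fraction


def vandermonde(xs):
    """Rows [1, x, x^2, ..., x^(n-1)] for each input x, as in the system Ax = b."""
    n = len(xs)
    return [[Fraction(x) ** j for j in range(n)] for x in xs]


def classify(xs, bs):
    """Solve the system for the coefficients f_1..f_n over the rationals.

    Returns ("unique", coeffs), ("infinite", None) or ("none", None).
    """
    n = len(xs)
    rows = [row + [Fraction(b)] for row, b in zip(vandermonde(xs), bs)]
    rank = 0
    for col in range(n):
        pivot = next((r for r in range(rank, n) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        pv = rows[rank][col]
        rows[rank] = [v / pv for v in rows[rank]]
        for r in range(n):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[rank])]
        rank += 1
    # A zero row with a nonzero right-hand side means the system is inconsistent.
    if any(all(v == 0 for v in row[:n]) and row[n] != 0 for row in rows):
        return "none", None
    if rank < n:
        return "infinite", None
    return "unique", [rows[i][n] for i in range(n)]


# Distinct inputs: f(0)=1, f(1)=3, f(2)=7 has exactly one interpolating polynomial.
print(classify([0, 1, 2], [1, 3, 7]))
# Repeated input with equal outputs: infinitely many solutions.
print(classify([1, 1, 2], [5, 5, 7]))
# Repeated input with different outputs: f(1)=5 and f(1)=6 cannot both hold.
print(classify([1, 1, 2], [5, 6, 7]))
```

Exact rational arithmetic is used instead of floating point so that the rank test is not affected by rounding.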

If \(c\) is unsatisfiable, then there is no interpretation for the UF symbols that makes \(c\) be \(\mathsf {true}\), and therefore we have \(\sigma _{0}(c)={\mathsf {S}}^{P}(\sigma _{0})({\mathsf {T}}(c))=\mathsf {false}\). If \(c\) is satisfiable, then \(\sigma _{0}(c)\) may or may not be \(\mathsf {true}\) depending on the interpretation of the UFs in \(\sigma _{0}\), but we are always guaranteed to be able to find coefficients for \({\mathsf {S}}^{P}(\sigma _{0})\) either way such that \(\sigma _{0}(c)={\mathsf {S}}^{P}(\sigma _{0})({\mathsf {T}}(c))\).

If \(c\) contains UF symbols with more than one parameter, then the resulting polynomials in \({\mathsf {T}}(c)\) are more complex. Similar reasoning can be done by using generalized versions of the unisolvence theorem for multivariate polynomial interpolation (cf. [25, 56]).

If \(c\) contains multiple UF symbols, the evaluation of \(c\) can be split in multiple linear systems, one per symbol.

If \(r\) is empty, then the proof is completed.

Otherwise, let \(\#c\) and \(\#r\) be, respectively, the number of equalities in \(c\) and \(r\). By definition of \({\mathsf {u}}\), we have for any \(f\) that \(\#c+\#r \le {\mathsf {u}}(f)\). Moreover, only the first \(f_1,\ldots ,f_{\#c}\) coefficients are defined by \(c\), and the remaining \(f_{\#c+1},\ldots ,f_{{\mathsf {u}}(f)}\) remain free. Therefore, the proof follows immediately from Lemma 2.

For UFs with more than one input parameter, Lemma 2 also applies by observing that the degree of the polynomial obtained by nested applications is dominated by the nested input variable and the coefficient of that parameter. \(\square \)

From Theorem 1, it follows that if \({\mathsf {u}}(f)\) is odd, then \({\mathsf {u}}(f)\) does not need to count applications with free variables as input. This fact can be used as an optimization to reduce the degree of polynomials to the smallest odd number that is greater than or equal to the number of applications with non-free inputs.

Theorem 2

(Soundness and completeness of \({\mathsf {T}}\)). Transformation \({\mathsf {T}}\) preserves safety of programs, i.e., for any state \(\sigma _{0}\) and program \(P_{}\) respecting Definition 1, the following holds:

$$\begin{aligned} \langle P_{},\ \sigma _{0} \rangle \rightarrow ^* \sigma _{} \iff \langle {\mathsf {T}}(P_{}),\ {\mathsf {S}}^{P}(\sigma _{0}) \rangle \rightarrow ^* \sigma _{}^{\prime } \end{aligned}$$

Proof

The proof goes by structural induction on the syntax of \(P_{}\).

The base cases are: \(P_{}=\mathbf{skip } \), \(P_{}= v :=e\), and \(P_{}=\mathbf{abort } \), which are all trivially correct.

For the induction step, we need to consider three cases. As the induction hypothesis, assume that the theorem holds for commands \(c_1\) and \(c_2\).

For \(P_{} = \mathbf{if }\ b\ \mathbf{do }\ c_1\ \mathbf{else }\ c_2\), we have \({\mathsf {T}}(P_{}) = \mathbf{if }\ {\mathsf {T}}(b)\ \mathbf{do }\ {\mathsf {T}}(c_1)\ \mathbf{else }\ {\mathsf {T}}(c_2)\). By definition of \({\mathsf {S}}^{P}(.)\), we know that \(\sigma _{}(b)={\mathsf {S}}^{P}(\sigma _{})({\mathsf {T}}(b))\) and therefore \(P_{}\) reduces to \(c_1\) (resp. \(c_2\)) iff \({\mathsf {T}}(P_{})\) reduces to \({\mathsf {T}}(c_1)\) (resp. \({\mathsf {T}}(c_2)\)).

For \(P_{}= \mathbf{while }\ b\ \mathbf{do }\ c_1\), we note that since \(b\) can neither include nor depend on UF applications (per restriction 4 in Sect. 4.1), \({\mathsf {T}}(b)=b\), and therefore \({\mathsf {T}}(P_{})= \mathbf{while }\ b\ \mathbf{do }\ {\mathsf {T}}(c_1)\). Moreover, we have that \(\sigma _{1}(b)=\sigma _{1}'(b)\) for all states \(\sigma _{1}\) and \(\sigma _{1}'\) resulting from the reduction of \(c_1\) and \({\mathsf {T}}(c_1)\), respectively, since \(b\) cannot depend on the result of any UF symbol. Therefore we are left to prove that \(\langle c_1\ ;\ \ldots \ ;\ c_1,\ \sigma _{0} \rangle \rightarrow ^* \sigma _{}\) iff \(\langle {\mathsf {T}}(c_1)\ ;\ \ldots \ ;\ {\mathsf {T}}(c_1),\ {\mathsf {S}}^{P}(\sigma _{0}) \rangle \rightarrow ^* \sigma _{}'\), which is covered in the following case.

For \(P_{} = c_1\ ;\ c_2\), assume that \(\langle c_1,\ \sigma _{0} \rangle \rightarrow ^* \sigma _{1}\) and \(\langle {\mathsf {T}}(c_1),\ {\mathsf {S}}^{P}(\sigma _{0}) \rangle \rightarrow ^* \sigma _{1}'\). If \({\mathsf {S}}^{P}(\sigma _{1})=\sigma _{1}'\), then the theorem is trivially correct. Otherwise, and without loss of generality, consider that \({\mathsf {S}}^{P}(\sigma _{1})\) and \(\sigma _{1}'\) differ only in the value of variable \(v\) because \(c_1\) contained an assignment of the form \(v :=f(x)\). Let \(c_2'\) be a copy of \(c_2\) where references to \(v\) were replaced with \(f(x)\). Therefore, our proof goal of \(\langle c_2,\ \sigma _{1} \rangle \rightarrow ^* \sigma _{2} \iff \langle {\mathsf {T}}(c_2),\ \sigma _{1}' \rangle \rightarrow ^* \sigma _{2}'\) is equivalent to \(\langle c_2',\ \sigma _{1} \rangle \rightarrow ^* \sigma _{2}'' \iff \langle {\mathsf {T}}(c_2'),\ {\mathsf {S}}^{c_2'}(\sigma _{1}) \rangle \rightarrow ^* \sigma _{2}'''\), which holds per the induction hypothesis. \(\square \)

We speculate, but leave the proof for future work, that restriction 4 in Sect. 4.1 could be lifted altogether, i.e., it may be possible to allow UFs in loop guards. We believe this could be done by counting UF symbols in loop guards twice when computing \({\mathsf {u}}(f)\) for any symbol \(f\). Intuitively, we may only need to interpolate the values of a UF symbol when the loop guard flips (i.e., when in one iteration it was true and in the following it became false).

Appendix B: Discussion on polynomial interpolation

The polynomial for \({\mathsf {p}}\left( f, e_1,\ldots ,e_n\right) \) given in Sect. 4.2.2 requires coefficients to range over the set of rational \(({\mathbb {Q}})\), real \(({\mathbb {R}})\), or complex numbers \(({\mathbb {C}})\). Therefore, for the domain of integers \(({\mathbb {Z}})\), it is unsound to use the given polynomial, since in general there may not exist integer values for variables \(f_i\) such that Theorem 1 holds.

Integer-valued polynomials are polynomials with coefficients in some domain, whose value for every point (or for every interpolating point) is an integer [16]. In particular, it is possible to interpolate a set of points using integer-valued polynomials with rational coefficients [17, 24]. However, these polynomials can only be used if the verification tool used in the algorithm supports rational numbers and their combined operation with integer variables from the remainder of the program.
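A small worked instance may help to see why integer coefficients do not suffice while rational ones can. The sketch below (hypothetical helper names, standard Lagrange interpolation rather than any method from the paper) interpolates the integer points \((0,0)\), \((1,0)\), \((2,1)\) over \({\mathbb {Q}}\): the resulting coefficients are not integers, yet the polynomial happens to take integer values at every integer checked.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def lagrange_coeffs(points):
    """Coefficients of the unique polynomial of degree < n through n points."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i, (xi, yi) in enumerate(points):
        basis = [Fraction(1)]
        denom = Fraction(1)
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = poly_mul(basis, [Fraction(-xj), Fraction(1)])  # (x - xj)
                denom *= xi - xj
        for k, c in enumerate(basis):
            coeffs[k] += yi * c / denom
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using exact rational arithmetic."""
    return sum(c * Fraction(x) ** k for k, c in enumerate(coeffs))


coeffs = lagrange_coeffs([(0, 0), (1, 0), (2, 1)])
print(coeffs)  # the interpolant is x*(x-1)/2, whose coefficients are not integers
print([evaluate(coeffs, x) for x in range(-3, 4)])
```

This particular interpolant is integer-valued on all integers; that is a property of the chosen points, not something Lagrange interpolation guarantees in general.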

There is still ongoing research on interpolation by integer-valued polynomials that may yield interesting results that could be of use for our algorithm. We leave as a conjecture that the following polynomial can interpolate any set of \(n+1\) integer points:

$$\begin{aligned} f(x) = \sum _{i=0}^n \left\lfloor \dfrac{a_i\,x^i}{b_i} \right\rfloor \end{aligned}$$

where \(a_i\) and \(b_i\) are integer coefficients, and \(\left\lfloor \dfrac{x}{y} \right\rfloor \) denotes integer division. A drawback of this polynomial is that solving recurrences with integer division is harder than with, say, division in \({\mathbb {Q}}\), because the function may become discontinuous. Moreover, it is unclear whether it would be possible to amend Theorem 1 for such a polynomial.
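To make the conjectured form concrete, here is a minimal evaluator with one hand-picked instance. It assumes floor semantics for the division, as the \(\lfloor \cdot \rfloor \) notation suggests (Python's `//` rounds toward negative infinity, which differs from truncating division on negative operands). The function name and the coefficient choice are illustrative only, and a single example says nothing about the conjecture itself.

```python
def floor_poly(a, b, x):
    """Evaluate sum_i floor(a_i * x^i / b_i) for integer coefficient lists a and b."""
    return sum((ai * x**i) // bi for i, (ai, bi) in enumerate(zip(a, b)))


# Hand-picked coefficients: f(x) = floor(x^2 / 4) passes through (0,0), (1,0), (2,1),
# the same points that need non-integer coefficients in an ordinary polynomial.
a = [0, 0, 1]
b = [1, 1, 4]
print([(x, floor_poly(a, b, x)) for x in (0, 1, 2)])
```

Outside the three interpolated points this function differs from the rational interpolant \(x(x-1)/2\), which is expected: the two forms only agree where they are constrained.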

About this article

Cite this article

Lopes, N.P., Monteiro, J. Automatic equivalence checking of programs with uninterpreted functions and integer arithmetic. Int J Softw Tools Technol Transfer 18, 359–374 (2016). https://doi.org/10.1007/s10009-015-0366-1

  • DOI: https://doi.org/10.1007/s10009-015-0366-1
