
Toward a theory of program repair

  • Original Article
Acta Informatica

Abstract

To repair a program does not mean to make it (absolutely) correct; it only means to make it more-correct than it was originally. This is not a mundane academic distinction: given that programs typically have about a dozen faults per KLOC, it is important for program repair methods and tools to be designed in such a way that they map an incorrect program into a more-correct, albeit still potentially incorrect, program. Yet in the absence of a concept of relative correctness, many program repair methods and tools resort to approximations of absolute correctness. Because these methods and tools are often validated against programs with a single fault, making such a program absolutely correct is indistinguishable from making it more-correct; this has helped obscure the absence of (and the need for) relative correctness. In this paper, we propose a theory of program repair based on a concept of relative correctness. We encourage researchers in program repair to specify explicitly what concept of relative correctness their method or tool is based upon, and to validate their method or tool by proving that it does enhance relative correctness, as defined.



Acknowledgements

The authors are very grateful to the anonymous reviewers for their valuable feedback, which has greatly enhanced the presentation and content of our paper. This work is partially supported by the NSF under grant number DGE1565478.

Author information

Corresponding author

Correspondence to Ali Mili.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendices

Proof of Proposition 8

Proof

Proof of Sufficiency. Program P satisfies oracle \(\varOmega (s,s')\) for test suite T if and only if:

$$\begin{aligned} \forall s\in T: \varOmega (s,P(s)). \end{aligned}$$

By the definition of \(\varOmega (s,s')\), we can rewrite this as:

   \(\forall s\in T: s\in { dom}(R)\Rightarrow (s,P(s))\in R\).

Distributing the clause \((s\in T)\), we write:

   \(\forall s: s\in T\wedge s\in { dom}(R)\Rightarrow s\in T\wedge (s,P(s))\in R\).

By set theory, we write the left hand side as:

   \(\forall s: s\in (T\cap { dom}(R))\Rightarrow s\in T\wedge (s,P(s))\in R\).

According to the definitions given in Sect. 3.1, we can write \(T\cap { dom}(R)= { dom}(_{T\backslash }R)\), hence:

   \(\forall s: s\in ({ dom}(_{T\backslash }R))\Rightarrow s\in T\wedge (s,P(s))\in R\).

Since \((s,P(s))\) is also, by definition, an element of P, this can be written as:

   \(\forall s: s\in { dom}(_{T\backslash }R)\Rightarrow s\in T\wedge (s,P(s))\in (R\cap P)\).

If we now view T as a vector rather than a set, we can rewrite this as:

   \(\forall s: s\in { dom}(_{T\backslash }R)\Rightarrow (s,P(s))\in T\wedge (s,P(s))\in (R\cap P)\).

Taking the intersection, and using associativity, we find:

   \(\forall s: s\in { dom}(_{T\backslash }R)\Rightarrow (s,P(s))\in ((T\cap R)\cap P)\).

Rewriting \((T\cap R)\) as the pre-restriction of R to T, we find:

   \(\forall s: s\in { dom}(_{T\backslash }R)\Rightarrow (s,P(s))\in ((_{T\backslash } R)\cap P)\).

From the right hand side, we infer that s is in the domain of \((_{T\backslash }R\cap P)\):

   \(\forall s: s\in { dom}(_{T\backslash }R)\Rightarrow s\in { dom}((_{T\backslash } R)\cap P)\).

Since this is true for all s, we write:

   \({ dom}(_{T\backslash }R)\subseteq { dom}((_{T\backslash } R)\cap P)\),

which we rewrite as:

   \(_{T\backslash }RL\subseteq ((_{T\backslash } R)\cap P)L\).

Given that the inverse inclusion is a tautology, we find:

   \(_{T\backslash }RL= ((_{T\backslash } R)\cap P)L\).

Hence, by proposition 2, P is absolutely correct with respect to \(_{T\backslash }R\).

Proof of Necessity. If P is correct with respect to \(_{T\backslash }R\), then by proposition 2, \(_{T\backslash }RL\subseteq (_{T\backslash }R\cap P)L\). Interpreting this formula in logical terms, we find:

   \(\forall s: s\in { dom}(_{T\backslash }R)\Rightarrow s\in { dom}(_{T\backslash }R \cap P)\).

If this formula holds for all s in S, it holds a fortiori for all s in T:

   \(\forall s\in T: s\in { dom}(_{T\backslash }R)\Rightarrow s\in { dom}(_{T\backslash }R \cap P)\).

Because \(_{T\backslash }R\) can be written as \(T\cap R\), where T is reinterpreted as a vector, and because \({ dom}(_{T\backslash }R)=T\cap { dom}(R)\), we can write:

   \(\forall s\in T: s\in T\cap { dom}(R)\Rightarrow s\in T\cap { dom}(R\cap P)\).

Isolating the clause \((s\in T)\), we get:

   \(\forall s\in T: s\in T\wedge s\in { dom}(R)\Rightarrow s\in T\wedge s\in { dom}(R\cap P)\).

Removing the clause \((s\in T)\), which is now redundant, we find:

   \(\forall s\in T: s\in { dom}(R)\Rightarrow s\in { dom}(R\cap P)\).

Since P is deterministic,

   \(\forall s\in T: s\in { dom}(R)\Rightarrow (s,P(s))\in (R\cap P)\).

By the definition of the oracle of absolute correctness:

   \(\forall s\in T: \varOmega (s,P(s)).\) \(\square \)
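The equivalence established in this proof can be checked by brute force on a small finite space. The following is a minimal sketch, assuming relations are encoded as Python sets of pairs and deterministic total programs as dictionaries; the helper names (`dom`, `pre_restrict`, `oracle_abs`, `passes_test`, `is_correct`) and the toy data `S`, `R`, `T` are illustrative choices, not notation from the paper.

```python
from itertools import product

def dom(rel):
    """Domain of a relation given as a set of (s, s') pairs."""
    return {s for (s, _) in rel}

def pre_restrict(T, R):
    """Pre-restriction of R to T: keep only pairs whose input lies in T."""
    return {(s, t) for (s, t) in R if s in T}

def oracle_abs(R, s, t):
    """Omega(s, s'): s in dom(R) implies (s, s') in R."""
    return (s not in dom(R)) or ((s, t) in R)

def passes_test(P, R, T):
    """P satisfies Omega on every element of the test suite T."""
    return all(oracle_abs(R, s, P[s]) for s in T)

def is_correct(P, R):
    """Absolute correctness: dom(R) is included in dom(R intersect P)."""
    graph = set(P.items())
    return dom(R) <= dom(R & graph)

# Exhaustive check over all deterministic total programs on a toy space.
S = [0, 1, 2]
R = {(0, 1), (0, 2), (1, 1)}   # a nondeterministic specification (assumed)
T = {0, 2}                     # a test suite (assumed)
for outputs in product(S, repeat=len(S)):
    P = dict(zip(S, outputs))
    assert passes_test(P, R, T) == is_correct(P, pre_restrict(T, R))
```

The loop raises no assertion error, which is consistent with the proposition on this one instance; it is a sanity check, not a substitute for the proof.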

Proof of Proposition 9

Proof

Proof of Sufficiency. If the execution of \(P'\) for every element of T satisfies the oracle \(\omega (s,s')\) then:

   \(\forall s\in T: \varOmega (s,P(s))\Rightarrow \varOmega (s,P'(s))\).

Replacing \(\varOmega (,)\) by its definition, we find:

   \(\forall s\in T: (s\in { dom}(R)\Rightarrow (s,P(s))\in R) \Rightarrow (s\in { dom}(R)\Rightarrow (s,P'(s))\in R)\).

The body of this quantified formula has the form: \((a\Rightarrow b) \Rightarrow (a\Rightarrow c)\). If we simplify this Boolean expression, we find that it can be written as: \((a\wedge b)\Rightarrow c\). Given that in our case b (which is \((s,P(s))\in R\)) logically implies a (which is \(s\in { dom}(R)\)), this can further be simplified to: \((b\Rightarrow c)\). Hence, we write:

   \(\forall s\in T: ((s,P(s))\in R)\Rightarrow ((s,P'(s))\in R)\).
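The Boolean simplification used in this step can be confirmed with a truth table. The sketch below enumerates all assignments of a, b, c; the function names are hypothetical helpers introduced only for this check.

```python
from itertools import product

def implies(p, q):
    """Material implication p => q."""
    return (not p) or q

def first_step_holds():
    """(a => b) => (a => c) agrees with (a and b) => c on every row."""
    return all(
        implies(implies(a, b), implies(a, c)) == implies(a and b, c)
        for a, b, c in product([False, True], repeat=3)
    )

def second_step_holds():
    """On rows where b implies a, (a and b) => c agrees with b => c."""
    return all(
        implies(a and b, c) == implies(b, c)
        for a, b, c in product([False, True], repeat=3)
        if implies(b, a)
    )

assert first_step_holds()
assert second_step_holds()
```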

Because P and \(P'\) are deterministic, this can be written as:

   \(\forall s\in T: (\exists s': s'=P(s)\wedge (s,s')\in R) \Rightarrow (\exists s': s'=P'(s)\wedge (s,s')\in R)\).

By rewriting \(s'=P(s)\) in relational form as \((s,s')\in P\) (and likewise for \(P'\)) and taking the intersection, we find:

   \(\forall s\in T: (\exists s': (s,s')\in R\cap P) \Rightarrow (\exists s': (s,s')\in R\cap P')\).

By the definition of domain, we write:

   \(\forall s\in T: s\in { dom}(R\cap P) \Rightarrow s\in { dom}(R\cap P')\).

Factoring the term \((s\in T)\) into the formula, we find:

   \(\forall s\in S: s\in T\wedge s\in { dom}(R\cap P) \Rightarrow s\in T\wedge s\in { dom}(R\cap P')\).

Using the same argument as the proof of the previous proposition, we find:

   \(\forall s\in S: s\in { dom}(_{T\backslash }R\cap P) \Rightarrow s\in { dom}(_{T\backslash }R\cap P')\).

From which we infer, by rewriting in relational form:

   \((_{T\backslash }R\cap P)L\subseteq (_{T\backslash }R\cap P')L\).

In other words, \(P'\) is more-correct than P with respect to \(_{T\backslash }R\).

Proof of Necessity. If \(P'\) is more-correct than P with respect to \(_{T\backslash }R\) then \((_{T\backslash }R\cap P)L\subseteq (_{T\backslash }R\cap P')L\), which we represent by the following logic formula:

   \(\forall s\in S: s\in { dom}(_{T\backslash }R\cap P) \Rightarrow s\in { dom}(_{T\backslash }R\cap P')\).

If this formula holds for all s in S, it holds necessarily for all s in T.

   \(\forall s\in T: s\in { dom}(_{T\backslash }R\cap P)\Rightarrow s\in { dom}(_{T\backslash }R\cap P')\).

By factoring out the pre-restriction from the domain, we get:

   \(\forall s\in T: s\in T\wedge s\in { dom}(R\cap P)\Rightarrow s\in T\wedge s\in { dom}(R\cap P')\).

We remove the clause \((s\in T)\), which is now redundant:

   \(\forall s\in T: s\in { dom}(R\cap P)\Rightarrow s\in { dom}(R\cap P')\).

Because P and \(P'\) are deterministic, this formula can be written as:

   \(\forall s\in T: (s,P(s))\in (R\cap P)\Rightarrow (s,P'(s))\in (R\cap P')\).

Using the Boolean manipulations we showed in the previous proof, we find this to be equivalent to:

   \(\forall s\in T: (s\in { dom}(R)\Rightarrow (s,P(s))\in R) \Rightarrow (s\in { dom}(R)\Rightarrow (s,P'(s))\in R)\).

Using the formula of the oracle of absolute correctness with respect to R, we find:

   \(\forall s\in T: \varOmega (s,P(s))\Rightarrow \varOmega (s,P'(s))\). \(\square \)
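As with the previous proposition, the equivalence can be exercised exhaustively on a toy space. This is a minimal sketch under the same assumed encoding (relations as sets of pairs, programs as dictionaries); `oracle_rel`, `agreement_domain` and `more_correct` are illustrative names, and `more_correct` encodes the inclusion \((RL\)-style\()\) of domains stated in the proof.

```python
from itertools import product

def dom(rel):
    return {s for (s, _) in rel}

def pre_restrict(T, R):
    """Pre-restriction of R to T."""
    return {(s, t) for (s, t) in R if s in T}

def oracle_abs(R, s, t):
    """Omega(s, s'): s in dom(R) implies (s, s') in R."""
    return (s not in dom(R)) or ((s, t) in R)

def oracle_rel(R, P, s, t):
    """omega(s, s'): Omega(s, P(s)) implies Omega(s, s')."""
    return (not oracle_abs(R, s, P[s])) or oracle_abs(R, s, t)

def agreement_domain(P, R):
    """dom(R intersect P): inputs on which P behaves as R requires."""
    return dom(R & set(P.items()))

def more_correct(P_new, P, R):
    """P_new is more-correct than P with respect to R."""
    return agreement_domain(P, R) <= agreement_domain(P_new, R)

S = [0, 1, 2]
R = {(0, 1), (0, 2), (1, 1)}   # assumed toy specification
T = {0, 2}                     # assumed toy test suite
programs = [dict(zip(S, out)) for out in product(S, repeat=len(S))]
for P, P_new in product(programs, repeat=2):
    passes = all(oracle_rel(R, P, s, P_new[s]) for s in T)
    assert passes == more_correct(P_new, P, pre_restrict(T, R))
```

Note that the check is against the pre-restriction of R to T, not against R itself: a candidate can pass the oracle on T while regressing on inputs outside T.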

Proof of Proposition 10

Proof

Proof of Sufficiency. Let program \(P'\) satisfy the oracle of strict relative correctness; then according to the definition of this oracle, it satisfies the condition \((\forall s\in T: \omega (s,P'(s)))\). By proposition 9, \(P'\) is more-correct than P with respect to \(_{T\backslash }R\), i.e., \((_{T\backslash }R\cap P)L\subseteq (_{T\backslash }R\cap P')L\). To prove strict relative correctness, we must prove that there exists an element s in the domain of \((_{T\backslash }R\cap P')\) that is not in the domain of \((_{T\backslash }R\cap P)\). To this effect, we consider the second clause of the oracle:

\((\exists s\in T:\lnot \varOmega (s,P(s))\wedge \varOmega (s,P'(s))).\)

By the definition of \(\varOmega (,)\), we find:

\((\exists s\in T: s\in { dom}(R)\wedge (s,P(s))\not \in R\wedge (s\in { dom}(R)\Rightarrow (s,P'(s))\in R)).\)

Using Boolean identities, we simplify this to:

\((\exists s\in T: s\in { dom}(R)\wedge (s,P(s))\not \in R\wedge (s,P'(s))\in R).\)

Since \((s,P'(s))\in R\) logically implies \(s\in { dom}(R)\), we write:

\((\exists s\in T:(s,P(s))\not \in R\wedge (s,P'(s))\in R).\)

From \(s\in T\wedge (s,P'(s))\in R\) we easily infer \(s\in { dom}(_{T\backslash }R\cap P')\), following the same argument that we used in the proofs of propositions 8 and 9. From \((s,P(s))\not \in R\) we infer \(s\not \in { dom}(R\cap P)\), whence we infer \(s\not \in { dom}(_{T\backslash }R\cap P)\), since \(_{T\backslash }R\subseteq R\).

Proof of Necessity. If \(P'\) is strictly more-correct than P with respect to \(_{T\backslash }R\), then it is more-correct, hence by proposition 9, \((\forall s\in T: \omega (s,P'(s)))\). On the other hand, we know that there exists an element s of \({ dom}(_{T\backslash }R\cap P')\) that is not in \({ dom}(_{T\backslash }R\cap P)\). Using the same arguments cited in the proof of proposition 9, we infer: \((s,P(s))\not \in {}_{T\backslash }R\wedge (s,P'(s))\in {}_{T\backslash }R\). From the second clause, we infer that s is in T, which we use to rewrite the formula as: \(\exists s\in T: (s,P(s))\not \in R\wedge (s,P'(s))\in R\). Using the Boolean transformation alluded to above, we find this to be equivalent to: \((\exists s\in T: \lnot \varOmega (s,P(s))\wedge \varOmega (s,P'(s))).\) \(\square \)
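The two clauses of the oracle of strict relative correctness used in this proof (no regression on T, and at least one test on which the candidate succeeds where P fails) can be compared against strict inclusion of domains on a toy space. The sketch below is hedged in the same way as before: the encoding, the helper names and the data are assumptions made for illustration.

```python
from itertools import product

def dom(rel):
    return {s for (s, _) in rel}

def pre_restrict(T, R):
    return {(s, t) for (s, t) in R if s in T}

def oracle_abs(R, s, t):
    """Omega(s, s'): s in dom(R) implies (s, s') in R."""
    return (s not in dom(R)) or ((s, t) in R)

def strict_oracle(R, P, P_new, T):
    """Both clauses of the oracle of strict relative correctness on T."""
    no_regression = all(
        (not oracle_abs(R, s, P[s])) or oracle_abs(R, s, P_new[s]) for s in T
    )
    some_gain = any(
        (not oracle_abs(R, s, P[s])) and oracle_abs(R, s, P_new[s]) for s in T
    )
    return no_regression and some_gain

def strictly_more_correct(P_new, P, R):
    """dom(R intersect P) is a proper subset of dom(R intersect P_new)."""
    return dom(R & set(P.items())) < dom(R & set(P_new.items()))

S = [0, 1, 2]
R = {(0, 1), (0, 2), (1, 1)}   # assumed toy specification
T = {0, 1}                     # assumed toy test suite
programs = [dict(zip(S, out)) for out in product(S, repeat=len(S))]
for P, P_new in product(programs, repeat=2):
    expected = strictly_more_correct(P_new, P, pre_restrict(T, R))
    assert strict_oracle(R, P, P_new, T) == expected
```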

Proof of Proposition 11

We propose to prove that the following Hoare formula is valid in Hoare’s deductive logic:

v: \(\{(\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'} P)\}\)

figure j

\(\{Pp\sqsupset _{R'} P\}\).

Applying the sequence rule to v, with the following intermediate predicate int:

$$\begin{aligned}{} & {} (\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'} P)\\{} & {} \wedge m=1\wedge \lnot inc\wedge Pp=P \end{aligned}$$

yields the following lemmas:

\(v_0\): \(\{(\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'} P)\}\)

figure k

\(\{(\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'} P) \wedge m=1\wedge \lnot inc\wedge Pp=P\}\).

\(v_1\): \(\{(\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'} P) \wedge m=1\wedge \lnot inc\wedge Pp=P\}\)

figure l

\(\{Pp\sqsupset _{R'} P\}\).

If we apply the (concurrent) assignment rule to \(v_0\), we get:

\(\begin{array}{l}{v_{00}}:(\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'} P)\\ \Rightarrow \\ (\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'} P) \wedge 1=1\wedge \textbf{true}~\wedge P=P\end{array}\).

This formula is clearly a tautology, hence we turn our attention to \(v_1\), to which we apply the while rule, with the following loop invariant inv:

$$\begin{aligned}{} & {} inb(m)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\\{} & {} \vee (\lnot inc\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P))), \end{aligned}$$

where inb(m) (which stands for: in bounds) is shorthand for \(1\le m\le M\). Applying the while rule to \(v_1\) with this loop invariant yields three lemmas:

\(\begin{array}{l} {v_{10}}:(\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'} P) \wedge m=1\wedge \lnot inc\wedge Pp=P\\ \Rightarrow \\ inb(m)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\vee (\lnot inc\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)))\end{array}\).

\(v_{11}\): \(\{(\lnot inc\wedge m\le M) \wedge inb(m)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\vee (\lnot inc\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)))\}\)

figure m

\(\{inb(m)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\vee (\lnot inc\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)))\}.\)

\(\begin{array}{l}{v_{12}}:\lnot (\lnot inc\wedge m\le M)\wedge inb(m)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\\ \vee (\lnot inc\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)))\\ \Rightarrow \\ Pp\sqsupset _{R'} P\end{array}\).

To check the validity of \(v_{10}\), we rewrite it by distributing inb(m) over the disjunction and replacing m by 1 on the right hand side:

\(\begin{array}{l}{v_{10}}:(\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'} P) \wedge m=1\wedge \lnot inc\wedge Pp=P\\ \Rightarrow \\ (inb(m)\wedge inc\wedge Pp\sqsupset _{R'} P)\vee (inb(m)\wedge \lnot inc\wedge (\exists h: 1\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P))\end{array}\).

Now it is clear that \(v_{10}\) is a tautology, since the left hand side logically implies the second disjunct of the right hand side, assuming, as we do, that \(M\ge 1\). As for \(v_{12}\), its left hand side can be simplified into \((inc\wedge Pp\sqsupset _{R'}P)\), due to the contradiction between \(m>M\) and inb(m), and the contradiction between inc and \(\lnot inc\). Hence, \(v_{12}\) is also a tautology. We turn our attention to \(v_{11}\), which we first simplify as follows:

\(v_{11}\): \(\{\lnot inc\wedge inb(m)\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)\}\)

figure n

\(\{inb(m)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\vee (\lnot inc\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)))\}.\)

We apply the sequence rule to \(v_{11}\), with the following intermediate predicate \(int'\):

$$\begin{aligned}{} & {} (Pp\sqsupset _{R'}P\vee PS(m)=\epsilon )\wedge \\{} & {} \lnot inc\wedge inb(m)\wedge \\{} & {} (Pp\sqsupset _{R'}P\vee (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)). \end{aligned}$$

This yields the following two lemmas:

\(v_{110}\): \(\{\lnot inc\wedge inb(m)\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)\}\)

figure o

\(\{ (Pp\sqsupset _{R'}P\vee PS(m)=\epsilon )\wedge \lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P))\}.\)

\(v_{111}\): \(\{ (Pp\sqsupset _{R'}P\vee PS(m)=\epsilon )\wedge \lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P))\}\)

figure p

\(\{inb(m)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\vee (\lnot inc\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)))\}.\)

We apply the while rule to \(v_{110}\), with the following loop invariant, \(inv'\):

$$\begin{aligned} \lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h:m\le h\le M:\exists Q\in PS(h):Q\sqsupset _{R'}P)). \end{aligned}$$

This yields the following three lemmas:

\(\begin{array}{l}{v_{1100}}:\lnot inc\wedge inb(m)\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)\\ \Rightarrow \\ \lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h:m\le h\le M:\exists Q\in PS(h):Q\sqsupset _{R'}P)).\end{array}\)

\(\begin{array}{l}{v_{1101}}:\{\lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h:m\le h\le M:\exists Q\in PS(h):Q\sqsupset _{R'}P))\\ \wedge \lnot (Pp\sqsupset _{R'}P)\wedge PS(m)\ne \epsilon \}\end{array}\)

figure q

\(\{ \lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h:m\le h\le M:\exists Q\in PS(h):Q\sqsupset _{R'}P))\}\)

\(\begin{array}{l}{v_{1102}}:\lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h:m\le h\le M:\exists Q\in PS(h):Q\sqsupset _{R'}P))\\ \wedge (Pp\sqsupset _{R'}P\vee PS(m)=\epsilon )\\ \Rightarrow \\ (Pp\sqsupset _{R'}P\vee PS(m)=\epsilon )\wedge \lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)). \end{array}\)

To see that \(v_{1100}\) is a tautology, it suffices to distribute the \(\wedge \) over the \(\vee \) on the right hand side of the implication, and to notice that the second disjunct on the right hand side is a copy of the left hand side of the implication. As for \(v_{1102}\), it is clearly a tautology, since the right hand side of \(\Rightarrow \) is merely a copy of the left hand side. We turn our attention to \(v_{1101}\) now, and we begin by simplifying its precondition by virtue of Boolean identities:

\(v_{1101}\): \(\{\lnot inc\wedge inb(m)\wedge (\exists h:m\le h\le M:\exists Q\in PS(h):Q\sqsupset _{R'}P) \wedge \lnot (Pp\sqsupset _{R'}P)\wedge PS(m)\ne \epsilon \}\)

figure r

\(\{ \lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h:m\le h\le M:\exists Q\in PS(h):Q\sqsupset _{R'}P))\}\)

We consider \(v_{1101}\), to which we must apply the assignment statement rule; to this effect, we must analyze the semantics of function NextPatch(P,m). We assume that this function performs the following operations:

figure s

Hence, application of the assignment rule yields the following formula:

\(\begin{array}{l}v_{11010}:\lnot inc\wedge inb(m)\\ \wedge (Pp\sqsupset _{R'}P\vee (\exists h:m\le h\le M:\exists Q\in PS(h):Q\sqsupset _{R'}P))\\ \wedge \lnot (Pp\sqsupset _{R'}P)\wedge PS(m)\ne \epsilon \\ \Rightarrow \\ \lnot inc\wedge inb(m)\wedge (head(PS(m))\sqsupset _{R'}P\vee \\ (\exists Q\in tail(PS(m)):Q\sqsupset _{R'}P)\vee (\exists h:m+1\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P))\end{array}\).

We consider the first two disjuncts in the parenthesized expression:

\((head(PS(m))\sqsupset _{R'}P)\vee (\exists Q\in tail(PS(m)):Q\sqsupset _{R'}P)\) and we merge them into a single expression:

\( (\exists Q\in PS(m):Q\sqsupset _{R'}P)\).

Now we merge this expression with the third disjunct above: \( (\exists Q\in PS(m):Q\sqsupset _{R'}P)\vee (\exists h:m+1\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P) \), to obtain:

\( (\exists h:m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P) \).
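The two merging steps can be sanity-checked by brute force on small finite instances. The sketch below is only an illustration under stated assumptions: patches are encoded as the integers 0 and 1, the hypothetical predicate `better(Q)` stands for \(Q\sqsupset _{R'}P\), and \(PS(m)\) is required to be non-empty, as the precondition of \(v_{1101}\) guarantees.

```python
from itertools import product


def exists_from(ps, lo, hi, better):
    """(exists h: lo <= h <= hi: exists Q in PS(h): Q better than P)."""
    return any(better(q) for h in range(lo, hi + 1) for q in ps[h])


def three_disjuncts(ps, m, big_m, better):
    """head(PS(m)) better, or some Q in tail(PS(m)) better, or some Q in PS(m+1..M) better."""
    head, tail = ps[m][0], ps[m][1:]
    return (better(head)
            or any(better(q) for q in tail)
            or exists_from(ps, m + 1, big_m, better))


def check_merge(big_m=2, max_len=2):
    """Compare both forms on every instance with PS(1) non-empty; return how many were checked."""
    better = lambda q: q == 1  # toy encoding: patch 1 improves on P, patch 0 does not
    seqs = [s for n in range(max_len + 1) for s in product((0, 1), repeat=n)]
    checked = 0
    for choice in product(seqs, repeat=big_m):
        ps = {h + 1: list(choice[h]) for h in range(big_m)}
        if not ps[1]:
            continue  # PS(m) = epsilon is excluded by the precondition
        assert three_disjuncts(ps, 1, big_m, better) == exists_from(ps, 1, big_m, better)
        checked += 1
    return checked
```

Calling `check_merge()` raises no assertion error on any of the enumerated instances, which agrees with the merged expression obtained above; it is a finite check, not a substitute for the argument.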

Replacing these in \(v_{11010}\), we find that the right hand side is a logical consequence of the left hand side, hence \(v_{11010}\) is a tautology. We now consider \(v_{111}\), to which we apply the if-then-else rule, which yields two lemmas:

\(v_{1110}\): \(\{(Pp\sqsupset _{R'}P)\wedge (Pp\sqsupset _{R'}P\vee PS(m)=\epsilon )\wedge \lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P))\}\)

figure t

\(\{inb(m)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\vee (\lnot inc\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)))\}.\)

\(v_{1111}\): \(\{\lnot (Pp\sqsupset _{R'}P)\wedge (Pp\sqsupset _{R'}P\vee PS(m)=\epsilon )\wedge \lnot inc\wedge inb(m)\wedge (Pp\sqsupset _{R'}P\vee (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P))\}\)

figure u

\(\{inb(m)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\vee (\lnot inc\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)))\}.\)

We simplify \(v_{1110}\) and apply the assignment rule to it, yielding:

\(\begin{array}{l}{v_{11100}}:(Pp\sqsupset _{R'}P)\wedge \lnot inc\wedge inb(m)\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)\\ \Rightarrow \\ inb(m)\wedge (Pp\sqsupset _{R'} P). \end{array}\)

This is clearly a tautology. We simplify \(v_{1111}\) and apply the assignment rule to it, yielding:

\(\begin{array}{l}{v_{11110}}:\lnot (Pp\sqsupset _{R'}P)\wedge PS(m)=\epsilon \wedge \lnot inc\wedge inb(m)\wedge (\exists h: m\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P)\\ \Rightarrow \\ inb(m+1)\wedge ((inc\wedge Pp\sqsupset _{R'} P)\vee (\lnot inc\wedge (\exists h: m+1\le h\le M:\exists Q\in PS(h): Q\sqsupset _{R'}P))).\end{array}\)

If we know that there exists Q strictly more-correct than P in one of the patch sequences \(PS(m), PS(m+1),... PS(M)\) but PS(m) is empty, then Q must be in one of the sequences \(PS(m+1), PS(m+2),... PS(M)\). For the same reason, m is necessarily strictly less than M, since Q is somewhere in \(PS(m+1), PS(m+2),... PS(M)\). Hence \(inb(m+1)\) holds. We conclude that \(v_{11110}\) is a tautology.

Since all the lemmas generated from v are valid, so is v. Hence, UnitIncCor() is partially correct with respect to the specification:

  • Precondition: \((\exists m: 1\le m\le M:\exists Q\in PS(m): Q\sqsupset _{R'}P).\)

  • Postcondition: \(Pp\sqsupset _{R'}P\).

Proof of Proposition 12

We must prove the validity of the following formula in Hoare logic [25]:

v: \(\{\textbf{true}~\}\)

figure v

{\(inc\Rightarrow Pp\sqsupset _{R'}P\)}.

Applying the sequence rule to v with the intermediate predicate int: \(inc\Rightarrow Pp\sqsupset _{R'}P\) yields the following formulas:

\(v_0\): \(\{\textbf{true}~\}\)

figure w

{\(inc\Rightarrow Pp\sqsupset _{R'}P\)}.

\(v_1\): {\(inc\Rightarrow Pp\sqsupset _{R'}P\)}

figure x

{\(inc\Rightarrow Pp\sqsupset _{R'}P\)}.

The (concurrent) assignment rule applied to \(v_0\) yields:

\(v_{00}\): \(\textbf{true}~\Rightarrow (\textbf{false}~\Rightarrow P\sqsupset _{R'}P)\),

which is a tautology. We apply the while rule to \(v_1\) with the loop invariant inv: \(inc\Rightarrow Pp\sqsupset _{R'}P\), which yields the following formulas:

\(v_{10}\): \((inc\Rightarrow Pp\sqsupset _{R'}P) \Rightarrow (inc\Rightarrow Pp\sqsupset _{R'}P)\)

\(v_{11}\): \(\{(inc\Rightarrow Pp\sqsupset _{R'}P) \wedge (\lnot inc\wedge m\le M)\}\)

figure y

\(\{inc\Rightarrow Pp\sqsupset _{R'}P\}\).

\(v_{12}\): \((inc\Rightarrow Pp\sqsupset _{R'}P) \wedge (inc\vee m>M) \Rightarrow (inc\Rightarrow Pp\sqsupset _{R'}P)\).

Formulas \(v_{10}\) and \(v_{12}\) are clearly tautologies; we apply the sequence rule to \(v_{11}\), with int: \(inc\Rightarrow Pp\sqsupset _{R'}P\), which yields the following formulas:

\(v_{110}\): \(\{(inc\Rightarrow Pp\sqsupset _{R'}P) \wedge (\lnot inc\wedge m\le M)\}\)

figure z

\(\{inc\Rightarrow Pp\sqsupset _{R'}P\}\)

\(v_{111}\): \(\{(inc\Rightarrow Pp\sqsupset _{R'}P)\}\)

figure aa

\(\{inc\Rightarrow Pp\sqsupset _{R'}P\}\).

We apply the while rule to \(v_{110}\) with the loop invariant inv: \(\lnot inc\), which yields the following formulas:

\(v_{1100}\): \((inc\Rightarrow Pp\sqsupset _{R'}P) \wedge (\lnot inc\wedge m\le M)\Rightarrow \lnot inc\).

\(v_{1101}\): \(\{\lnot inc \wedge (\lnot (Pp\sqsupset _{R'}P)\wedge MorePatches(P,m))\}\)

figure ab

\(\{\lnot inc\}\).

\(v_{1102}\): \(\lnot inc \wedge \lnot (\lnot (Pp\sqsupset _{R'}P)\wedge MorePatches(P,m))\Rightarrow (inc\Rightarrow Pp\sqsupset _{R'}P).\)
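Formula \(v_{1102}\) has the propositional shape \((\lnot a\wedge b)\Rightarrow (a\Rightarrow c)\), where a stands for inc, b for the negated loop condition and c for \(Pp\sqsupset _{R'}P\). A truth table settles such a shape mechanically; the sketch below is a minimal illustration, and the helper names `implies`, `form` and `expanded` are hypothetical.

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Material implication x => y."""
    return (not x) or y


def form(a: bool, b: bool, c: bool) -> bool:
    """(not a and b) => (a => c), the propositional shape of v_1102."""
    return implies((not a) and b, implies(a, c))


def expanded(a: bool, b: bool, c: bool) -> bool:
    """(a or not b) or (not a or c), the disjunctive rewriting of the same shape."""
    return (a or not b) or ((not a) or c)


# Enumerate all 2^3 valuations of (a, b, c).
all_rows = list(product((False, True), repeat=3))
is_tautology = all(form(a, b, c) for a, b, c in all_rows)
forms_agree = all(form(a, b, c) == expanded(a, b, c) for a, b, c in all_rows)
```

Both `is_tautology` and `forms_agree` evaluate to `True`: the disjunctive form contains both a and its negation, so every row of the table is satisfied.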

Formula \(v_{1100}\) is clearly a tautology; formula \(v_{1102}\) is also a tautology because it has the form \(((\lnot a\wedge b)\Rightarrow (a\Rightarrow c))\), which can be rewritten as \((a\vee \lnot b)\vee (\lnot a\vee c)\) and is therefore true, since it contains the disjunction \(a\vee \lnot a\). We now focus on \(v_{1101}\), to which we apply the assignment statement rule, which yields:

\(v_{11010}\): \((\lnot inc \wedge (\lnot (Pp\sqsupset _{R'}P)\wedge MorePatches(P,m))) \Rightarrow \lnot inc\).

This is clearly a tautology; we turn our attention to \(v_{111}\), to which we apply the if-then-else rule, which yields:

\(v_{1110}\): \(\{(inc\Rightarrow Pp\sqsupset _{R'}P) \wedge (Pp\sqsupset _{R'}P)\}\)

figure ac

\(\{inc\Rightarrow Pp\sqsupset _{R'}P\}\).

\(v_{1111}\): \(\{(inc\Rightarrow Pp\sqsupset _{R'}P) \wedge \lnot (Pp\sqsupset _{R'}P)\}\)

figure ad

\(\{inc\Rightarrow Pp\sqsupset _{R'}P\}\).

Application of the assignment statement rule to \(v_{1110}\) and \(v_{1111}\) yields, respectively:

\(v_{11100}\): \((inc\Rightarrow Pp\sqsupset _{R'}P) \wedge (Pp\sqsupset _{R'}P)\Rightarrow (Pp\sqsupset _{R'}P)\).

\(v_{11110}\): \((inc\Rightarrow Pp\sqsupset _{R'}P) \wedge \lnot (Pp\sqsupset _{R'}P)\Rightarrow (inc\Rightarrow Pp\sqsupset _{R'}P)\).

Formulas \(v_{11100}\) and \(v_{11110}\) are both tautologies. This concludes the proof that

v: {\(\textbf{true}~\)}

figure ae

{\(inc\Rightarrow Pp\sqsupset _{R'}P\)}

is valid in Hoare logic. Hence, UnitIncCor() is partially correct with respect to the specification defined by the following pre/post condition pair:

  • Precondition: \(\textbf{true}~\).

  • Postcondition: \(inc\Rightarrow Pp\sqsupset _{R'}P\).
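The program text of UnitIncCor() appears only in figures, so the following Python sketch is a reconstruction from the proof obligations above, not the authors' code. It assumes the initialization inc := false, Pp := P, m := 1, an outer loop guarded by \(\lnot inc\wedge m\le M\), an inner loop guarded by \(\lnot (Pp\sqsupset _{R'}P)\wedge MorePatches(P,m)\), and a final test that sets inc or increments m. The names `unit_inc_cor`, `ps`, `big_m` and `strictly_more_correct` are hypothetical, and MorePatches(P,m) is modeled as \(PS(m)\ne \epsilon \).

```python
from typing import Callable, Dict, List, Tuple

Program = str  # hypothetical stand-in for a program or patch


def unit_inc_cor(
    p: Program,
    ps: Dict[int, List[Program]],
    big_m: int,
    strictly_more_correct: Callable[[Program, Program], bool],
) -> Tuple[bool, Program]:
    """Search PS(1), ..., PS(M) for a patch strictly more-correct than p.

    Returns (inc, Pp). The sketch is meant to satisfy the postcondition
    inc => Pp strictly more-correct than p, on every input.
    """
    ps = {h: list(ps.get(h, [])) for h in range(1, big_m + 1)}  # work on a copy
    inc, m, pp = False, 1, p  # assumed initialization
    while not inc and m <= big_m:  # outer loop guard
        # Inner loop: draw patches from PS(m) until one improves on p or PS(m) is empty.
        while not strictly_more_correct(pp, p) and ps[m]:
            pp, ps[m] = ps[m][0], ps[m][1:]  # assumed NextPatch: head / tail of PS(m)
        if strictly_more_correct(pp, p):
            inc = True
        else:
            m += 1
    return inc, pp
```

With a toy ordering in which only "B" and "C" improve on "P", the call `unit_inc_cor("P", {1: ["A"], 2: ["A", "B", "C"]}, 2, better)` returns `(True, "B")`, and when no sequence contains an improving patch the function returns with inc false, so the implication holds vacuously. Both outcomes are consistent with the two specifications proved in this appendix.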

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Khaireddine, B., Zakharchenko, A., Martinez, M. et al. Toward a theory of program repair. Acta Informatica 60, 209–255 (2023). https://doi.org/10.1007/s00236-023-00438-4

  • DOI: https://doi.org/10.1007/s00236-023-00438-4
