A framework for assume-guarantee regression verification of evolving software

doi:10.1016/j.scico.2020.102439

Science of Computer Programming

Volume 193, 1 July 2020, 102439

https://doi.org/10.1016/j.scico.2020.102439

Highlights

• This paper presents a method for generating local weakest assumptions using a backtracking algorithm.
• The backtracking algorithm is based on the CDNF algorithm and a variant of the membership query answering technique.
• The correctness of the backtracking algorithm is presented in the paper.
• The backtracking algorithm is then integrated into a framework for effectively rechecking evolving software.
• The paper presents experimental results for some common systems in the research community.

Abstract

This paper presents a framework for verifying evolving component-based software using assume-guarantee logic. The goal is to improve the CDNF-based assumption generation method by producing local weakest assumptions that can be used more effectively when verifying component-based software in the context of software evolution. For this purpose, we improve the technique for responding to membership queries when generating candidate assumptions. This technique is then integrated into a proposed backtracking algorithm to generate local weakest assumptions. Within the proposed framework, these assumptions make rechecking the evolving software more efficient by reducing the time required for assumption regeneration. The proposed framework can be applied to verify software that is continually evolving. An implemented tool and experimental results are presented to demonstrate the effectiveness and usefulness of the framework.

Introduction

In the last three decades, component-based software engineering (CBSE) has emerged as one of the important approaches in software engineering. This approach has shown a number of advantages, such as increased effectiveness and efficiency, lower cost, shorter product time-to-market, and improved maintainability [52]. As a result, quality assurance for component-based software (CBS) plays a critical role in software production life cycles due to the increasing demand for high-quality products. Given the high quality standards of test procedures in the software industry, the verification process for CBSs must ensure that certain properties are never violated.

There are two approaches to the verification of modern software: theorem proving, which is semi-automatic, requires the interaction of domain experts [21], [20], [30], [37], [38], [51], and costs a lot of effort [5]; and model checking, which is automatic and does not require the interaction of domain experts [7], [18]. Although model checking has gained considerable attention because it is fully automatic, the approach suffers from the state space explosion problem [15], [18], [48], [16]. The assume-guarantee framework [17], [19], [24], [46], which performs modular verification of CBS, has been considered a promising solution for dealing with the state space explosion problem during model checking. The framework uses the “divide-and-conquer” strategy to verify whether a given system satisfies a predefined property. Therefore, it can potentially be applied to large-scale systems in practice. The key problem of the framework is to generate assumptions that satisfy the assume-guarantee rules [19], [29], [33]. If such an assumption exists, the given system satisfies the required property. Although the framework can be applied to large-scale systems effectively, it does not consider the system under check in the context of software evolution.
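For orientation, the assume-guarantee rule most commonly used in this line of work (for example in the learning-based approach of Cobleigh et al. [19]) can be written as follows. This is a sketch of the standard non-circular rule, not necessarily the exact rule adopted in this paper; the symbols $M_{0}$, $M_{1}$, $A$ (the assumption) and $p$ (the property) are chosen here for illustration.

```latex
% Standard non-circular assume-guarantee rule (illustrative notation):
%   premise 1: under assumption A, component M_0 satisfies property p
%   premise 2: component M_1 satisfies A in any environment
%   conclusion: the composed system satisfies p
\frac{\langle A \rangle \, M_{0} \, \langle p \rangle
      \qquad
      \langle \mathit{true} \rangle \, M_{1} \, \langle A \rangle}
     {\langle \mathit{true} \rangle \, M_{0} \parallel M_{1} \, \langle p \rangle}
```

Generating an assumption then amounts to finding an $A$ for which both premises hold.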

Modern software applications are continually evolving, and any verification has to be revisited repeatedly. A reduction in the cost of this repeated verification would offer significant benefits for industry: it would improve the quality of software by making verification techniques applicable in situations where this is currently infeasible. Progress has been made using approaches such as labeled transition systems [12], [19], [31], [33], [34], [35], implicit representation of transition systems [13], [27], and timed transition systems [3], [28], [40], [41], [42]. The following two solutions have been used to reduce the verification costs for evolving software.

The first solution is to generate, at a lower cost, a new assumption each time the software evolves. For software modeled by labeled transition systems, assumptions with small sizes (i.e., assumptions with small numbers of states) can be used effectively to recheck modified software, leading to reduced verification cost. In a series of papers, Hung et al. proposed a method to generate minimized assumptions for CBS verification [31], [34], [35] and a framework to perform modular verification of evolving CBS [33]. However, the cost of generating minimal assumptions can be high [34]. The reason is that the investigated assumption generation problem [19], [33], [34], [35], [31] is formulated as an automata learning problem using the $L^{*}$ algorithm [4]. As a result, it is difficult to apply this approach to large-scale systems. On the other hand, to speed up assumption generation, another verification method, which uses the CDNF (Conjunction of Disjunctive Normal Form) algorithm [10] and an implicit representation of software, was proposed in 2010 by Chen et al. [13]. Later, in 2016, this method was improved by He et al. and applied to CBS regression verification [26] by introducing a fine-grained learning technique. However, with modified software, some of the subpredicates of the new version of the components can be different, which requires the regression verification process to regenerate the assumptions for every small change in a software component.

The second solution to reduce the verification cost for modified software is to increase assumption reuse as much as possible. This is because the software development cycle involves daily change. Therefore, the less time required to regenerate assumptions, the greater the cost savings when verifying modified software. Moreover, from the analysis in Section 5 below, weak assumptions (i.e., assumptions with large languages) can help to achieve this purpose and play a key role in the verification of modified software. However, to our knowledge, no research has been conducted on generating assumptions that have the weakest languages and use implicit specification. As a result, this research focuses on improving the learning algorithm proposed by Chen et al. [13] to generate local weakest assumptions that can be used more efficiently to reduce the cost of software regression verification during software evolution.

To achieve the above goal, we first improve the technique to answer membership queries for the two CDNF learning instances, ι (i.e., the initial predicate) and τ (i.e., the transition relation). Based on this improved answering technique, we can generate weaker assumptions than those generated by the algorithm proposed by Chen et al. [13] (hereafter referred to as the CBAG algorithm) using a proposed backtracking learning algorithm (referred to as the LWAG algorithm). This leads to an important result in the context of software evolution: the LWAG algorithm can reduce the number of times assumptions must be regenerated when verifying modified software. The improved answering technique and the LWAG algorithm are integrated into a framework to effectively reduce the number of times assumption regeneration is required for evolving software.

Using assumption generation algorithms which employ the implicit representation, we can not only benefit from the fast learning process but we can also obtain several advantages of implicit software representation over explicit representation. First, the contextual assumptions represented implicitly using Boolean functions have fewer states than do assumptions modeled using deterministic finite automata because implicit representations are equivalent to nondeterministic finite automata, which are exponentially more succinct than deterministic ones. As a result, our generated assumptions can have an exponentially smaller number of states than do assumptions generated from explicit representations. The second advantage is the scalability of the verification method using implicit representations, which occurs because the $L^{*}$ algorithm requires a polynomial number of queries in the number of states of the target finite automaton [4], [49]. In contrast, the CDNF algorithm requires a polynomial number of queries in the number of Boolean variables of the target Boolean function [10]. Because implicit assumptions can be exponentially more succinct than explicit ones, the learning algorithms for implicit assumptions can be exponentially better than automata-theoretic ones.
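As a rough numerical illustration of the succinctness argument above (not taken from the paper), n Boolean variables can encode up to 2^n explicit states, so a query bound that is polynomial in n is exponentially smaller than the same polynomial applied to the state count. The cubic bound used below is an arbitrary placeholder chosen for illustration; it is not the actual query complexity of either learning algorithm.

```python
def explicit_states(num_vars: int) -> int:
    """Upper bound on the number of explicit states encoded by num_vars Boolean variables."""
    return 2 ** num_vars


def toy_query_bound(size: int, degree: int = 3) -> int:
    """Placeholder polynomial bound size**degree (an assumption, not a real complexity result)."""
    return size ** degree


if __name__ == "__main__":
    for n in (4, 8, 16):
        states = explicit_states(n)
        # Columns: variables, explicit states, bound in variables, bound in states.
        print(n, states, toy_query_bound(n), toy_query_bound(states))
```

Even for 16 variables, the bound over variables stays in the thousands while the bound over explicit states grows astronomically, which is the intuition behind the scalability claim.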

To our knowledge, the first paper that proposed using the $L^{*}$ algorithm to learn assumptions for the assume-guarantee reasoning algorithm was Cobleigh et al. [19]. Following this paper, several studies improved the method, including adoption of the assume-guarantee rules [6], [26], [39], [45], symbolic implementation for assume-guarantee rules [8], [9], [45], several improvements proposed in [1], [2], [12], [14], [25], [50], [53], and an extension to support liveness properties [22]. However, these papers all use the $L^{*}$ algorithm to learn an automaton as the required contextual assumption. Hence, they all have the same disadvantages as described above compared to the algorithm proposed in Chen's paper [13]. For this reason, we based our paper on Chen's algorithm [13] to verify modified software.

The remainder of this paper is organized as follows. Section 2 presents the background for this paper. We review the CBAG algorithm for generating assumptions using the CDNF algorithm in Section 3, followed by the proposed algorithms to improve the answers to membership queries and generate assumptions in Section 4. Section 5 presents a framework for verifying modified CBSs using assumptions generated by the proposed learning algorithm. Section 6 shows the preliminary experimental results. Related papers are presented in Section 7. Finally, we conclude the paper in Section 8.

Section snippets

Background

In this section, we present some basic concepts used in this paper. We use $B$ to denote the Boolean domain, which is a set that consists of exactly two elements whose interpretations are T (true) and F (false) (i.e., $B = \{T, F\}$). Given a set of Boolean variables X, we call $|X|$ the size of X, where $|X|$ is the number of variables in X.

Let X be a finite set of Boolean variables. A function $θ(X)$ over X that maps $B^{|X|}$ to the Boolean domain $B$ is called a Boolean function
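The definition above can be made concrete with a small sketch. It represents a valuation as a dictionary from variable names to truth values and a Boolean function as an ordinary Python callable; the variable names `x`, `y` and the function `theta` are hypothetical examples, not taken from the paper.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List

Valuation = Dict[str, bool]          # one point of B^{|X|}
BoolFn = Callable[[Valuation], bool]  # a Boolean function over X


def valuations(variables: List[str]) -> Iterator[Valuation]:
    """Enumerate all 2^{|X|} valuations over the given variables."""
    for bits in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, bits))


# Hypothetical example: X = {x, y} and theta(X) = x AND (NOT y).
X = ["x", "y"]


def theta(v: Valuation) -> bool:
    return v["x"] and not v["y"]


# The full truth table of theta: one entry per valuation.
truth_table = [(v, theta(v)) for v in valuations(X)]

if __name__ == "__main__":
    for v, result in truth_table:
        print(v, result)
```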

The CDNF algorithm

Let X be a fixed set of Boolean variables and $λ(X)$ be a Boolean function over X. CDNF is an incremental learning algorithm that can learn the exact representation of $λ(X)$ in a finite number of steps [10]. Sharing the same ideas as the $L^{*}$ algorithm [4], CDNF relies on a teacher (which knows $λ(X)$) when performing the learning process. The teacher must be able to answer the following two types of queries:

• Membership queries $MEM(v)$: Given a valuation v over X, if $λ[v] = T$ (true), the teacher returns yes
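The teacher's role can be sketched as follows. The second query type is cut off in this excerpt; the sketch assumes it is the usual equivalence query of exact learning, which returns a counterexample valuation when a candidate differs from the target. The class name `BruteForceTeacher` and its methods are hypothetical, and the equivalence check simply enumerates all valuations, which is only feasible for very small X.

```python
from itertools import product
from typing import Callable, Dict, List, Optional

Valuation = Dict[str, bool]
BoolFn = Callable[[Valuation], bool]


class BruteForceTeacher:
    """Toy teacher that knows the target Boolean function lambda(X)."""

    def __init__(self, variables: List[str], target: BoolFn) -> None:
        self.variables = variables
        self.target = target

    def mem(self, v: Valuation) -> bool:
        """Membership query MEM(v): yes (True) iff lambda[v] = T."""
        return self.target(v)

    def eq(self, candidate: BoolFn) -> Optional[Valuation]:
        """Assumed equivalence query: None if candidate equals the target,
        otherwise a valuation on which the two functions differ."""
        for bits in product([False, True], repeat=len(self.variables)):
            v = dict(zip(self.variables, bits))
            if candidate(v) != self.target(v):
                return v
        return None


# Hypothetical target: lambda(X) = x OR y.
teacher = BruteForceTeacher(["x", "y"], lambda v: v["x"] or v["y"])
```

With this target, `teacher.mem({"x": True, "y": False})` answers yes, and `teacher.eq(lambda v: v["x"])` returns the counterexample `{"x": False, "y": True}`.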

An improved technique for answering membership queries

As shown above in Section 3.1, in the CDNF algorithm, the generated Boolean function depends on how the teacher answers membership queries and whether yes or no (i.e., $λ[v] = T$ or $λ[v] = F$, respectively) is returned to the learner. As a result, to improve the CDNF-based assumption generation method, we first need to focus on improving the technique by which the teacher answers the learner.

After analyzing the OMQ algorithm together with Table 2, we observe that the answering technique in this algorithm

A framework for modular verification of evolving CBS

In practice, verification cost grows daily because software can evolve at any time during its life cycle. In this setting, more reusable assumptions, such as weak assumptions, play an important role in reducing verification cost when used in the framework presented in this section. The empirical results shown in Section 6 clearly indicate the effectiveness of using weak assumptions when rechecking evolved software.

Consider a CBS M that contains two components $M_{0}$ and $M_{1}$
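The intuition that weaker assumptions are more reusable can be sketched with a toy model. Here finite sets of traces stand in for the languages of assumptions and components, whereas the paper works with implicit Boolean representations. The sketch also assumes that only $M_{1}$ evolves, so an existing assumption can be reused whenever the evolved component's language is still contained in the assumption's language. All names and traces below are hypothetical.

```python
from typing import FrozenSet, Tuple

Trace = Tuple[str, ...]
Language = FrozenSet[Trace]


def can_reuse(assumption: Language, evolved_component: Language) -> bool:
    """Reuse is possible if every behavior of the evolved component
    is still allowed by the assumption (language containment)."""
    return evolved_component <= assumption


# Two assumptions that were both valid for the original system:
# the weak one accepts a strictly larger language than the strong one.
strong_assumption: Language = frozenset({("send", "ack")})
weak_assumption: Language = frozenset({("send", "ack"), ("send", "retry", "ack")})

# Hypothetical evolved component: it gains a retry behavior.
m1_evolved: Language = frozenset({("send", "ack"), ("send", "retry", "ack")})

if __name__ == "__main__":
    print("strong reusable:", can_reuse(strong_assumption, m1_evolved))
    print("weak reusable:", can_reuse(weak_assumption, m1_evolved))
```

In this toy case the strong assumption must be regenerated after the change, while the weak one is still valid and can be reused as is.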

Experiments

To evaluate the effectiveness of the LWAG algorithm, experiments are performed to highlight two key points: (i) a comparison between the CBAG algorithm and the LWAG algorithm and their corresponding generated assumptions; and (ii) a comparison of the framework in Section 5.1 between the cases using the assumptions generated by the CBAG algorithm and the LWAG algorithm after the software has been modified. Algorithms presented in Section 3 and Section 4 are implemented in C#.NET and Microsoft Visual Studio

Related works

Several existing papers on evolving software verification are relevant to our research [11], [13], [23], [27], [31], [32], [33], [34], [35], [44].

In 2010, Chen et al. proposed a purely implicit solution to the contextual assumption generation problem in assume-guarantee reasoning [13]. However, this paper did not consider the case in which the software component has been modified. Instead, when a component has been modified, the assumption-generation method must be executed again from the

Conclusion

In this paper, we presented an effective framework for rechecking evolving software using the LWAG algorithm with an improved technique for answering membership queries during the assumption learning process. Although the LWAG algorithm has a greater time complexity than the CBAG algorithm, it can generate local weakest assumptions to reduce the number of assumption regenerations when rechecking evolving software. An implemented tool and experimental results are also presented that allow comparing

Acknowledgements

This work is supported by Vietnam's National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.03-2015.25.

References (54)

R. Alur et al.
Event-clock automata: a determinizable class of timed automata
Theor. Comput. Sci.
(January 1999)
D. Angluin
Learning regular sets from queries and counterexamples
Inf. Comput.
(November 1987)
N.H. Bshouty
Exact learning Boolean functions via the monotone theory
Inf. Comput.
(1995)
T. Henzinger et al.
Temporal proof methodologies for timed transition-systems
Inf. Comput.
(August 1994)
T. Vale et al.
Twenty-eight years of component-based software engineering
J. Syst. Softw.
(January 2016)
M. Zhou et al.
Formal component-based modeling and synthesis for PLC systems
Comput. Ind.
(October 2013)
K. Abd Elkader et al.
Automated circular assume-guarantee reasoning with n-way decomposition and alphabet refinement
K. Abd Elkader et al.
Automated circular assume-guarantee reasoning
Form. Asp. Comput.
(September 2018)
M. Archer et al.
Developing user strategies in PVS: a tutorial
H. Barringer et al.
Proof rules for automated compositional verification through learning

B. Berard et al.

Systems and Software Verification: Model-Checking Techniques and Tools

(2010)

R. Bouchekir et al.

Learning-based symbolic assume-guarantee reasoning for Markov decision process by using interval Markov process

Innov. Syst. Softw. Eng.

(September 2018)

R. Bouchekir et al.

Toward implicit learning for the compositional verification of Markov decision processes

S. Chaki et al.

Verification of evolving software via component substitutability analysis

Form. Methods Syst. Des.

(June 2008)

S. Chaki et al.

Optimized L*-based assume-guarantee reasoning

Y.-F. Chen et al.

Automated assume-guarantee reasoning through implicit learning

Y.-F. Chen et al.

Learning minimal separating DFA's for compositional verification

E.M. Clarke et al.

Design and synthesis of synchronization skeletons using branching-time temporal logic

E.M. Clarke et al.

Model Checking and the State Explosion Problem

(2012)

E.M. Clarke et al.

Compositional model checking

E.M. Clarke et al.

Model Checking

(1999)

J.M. Cobleigh et al.

Learning assumptions for compositional verification

E.W. Dijkstra

Guarded commands, nondeterminacy and formal derivation of programs

Commun. ACM

(August 1975)

D.A. Duffy

Principles of Automated Theorem Proving

(1991)

A. Farzan et al.

Extending automated compositional verification to the full class of omega-regular languages

A. Groce et al.

Adaptive model checking

O. Grumberg et al.

Model checking and modular verification

ACM Trans. Program. Lang. Syst.

(May 1994)
