Symbolic partition refinement with automatic balancing of time and space☆
Introduction
Markov chains are among the most fundamental mathematical structures used for performance and dependability modeling of communication and computer systems. Since the size of a Markov chain usually grows exponentially with the size of the corresponding high-level model, one often encounters the state space explosion problem, which frequently makes the analysis of the Markov chain intractable.
In the area of formal verification, bisimulation equivalence [1] plays a prominent role as an equivalence relation on state transition graphs. It equates two states if and only if their future behavior is indistinguishable. In the context of Markov models, the same idea is known as lumpability [2]. Originally, lumpability was defined with respect to a given partition of the state space. If the lumpability condition is satisfied for this partition, an often much smaller model can be obtained by considering the quotient induced by the partition. That quotient, the lumped Markov chain, captures all relevant behavior.
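To make the lumpability condition concrete, here is a small explicit-state Python sketch (function names such as `is_lumpable` are ours, purely illustrative): a partition is ordinarily lumpable if, within each block, every state has the same cumulative rate into every other block; the quotient chain then carries these common rates between blocks.

```python
def is_lumpable(rates, partition):
    """Check ordinary lumpability: within each block, every state must have
    the same cumulative rate into each other block of the partition.
    `rates` maps (source, target) pairs to off-diagonal CTMC rates."""
    for block in partition:
        for target in partition:
            if target is block:
                continue
            # cumulative rate from each state of `block` into `target`
            # (exact float comparison is fine for this small illustration)
            sums = {sum(rates.get((s, t), 0.0) for t in target) for s in block}
            if len(sums) > 1:
                return False
    return True

def lump(rates, partition):
    """Build the rate matrix of the quotient (lumped) chain from one
    representative per block, assuming is_lumpable(rates, partition)."""
    blocks = [frozenset(b) for b in partition]
    quotient = {}
    for i, b in enumerate(blocks):
        s = next(iter(b))  # any representative state of the block
        for j, c in enumerate(blocks):
            if i == j:
                continue
            r = sum(rates.get((s, t), 0.0) for t in c)
            if r > 0:
                quotient[(i, j)] = r
    return quotient
```

For instance, for the chain with rates {(0,1): 1.0, (0,2): 1.0, (1,0): 3.0, (2,0): 3.0}, the partition [{0}, {1,2}] is lumpable, and the quotient has rates {(0,1): 2.0, (1,0): 3.0}.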
Many approaches to alleviate or circumvent the state space explosion problem for Markov chains are implicitly or explicitly based on the notion of lumpability, since it allows measures of the original Markov chain to be computed via analysis of the lumped Markov chain. A core practical challenge in this context is to devise a proper partition of the state space ensuring that the lumping conditions hold, while avoiding the actual generation of the (possibly excessively large) state space representation of the system to be lumped.
One strand of work exploits information available in the high-level model, such as symmetries or hierarchies, in order to directly generate the lumped model. Examples of such model-level approaches include [3], [4], [5], [6], [7], [8]. Another strand of work results from looking at lumpability from the bisimulation perspective [9], [10], [11]. It allows us to formulate lumpability as a fixpoint of a higher-order function on the original state space, and gives rise to an effective algorithm to compute the smallest possible partition of a Markov chain that satisfies the lumping conditions. This means that instead of using high-level model information, which is cheaper but may lead to sub-optimal lumping, the optimally lumped Markov chain can be computed by an efficient fixpoint algorithm [12], [13]. Since this can in some cases be done faster than the actual analysis of the original model, it is possible to preprocess an arbitrary Markov chain to derive the best lumped Markov chain, and then proceed with the analysis of the latter. Due to the reduction in state space size, this two-step approach often speeds up the overall analysis time [14].
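The fixpoint computation can be sketched in a few lines of explicit-state Python (a simplified signature-based refinement in the spirit of [12], [13]; the concrete function names are ours): each state's signature records its cumulative rate into every block other than its own, and blocks are split by signature until nothing changes.

```python
def refine(rates, states, initial_partition=None):
    """Explicit-state signature-based refinement: split the blocks of a
    partition until every block is stable, yielding the coarsest lumpable
    partition that refines the initial one."""
    partition = [set(b) for b in (initial_partition or [set(states)])]
    while True:
        block_of = {s: i for i, b in enumerate(partition) for s in b}
        def signature(s):
            # cumulative rate from s into every block other than its own
            sig = {}
            for (u, t), r in rates.items():
                if u == s and block_of[t] != block_of[s]:
                    sig[block_of[t]] = sig.get(block_of[t], 0.0) + r
            return frozenset(sig.items())
        refined = []
        for block in partition:
            groups = {}
            for s in block:
                groups.setdefault(signature(s), set()).add(s)
            refined.extend(groups.values())
        if len(refined) == len(partition):
            return refined  # fixpoint reached: the partition is lumpable
        partition = refined
```

Note that without an initial partition the trivial one-block partition is already a fixpoint; this is why initial partitions induced by state labels or rewards matter in practice.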
While this appears as a great step forward, the optimal lumping approach does not solve the state space explosion problem per se, since the fixpoint algorithm runs on the original, possibly excessively large state space. This is where symbolic representations come into play: Instead of representing the original state space explicitly, it is represented in a symbolic manner, using structures such as BDDs or MTBDDs [15], [16], [17], [18], [19], [20], [21], [22], [23]. These structures are equipped with well-understood heuristics such that in most cases they grow only moderately with the size of the high-level compositional model [24], [25], [26].
Notably, numerical analysis techniques do not scale if applied directly to symbolic representations. As observed in [26], [27], the MTBDD representation of the solution vector tends to grow extremely large, despite a compact MTBDD representation of the model. This is caused by the lack of regularity and the diversity of values in the solution vector as the computation progresses. Resorting to EVBDDs instead of MTBDDs can lead to a similar blow-up during model checking, not in terminal nodes but in edge-value labelings, induced by the irregularity of the values and operations occurring. However, symbolic representations play a key role if used to represent the original state space in order to then generate the best possible lumping.
These considerations motivate the recent work on symbolic algorithms for optimal lumping. Rooted in the work of Blom and Orzan [28], [29], [30], who developed a distributed algorithm for bisimulation minimization, different symbolic lumping algorithms have been developed [31], [32], [33]. In combination with compositional construction techniques, they have been applied to models of sizes otherwise far out of reach of contemporary numerical analysis engines [23], [34].
While the above symbolic algorithms are similar in spirit, they have conceptual and practically relevant differences. The work of [31] utilizes a “fast” partition representation that makes it possible to have a very efficient algorithm in terms of computation time, but even for fairly small numbers of equivalence classes it consumes a considerable amount of memory. On the other hand, the “compact” partition representation of [32] stays very small in terms of space requirements, but its drawback is that performing operations on the representation can become quite expensive timewise. The difference is caused by drastically different representation techniques for encoding the state space partitions as BDDs, which are accessed and refined by the algorithm.
To further push the limits of this technology, this paper provides an in-depth study of the earlier approaches to symbolic lumping and then devises a combination of them. We develop an algorithm which is memory efficient by using the compact partition representation of [32] and runtime efficient by using the fast partition representation and refinement algorithm of [31]. Our “hybrid” approach offers a spectrum of representations whose extremes are the fast representation on one side and the compact representation on the other. It provides us with a parameter by which we can control where in the spectrum a specific instance of the representation stands.
In principle, it is possible to implement our algorithms using both MTBDDs and EVBDDs. However, we do not expect great reductions in memory consumption from EVBDDs, because the central data structure for representing partitions of state spaces is a vector of Boolean functions, and for Boolean functions the sizes of BDDs and EVBDDs are identical up to one node [18]. Since our algorithmic ideas are easier to explain in the context of MTBDDs, and since efficient and well-tested implementations of MTBDDs are available, we use MTBDDs for the presentation and the experiments.
The contributions of the paper are (1) an algorithm that converts between fast and compact partition representations in a logarithmic number of BDD operations, (2) a simple but effective algorithm that automatically adjusts the parameter mentioned above to balance the time and space requirements such that the refinement works at maximal speed without exceeding the available memory, and (3) an integration of the conversion and parameter-selection algorithms into the principal refinement algorithm. We experimentally evaluate the benefits of our algorithm and compare its performance with the algorithms of [33] (which uses the fast representation of [31]) and [32].
The entire work is presented here in the context of continuous-time Markov chain lumping. However, there is a much broader spectrum of possible applications, since the techniques are straightforwardly adaptable to labeled transition systems, discrete-time or interactive Markov chains, Markov reward models, Markov decision processes etc. The core contribution of this paper is thus a general, fast, and memory efficient algorithm for symbolic lumping and symbolic bisimulation minimization.
Organization of the paper: Section 2 reviews the basic concepts and the context in which the present paper is placed. In Section 3, we discuss the principal algorithmic considerations behind a non-symbolic lumping algorithm. After a discussion of the earlier approaches to symbolic lumping in Section 4, Section 5 introduces the new hybrid algorithm. Experimental results demonstrating the effectiveness of our approach are presented in Section 6. Section 7 concludes the paper.
Section snippets
Background
In this section, we first review the concepts of continuous-time Markov chains and lumpability. We then give a brief account of the general principle of signature-based lumping algorithms. We finally review symbolic data structures and their use to represent various entities appearing in our context, such as sets and matrices.
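As a library-free illustration of the symbolic viewpoint (plain Python functions stand in for decision diagrams here; real BDD/MTBDD packages additionally share isomorphic subgraphs, which is where the compactness comes from): a set of states is represented by its characteristic Boolean function over a binary state encoding, and a rate matrix by a real-valued function over row and column bit vectors.

```python
def encode(state, nbits):
    """Binary encoding of a state index as a bit tuple (MSB first)."""
    return tuple((state >> (nbits - 1 - i)) & 1 for i in range(nbits))

def char_function(states, nbits):
    """Characteristic function of a state set: a BDD stores exactly this
    Boolean function, with isomorphic subgraphs shared."""
    codes = {encode(s, nbits) for s in states}
    return lambda bits: bits in codes

def matrix_function(rates, nbits):
    """Rate matrix as a real-valued function of row and column bit
    tuples -- the function an MTBDD over source/target variables
    represents (variable interleaving omitted for simplicity)."""
    table = {(encode(s, nbits), encode(t, nbits)): r
             for (s, t), r in rates.items()}
    return lambda row_bits, col_bits: table.get((row_bits, col_bits), 0.0)
```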
Explicit-state lumping algorithms
In the following, we present two different algorithms that compute the coarsest lumping quotient refining an initial partition, which may be induced by atomic labels, rewards, etc. attached to the states. If no initial partition is explicitly given, we start from the trivial partition placing all states in a single block.
Symbolic lumping algorithms
In order to obtain an efficient symbolic algorithm which computes the coarsest lumpable partition of a Markov chain, we first have to find an appropriate symbolic partition representation. Requirements are not only compactness but also efficient support of the necessary operations. The most common techniques will be presented in this section. Then we will show how the two explicit-state lumping algorithms from Section 3 can be turned into symbolic algorithms. Their advantages and disadvantages will also be discussed.
Hybrid representation
In Section 2.2, we presented four different partition representations, two of which have desirable properties: the compact representation (CR) is very efficient in terms of memory requirements, but its manipulation (such as adding and removing a block) is relatively expensive. On the other hand, the fast representation (FR) enables us to perform these operations very efficiently in terms of speed, but its space requirement is high for partitions with a large number of blocks.
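The following set-based Python sketch (function names are ours) contrasts the two forms: FR keeps one characteristic set (one BDD, in the symbolic setting) per block, while CR encodes every state's block index using only ⌈log₂ k⌉ bit-sets for k blocks. It also shows the bit-sliced conversion between the two forms on which the hybrid approach relies; the explicit loops below do linear work, whereas the symbolic conversion achieves a logarithmic number of BDD operations.

```python
from math import ceil, log2

def fr_to_cr(blocks):
    """Fast -> compact: encode the block index of every state using
    ceil(log2(k)) bit-sets; bit-set j holds exactly the states whose
    block index has bit j set."""
    nbits = max(1, ceil(log2(len(blocks))))
    bit_sets = []
    for j in range(nbits):
        members = set()
        for i, block in enumerate(blocks):
            if (i >> j) & 1:
                members |= block
        bit_sets.append(members)
    return bit_sets

def block_index(bit_sets, state):
    """Recover a state's block index from its bit-set memberships."""
    return sum(1 << j for j, ms in enumerate(bit_sets) if state in ms)

def cr_to_fr(bit_sets, states, k):
    """Compact -> fast: materialize one characteristic set per block."""
    blocks = [set() for _ in range(k)]
    for s in states:
        blocks[block_index(bit_sets, s)].add(s)
    return blocks
```

For blocks [{0,1}, {2}, {3,4}] this yields the two bit-sets [{2}, {3,4}]: two sets encode three blocks, and converting back recovers the original partition.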
Example models and implementation
We have implemented our hybrid algorithm in C++ using the CUDD package [45] as the MTBDD library. To generate the MTBDD representations of the input MCs, we used the probabilistic model checking tool PRISM [21]. Our input models are given in PRISM’s guarded command language. They are read by PRISM, which generates MTBDD representations of the models. We have modified PRISM such that the MTBDD representations are dumped to a file. These dumped MTBDDs are read by our implementation.
All the codes
Conclusion
In this paper, we have developed a general, fast and memory efficient algorithm for ordinary (and also exact and strict) Markov chain lumping. The algorithm is presented in the context of continuous-time Markov chains, but is easily adaptable to labeled transition systems, Kripke structures, discrete-time or interactive Markov chains, Markov reward models, etc.
The particular strength of this algorithm is that it exploits the true potential of BDD-based representations with respect to time and space.
References
- et al., Symbolic state-space exploration and numerical analysis of state-sharing composed models, Linear Algebra and Its Applications, 2004.
- et al., Optimal state-space lumping in Markov chains, Information Processing Letters, 2003.
- et al., On the use of MTBDDs for performability analysis and verification of stochastic systems, Journal of Logic and Algebraic Programming, 2003.
- et al., Distributed branching bisimulation reduction of state spaces.
- An algorithm for minimizing states in a finite automaton.
- et al., Dependability evaluation using composed SAN-based reward models, Journal of Parallel and Distributed Computing, 1992.
- et al., Probabilistic model checking of complex biological pathways, Theoretical Computer Science, 2008.
- et al., Cell cycle control in eukaryotes: A BioSpi model, Electronic Notes in Theoretical Computer Science, 2007.
- Concurrency and automata on infinite sequences.
- Finite Markov Chains.
- Reduced base model construction methods for stochastic activity networks, IEEE Journal on Selected Areas in Communications.
- An efficient algorithm for aggregating PEPA models, IEEE Transactions on Software Engineering.
- Stochastic well-formed colored nets and symmetric modeling applications, IEEE Transactions on Computers.
- Symmetry reduction for probabilistic model checking.
- A Compositional Approach to Performance Modelling.
- Stochastic process algebras: Integrating qualitative and quantitative modelling.
- Exact and ordinary lumpability in finite Markov chains, Journal of Applied Probability.
- Bisimulation minimization mostly speeds up probabilistic model checking.
- Multiterminal binary decision diagrams: An efficient data structure for matrix representation, Formal Methods in System Design.
- Algebraic decision diagrams and their applications, Formal Methods in System Design.
- Edge-valued binary decision diagrams for multi-level hierarchical verification.
- Formal verification using edge-valued binary decision diagrams, IEEE Transactions on Computers.
- Zero-suppressed BDDs for set manipulation in combinatorial problems.
Ralf Wimmer received his diploma in Computer Science from the Albert-Ludwigs-University, Freiburg (Germany) in 2004. Since 2005 he has been working as a Ph.D. student at the German Transregional Collaborative Research Center AVACS in Freiburg. His research interests include applications of symbolic methods for stochastic verification.
Salem Derisavi received his bachelor degree in Computer Engineering in 1999 from Sharif University of Technology, Iran and his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 2005. He is currently working with the IBM Software Laboratory in Toronto, Canada. His research interests include designing efficient data structures and algorithms for functional and numerical analysis of finite-state models, and more recently, parallelizing compilers for multi-core processors.
Holger Hermanns studied at the University of Bordeaux, France, and the University of Erlangen/Nürnberg, Germany, where he received a diploma degree in Computer Science in 1993 (with honors) and a Ph.D. degree from the Department of Computer Science in 1998 (with honors). From 1998 to 2006 he has been with the University of Twente, The Netherlands, holding an associate professor position since October 2001. Since 2003 he heads the Dependable Systems and Software Group at Saarland University, Germany. He has published more than 100 scientific papers, holds various research grants, and has co-chaired several international conferences including CAV, CONCUR and TACAS. His research interests include modeling and verification of concurrent systems, resource-aware embedded systems, and compositional performance and dependability evaluation.
- ☆ This work was partly supported by the German Research Council (DFG) as part of the Transregional Collaborative Research Center “Automatic Verification and Analysis of Complex Systems” (SFB/TR 14 AVACS). See http://www.avacs.org for more information.
- 1 Part of this work was done while the co-author was at Carleton University in Ottawa, Canada.