Abstract
Benchmarks and evaluation are essential to the development of techniques and tools. However, few studies have evaluated model checkers on large-scale benchmarks, mainly because existing model checkers accept different input languages and building models requires intensive labor. In this study, we present a large-scale benchmark for evaluating model checkers that take concurrent models as input. The benchmark consists of 2318 models generated automatically from real-world message passing interface (MPI) programs. Inspection shows that the complexities of the models are well distributed, making the benchmark suitable for evaluating model checkers. Based on the benchmark, we evaluated five state-of-the-art model checkers, i.e., PAT, FDR, Spin, PRISM, and NuSMV, by verifying the deadlock freedom property. The evaluation results demonstrate the differences in capability and performance of these model checkers when verifying message passing programs.
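To illustrate the kind of check the benchmark exercises, the sketch below is a minimal explicit-state search for deadlock freedom over a synchronous message-passing model. It is an assumption-laden toy, not the authors' tool chain or any of the evaluated checkers: processes are hypothetical lists of send/recv actions that synchronize CSP-style on matching channels, a state is a tuple of program counters, and a deadlock is a non-terminated state with no enabled rendezvous.

```python
def enabled_moves(procs, state):
    """Return successor states reachable by one synchronous rendezvous:
    a process at ('send', ch) pairs with another at ('recv', ch)."""
    moves = []
    for i in range(len(procs)):
        for j in range(len(procs)):
            if i == j:
                continue
            pi, pj = state[i], state[j]
            if pi < len(procs[i]) and pj < len(procs[j]):
                ai, aj = procs[i][pi], procs[j][pj]
                if ai[0] == 'send' and aj[0] == 'recv' and ai[1] == aj[1]:
                    nxt = list(state)
                    nxt[i] += 1
                    nxt[j] += 1
                    moves.append(tuple(nxt))
    return moves

def find_deadlock(procs):
    """Exhaustive search of the state space; return a deadlocked
    state (stuck before all processes finished) or None."""
    init = tuple(0 for _ in procs)
    seen, frontier = {init}, [init]
    while frontier:
        state = frontier.pop()
        succs = enabled_moves(procs, state)
        terminated = all(pc == len(p) for pc, p in zip(state, procs))
        if not succs and not terminated:
            return state  # deadlock found
        for s in succs:
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return None  # deadlock-free

# Two processes that each wait to receive before sending: circular wait.
deadlocked = [
    [('recv', 'a'), ('send', 'b')],
    [('recv', 'b'), ('send', 'a')],
]
# Reordering one process breaks the cycle.
safe = [
    [('send', 'a'), ('recv', 'b')],
    [('recv', 'a'), ('send', 'b')],
]
```

Real checkers such as Spin or FDR apply the same reachability idea at scale, with partial-order reduction or symbolic encodings to tame the state explosion that a naive search like this one suffers.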
Acknowledgements
This work was supported by the National Key R&D Program of China (Grant No. 2017YFB1001802) and the National Natural Science Foundation of China (Grant Nos. 61472440, 61632015, 61690203, 61532007).
Cite this article
Hong, W., Chen, Z., Yu, H. et al. Evaluation of model checkers by verifying message passing programs. Sci. China Inf. Sci. 62, 200101 (2019). https://doi.org/10.1007/s11432-018-9825-3