Abstract
This paper provides a survey of recent work on adapting techniques for program analysis to compute probabilistic characterizations of program behavior. We survey how the frameworks of data flow analysis and symbolic execution have incorporated information about input probability distributions to quantify the likelihood of properties of program states. We identify themes that relate and distinguish a variety of techniques that have been developed over the past 15 years in this area. In doing so, we point out opportunities for future research that builds on the strengths of different techniques.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
More precisely, Eq. (1) represents the probability of satisfying the constraint \(\phi \) conditioned on the fact that the input is within the prescribed domain D.
References
Aydin, A., Bang, L., Bultan, T.: Automata-based model counting for string constraints. In: Proceedings of the 27th International Conference on Computer Aided Verification, CAV 2015, Part I, San Francisco, CA, USA, 18–24 July 2015, pp. 255–272 (2015)
Bagnara, R., Hill, P.M., Zaffanella, E.: The parma polyhedra library: toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems. Sci. Comput. Program. 72(1), 3–21 (2008)
Bang, L., Aydin, A., Phan, Q., Pasareanu, C.S., Bultan, T.: String analysis for side channels with segmented oracles. In: Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016, Seattle, WA, USA, 13–18 November 2016, pp. 193–204 (2016)
Barvinok, A.I.: A polynomial time algorithm for counting integral points in polyhedra when the dimension is fixed. Math. Oper. Res. 19(4), 769–779 (1994)
de Berg, M.: Computational Geometry: Algorithms and Applications. Springer, Heidelberg (2008)
Biere, A., van Maaren, H.: Handbook of Satisfiability. Frontiers in Artificial Intelligence and Applications. IOS Press, Amsterdam (2009)
Bishop, C.: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York (2006)
Borges, M., Filieri, A., d’Amorim, M., Păsăreanu, C.S.: Iterative distribution-aware sampling for probabilistic symbolic execution. In: Proceedings of the 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE 2015. ACM (2015)
Borges, M., Filieri, A., d’Amorim, M., Păsăreanu, C.S., Visser, W.: Compositional solution space quantification for probabilistic software analysis. In: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 123–132. ACM (2014)
Bryant, R.E.: Symbolic boolean manipulation with ordered binary-decision diagrams. ACM Comput. Surv. (CSUR) 24(3), 293–318 (1992)
Cadar, C., Dunbar, D., Engler, D.R.: Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In: OSDI, vol. 8, pp. 209–224 (2008)
Chakraborty, S., Fremont, D.J., Meel, K.S., Seshia, S.A., Vardi, M.Y.: Distribution-aware sampling and weighted model counting for SAT. In: Twenty-Eighth AAAI Conference on Artificial Intelligence (2014)
Chakraborty, S., Meel, K.S., Vardi, M.Y.: A scalable approximate model counter. In: Schulte, C. (ed.) CP 2013. LNCS, vol. 8124, pp. 200–216. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40627-0_18
Chistikov, D., Dimitrova, R., Majumdar, R.: Approximate counting in SMT and value estimation for probabilistic programs. In: Baier, C., Tinelli, C. (eds.) TACAS 2015. LNCS, vol. 9035, pp. 320–334. Springer, Heidelberg (2015). doi:10.1007/978-3-662-46681-0_26
Claret, G., Rajamani, S.K., Nori, A.V., Gordon, A.D., Borgström, J.: Bayesian inference using data flow analysis. In: Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, pp. 92–102. ACM (2013)
Clarke, E.M., Grumberg, O., Peled, D.: Model Checking. MIT Press, Cambridge (1999)
Clarke, L., et al.: A system to generate test data and symbolically execute programs. IEEE Trans. Software Eng. 3, 215–222 (1976)
Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pp. 238–252. ACM (1977)
Cousot, P., Monerau, M.: Probabilistic abstract interpretation. In: Seidl, H. (ed.) ESOP 2012. LNCS, vol. 7211, pp. 169–193. Springer, Heidelberg (2012). doi:10.1007/978-3-642-28869-2_9
Di Pierro, A., Wiklicky, H.: Probabilistic data flow analysis: a linear equational approach. arXiv preprint arXiv:1307.4474 (2013)
Dwyer, M.B.: Unifying testing and analysis through behavioral coverage. In: 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE), p. 2. IEEE (2011)
Esparza, J., Gaiser, A.: Probabilistic abstractions with arbitrary domains. In: Yahav, E. (ed.) SAS 2011. LNCS, vol. 6887, pp. 334–350. Springer, Heidelberg (2011). doi:10.1007/978-3-642-23702-7_25
Filieri, A., Frias, M., Păsăreanu, C., Visser, W.: Model counting for complex data structures. In: Proceedings of the 2015 International SPIN Symposium on Model Checking of Software. ACM (2015)
Filieri, A., Păsăreanu, C.S., Visser, W., Geldenhuys, J.: Statistical symbolic execution with informed sampling. In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, pp. 437–448. ACM (2014)
Filieri, A., Păsăreanu, C.S., Yang, G.: Quantification of software changes through probabilistic symbolic execution. In: Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering (ASE) - Short Paper, November 2015
Filieri, A., Păsăreanu, C.S., Visser, W.: Reliability analysis in symbolic pathfinder. In: Proceedings of the 2013 International Conference on Software Engineering, pp. 622–631. IEEE Press (2013)
Fink, S., Dolby, J.: WALA-The TJ watson libraries for analysis (2012)
Floyd, R.W.: Assigning meanings to programs. In: Mathematical Aspects of Computer Science, pp. 19–32 (1967)
Fosdick, L.D., Osterweil, L.J.: Data flow analysis in software reliability. ACM Comput. Surv. (CSUR) 8(3), 305–330 (1976)
Fu, K., Huang, T.: Stochastic grammars and languages. Int. J. Comput. Inform. Sci. 1(2), 135–170 (1972)
Geldenhuys, J., Dwyer, M.B., Visser, W.: Probabilistic symbolic execution. In: Proceedings of the 2012 International Symposium on Software Testing and Analysis, pp. 166–176. ACM (2012)
Gelman, A., Carlin, J., Stern, H., Dunson, D., Vehtari, A., Rubin, D.: Bayesian Data Analysis, 3rd edn. Chapman & Hall/CRC Texts in Statistical Science, Taylor & Francis (2013)
Gentle, J.: Random Number Generation and Monte Carlo Methods. Statistics and Computing. Springer, New York (2013)
Godefroid, P., Klarlund, N., Sen, K.: DART: directed automated random testing. In: ACM Sigplan Notices, vol. 40, pp. 213–223. ACM (2005)
Gordon, A.D., Henzinger, T.A., Nori, A.V., Rajamani, S.K.: Probabilistic programming. In: Proceedings of the on Future of Software Engineering, pp. 167–181. ACM (2014)
Graf, S., Saidi, H.: Construction of abstract state graphs with PVS. In: Grumberg, O. (ed.) CAV 1997. LNCS, vol. 1254, pp. 72–83. Springer, Heidelberg (1997). doi:10.1007/3-540-63166-6_10
Hahn, E.M., Hermanns, H., Wachter, B., Zhang, L.: PASS: abstraction refinement for infinite probabilistic models. In: Esparza, J., Majumdar, R. (eds.) TACAS 2010. LNCS, vol. 6015, pp. 353–357. Springer, Heidelberg (2010). doi:10.1007/978-3-642-12002-2_30
Hasuo, I., Jacobs, B., Sokolova, A.: Generic trace theory. Electron. Notes Theor. Comput. Sci. 164(1), 47–65 (2006)
Hérault, T., Lassaigne, R., Magniette, F., Peyronnet, S.: Approximate probabilistic model checking. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 73–84. Springer, Heidelberg (2004). doi:10.1007/978-3-540-24622-0_8
Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963)
Jamrozik, K., Fraser, G., Tillman, N., Halleux, J.: Generating test suites with augmented dynamic symbolic execution. In: Veanes, M., Viganò, L. (eds.) TAP 2013. LNCS, vol. 7942, pp. 152–167. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38916-0_9
Jegourel, C., Legay, A., Sedwards, S.: Cross-entropy optimisation of importance sampling parameters for statistical model checking. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 327–342. Springer, Heidelberg (2012). doi:10.1007/978-3-642-31424-7_26
Jegourel, C., Legay, A., Sedwards, S.: Importance splitting for statistical model checking rare properties. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 576–591. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39799-8_38
Jones, C.: Probabilistic non-determinism (1990)
Kildall, G.A.: A unified approach to global program optimization. In: Proceedings of the 1st Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pp. 194–206. ACM (1973)
King, J.C.: Symbolic execution and program testing. Commun. ACM 19(7), 385–394 (1976)
Kozen, D.: Semantics of probabilistic programs. J. Comput. Syst. Sci. 22(3), 328–350 (1981)
Kozen, D.: A probabilistic PDL. In: Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing, pp. 291–297. ACM (1983)
Kwiatkowska, M., Norman, G., Parker, D.: Advances and challenges of probabilistic model checking. In: 2010 Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton) (2010)
Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011). doi:10.1007/978-3-642-22110-1_47
Lam, P., Bodden, E., Lhoták, O., Hendren, L.: The Soot framework for Java program analysis: a retrospective. In: Cetus Users and Compiler Infrastructure Workshop, Galveston Island, TX, October 2011
Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis & transformation. In: Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization (2004)
Legay, A., Delahaye, B., Bensalem, S.: Statistical model checking: an overview. In: Barringer, H., Falcone, Y., Finkbeiner, B., Havelund, K., Lee, I., Pace, G., Roşu, G., Sokolsky, O., Tillmann, N. (eds.) RV 2010. LNCS, vol. 6418, pp. 122–135. Springer, Heidelberg (2010). doi:10.1007/978-3-642-16612-9_11
Luckow, K., Păsăreanu, C.S., Dwyer, M.B., Filieri, A., Visser, W.: Exact and approximate probabilistic symbolic execution for nondeterministic programs. In: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, pp. 575–586. ACM (2014)
Luu, L., Shinde, S., Saxena, P., Demsky, B.: A model counter for constraints over unbounded strings. In: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 565–576. ACM (2014)
Mardziel, P., Magill, S., Hicks, M., Srivatsa, M.: Dynamic enforcement of knowledge-based security policies using probabilistic abstract interpretation. J. Comput. Secur. 21(4), 463–532 (2013)
McDonald, J.B.: Some generalized functions for the size distribution of income. Econometrica J. Econometric Soc. 52, 647–663 (1984)
Meel, K.S.: Sampling techniques for boolean satisfiability. CoRR abs/1404.6682 (2014). http://arxiv.org/abs/1404.6682
Monniaux, D.: Abstract interpretation of probabilistic semantics. In: Palsberg, J. (ed.) SAS 2000. LNCS, vol. 1824, pp. 322–339. Springer, Heidelberg (2000). doi:10.1007/978-3-540-45099-3_17
Monniaux, D.: Backwards Abstract Interpretation of Probabilistic Programs. In: Sands, D. (ed.) ESOP 2001. LNCS, vol. 2028, pp. 367–382. Springer, Heidelberg (2001). doi:10.1007/3-540-45309-1_24
Monniaux, D.: Abstract interpretation of programs as markov decision processes. Sci. Comput. Program. 58(1), 179–205 (2005)
Morgan, C., McIver, A., Seidel, K.: Probabilistic predicate transformers. ACM Trans. Program. Lang. Syst. (TOPLAS) 18(3), 325–353 (1996)
Murta, D., Oliveira, J.N.: A study of risk-aware program transformation. Sci. Comput. Program. 110(C), 51–77 (2015)
Oliveira, J.N., Miraldo, V.C.: “Keep definition, change category” — a practical approach to state-based system calculi. J. Logical Algebraic Methods Program. 85(4), 449–474 (2016)
Pasareanu, C.S., Phan, Q., Malacaria, P.: Multi-run side-channel analysis using symbolic execution and Max-SMT. In: IEEE 29th Computer Security Foundations Symposium, CSF 2016, Lisbon, Portugal, 27 June–1 July 2016, pp. 387–400 (2016)
Păsăreanu, C.S., Rungta, N.: Symbolic pathfinder: symbolic execution of Java bytecode. In: Proceedings of the IEEE/ACM International Conference on Automated Software Engineering, pp. 179–180. ACM (2010)
Pestman, W.R.: Mathematical Statistics: An Introduction, vol. 1. Walter de Gruyter, Berlin (1998)
Puggelli, A., Li, W., Sangiovanni-Vincentelli, A.L., Seshia, S.A.: Polynomial-time verification of PCTL properties of MDPs with convex uncertainties. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 527–542. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39799-8_35
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)
Ramalingam, G.: Data flow frequency analysis. In: ACM SIGPLAN Notices, vol. 31, pp. 267–277. ACM (1996)
Robert, C.: The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation. Springer Texts in Statistics. Springer, New York (2007)
Robert, C., Casella, G.: Monte Carlo Statistical Methods. Springer, New York (2013)
Robert, C.P., Casella, G.: Monte Carlo Statistical Methods. Springer, New York (2005)
Sang, T., Beame, P., Kautz, H.: Heuristics for fast exact model counting. In: Bacchus, F., Walsh, T. (eds.) SAT 2005. LNCS, vol. 3569, pp. 226–240. Springer, Heidelberg (2005). doi:10.1007/11499107_17
Schmidt, D.A.: Data flow analysis is model checking of abstract interpretations. In: Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 38–48. ACM (1998)
Sen, K., Marinov, D., Agha, G.: CUTE: A concolic unit testing engine for C (2005)
Smith, M.J.: Probabilistic abstract interpretation of imperative programs using truncated normal distributions. Electron. Notes Theor. Comput. Sci. 220(3), 43–59 (2008)
Song, D., et al.: BitBlaze: a new approach to computer security via binary analysis. In: Sekar, R., Pujari, A.K. (eds.) ICISS 2008. LNCS, vol. 5352, pp. 1–25. Springer, Heidelberg (2008). doi:10.1007/978-3-540-89862-7_1
Thakur, A., Elder, M., Reps, T.: Bilateral algorithms for symbolic abstraction. In: Miné, A., Schmidt, D. (eds.) SAS 2012. LNCS, vol. 7460, pp. 111–128. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33125-1_10
The Apache Software Foundation: Commons math. http://commons.apache.org/proper/commons-math/. Accessed 16 Dec 2014
Thurley, M.: sharpSAT – counting models with advanced component caching and implicit BCP. In: Biere, A., Gomes, C.P. (eds.) SAT 2006. LNCS, vol. 4121, pp. 424–429. Springer, Heidelberg (2006). doi:10.1007/11814948_38
Thurow, L.C.: Analyzing the American income distribution. Am. Econ. Rev. 60, 261–269 (1970)
UC Davis, Mathematics: LattE. http://www.math.ucdavis.edu/latte
Vallée-Rai, R., Co, P., Gagnon, E., Hendren, L., Lam, P., Sundaresan, V.: Soot-a Java bytecode optimization framework. In: Proceedings of the 1999 Conference of the Centre for Advanced Studies on Collaborative Research, p. 13. IBM Press (1999)
Verdoolaege, S.: Software package barvinok (2004). http://freshmeat.net/projects/barvinok
Wachter, B., Zhang, L.: Best probabilistic transformers. In: Barthe, G., Hermenegildo, M. (eds.) VMCAI 2010. LNCS, vol. 5944, pp. 362–379. Springer, Heidelberg (2010). doi:10.1007/978-3-642-11319-2_26
Zuliani, P., Platzer, A., Clarke, E.M.: Bayesian statistical model checking with application to simulink/stateflow verification. In: Proceedings of the 13th ACM International Conference on Hybrid Systems: Computation and Control, pp. 243–252. ACM (2010)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Dwyer, M.B., Filieri, A., Geldenhuys, J., Gerrard, M., Păsăreanu, C.S., Visser, W. (2017). Probabilistic Program Analysis. In: Cunha, J., Fernandes, J., Lämmel, R., Saraiva, J., Zaytsev, V. (eds) Grand Timely Topics in Software Engineering. GTTSE 2015. Lecture Notes in Computer Science(), vol 10223. Springer, Cham. https://doi.org/10.1007/978-3-319-60074-1_1
Download citation
DOI: https://doi.org/10.1007/978-3-319-60074-1_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60073-4
Online ISBN: 978-3-319-60074-1
eBook Packages: Computer ScienceComputer Science (R0)