Probabilistic Program Analysis

Dwyer, Matthew B.; Filieri, Antonio; Geldenhuys, Jaco; Gerrard, Mitchell; Păsăreanu, Corina S.; Visser, Willem

doi:10.1007/978-3-319-60074-1_1

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10223)

Included in the following conference series:

International Summer School on Generative and Transformational Techniques in Software Engineering

Abstract

This paper provides a survey of recent work on adapting techniques for program analysis to compute probabilistic characterizations of program behavior. We survey how the frameworks of data flow analysis and symbolic execution have incorporated information about input probability distributions to quantify the likelihood of properties of program states. We identify themes that relate and distinguish a variety of techniques that have been developed over the past 15 years in this area. In doing so, we point out opportunities for future research that builds on the strengths of different techniques.

Notes

1. More precisely, Eq. (1) represents the probability of satisfying the constraint \(\phi\) conditioned on the fact that the input is within the prescribed domain D.
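
Eq. (1) itself is not reproduced on this page, so the following is only a minimal sketch of the conditional probability the note describes. It assumes inputs are drawn uniformly from a small finite domain D and computes the probability by brute-force counting. The function name `conditional_probability`, the example constraint, and the example domain are all hypothetical and are not taken from the chapter.

```python
from fractions import Fraction
from itertools import product


def conditional_probability(phi, domain_ranges):
    """Pr(phi | input in D), assuming a uniform distribution over a finite box D.

    phi           -- predicate over the input variables (hypothetical constraint)
    domain_ranges -- one finite iterable of values per input variable; their
                     Cartesian product plays the role of the domain D
    """
    in_domain = 0   # number of points in D
    satisfying = 0  # number of points in D that also satisfy phi
    for point in product(*domain_ranges):
        in_domain += 1
        if phi(*point):
            satisfying += 1
    if in_domain == 0:
        raise ValueError("domain D is empty")
    # Exact ratio of satisfying points to all points in D.
    return Fraction(satisfying, in_domain)


# Hypothetical example: D = {0..9} x {0..9}, phi(x, y) := x + y < 5.
D = [range(10), range(10)]
print(conditional_probability(lambda x, y: x + y < 5, D))  # prints 3/20
```

Enumeration is only practical for tiny domains; it is used here to make the conditioning on D explicit, not to suggest how the surveyed tools count solutions.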

Author information

Authors and Affiliations

University of Nebraska – Lincoln, Lincoln, USA
Matthew B. Dwyer & Mitchell Gerrard
Imperial College London, London, UK
Antonio Filieri
Carnegie Mellon Silicon Valley and NASA Ames Research Center, Santa Clara, USA
Corina S. Păsăreanu
University of Stellenbosch, Stellenbosch, South Africa
Jaco Geldenhuys & Willem Visser

Corresponding author

Correspondence to Matthew B. Dwyer.

Editor information

Editors and Affiliations

NOVA LINCS, Universidade Nova de Lisboa, Lisbon, Portugal
Jácome Cunha
CISUC, Universidade de Coimbra, Coimbra, Portugal
João P. Fernandes
Institut für Informatik, Universität Koblenz-Landau, Koblenz, Rheinland-Pfalz, Germany
Ralf Lämmel
Departamento de Informática, Universidade do Minho, Braga, Portugal
João Saraiva
Universiteit van Amsterdam, Amsterdam, The Netherlands
Vadim Zaytsev

Cite this paper

Dwyer, M.B., Filieri, A., Geldenhuys, J., Gerrard, M., Păsăreanu, C.S., Visser, W. (2017). Probabilistic Program Analysis. In: Cunha, J., Fernandes, J., Lämmel, R., Saraiva, J., Zaytsev, V. (eds) Grand Timely Topics in Software Engineering. GTTSE 2015. Lecture Notes in Computer Science, vol 10223. Springer, Cham. https://doi.org/10.1007/978-3-319-60074-1_1

DOI: https://doi.org/10.1007/978-3-319-60074-1_1
Published: 29 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60073-4
Online ISBN: 978-3-319-60074-1
eBook Packages: Computer Science, Computer Science (R0)
