Abstract
We address the problem of computing an abstraction for a set of examples, which is precise enough to separate them from a set of counterexamples. The challenge is to find an over-approximation of the positive examples that does not represent any negative example. Conjunctive abstractions (e.g., convex numerical domains) and limited disjunctive abstractions, are often insufficient, as even the best such abstraction might include negative examples. One way to improve precision is to consider a general disjunctive abstraction.
We present \(D^3\), a new algorithm for learning general disjunctive abstractions. Our algorithm is inspired by widely used machine-learning algorithms for obtaining a classifier from positive and negative examples. In contrast to these algorithms which cannot generalize from disjunctions, \(D^3\) obtains a disjunctive abstraction that minimizes the number of disjunctions. The result generalizes the positive examples as much as possible without representing any of the negative examples. We demonstrate the value of our algorithm by applying it to the problem of data-driven differential analysis, computing the abstract semantic difference between two programs. Our evaluation shows that \(D^3\) can be used to effectively learn precise differences between programs even when the difference requires a disjunctive representation.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
A most specific representation need not exist. For simplicity of the presentation, we consider the case where it does, and explain what adaptations are needed when it does not.
- 2.
If \(\beta \) maps a concrete point to a single concept which best represents it, it is easily shown that it suffices to maintain S as a single element. Candidate Elimination can also handle multiple representations, in which case S will be a set of specific bounds, similarly to G.
References
Scalacheck: Property-based testing for scala
Bagnara, R.: A hierarchy of constraint systems for data-flow analysis of constraint logic-based languages. Sci. Comput. Program. 30(1), 119–155 (1998)
Bagnara, R., Hill, P.M., Zaffanella, E.: Widening operators for powerset domains. STTT 8(4–5), 449–466 (2006)
Balcan, M.-F., Beygelzimer, A., Langford, J.: Agnostic active learning. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 65–72. ACM (2006)
Beckman, N.E., Nori, A.V., Rajamani, S.K., Simmons, R.J., Tetali, S.D., Thakur, A.V.: Proofs from tests. IEEE Trans. Softw. Eng. 36(4), 495–508 (2010)
Beyer, D., Henzinger, T.A., Théoduloz, G.: Configurable software verification: concretizing the convergence of model checking and program analysis. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 504–518. Springer, Heidelberg (2007)
Clarke, E., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided abstraction refinement. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855. Springer, Heidelberg (2000)
Cohn, D., Atlas, L., Ladner, R.: Improving generalization with active learning. Mach. Learn. 15(2), 201–221 (1994)
Cousot, P., Cousot, R.: Static determination of dynamic properties of programs. In: Proceedings of the Second International Symposium on Programming, pp. 106–130, Dunod, Paris, France (1976)
Cousot, P., Cousot, R.: Systematic design of program analysis frameworks. In: Proceedings of the 6th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pp. 269–282. ACM (1979)
Cousot, P., Halbwachs, N.: Automatic discovery of linear restraints among variables of a program. In: POPL, pp. 84–96 (1978)
Ernst, M.D., Cockrell, J., Griswold, W.G., Notkin, D.: Dynamically discovering likely program invariants to support program evolution. IEEE Trans. Softw. Eng. 27(2), 99–123 (2001)
Ernst, M.D., Perkins, J.H., Guo, P.J., McCamant, S., Pacheco, C., Tschantz, M.S., Xiao, C.: The daikon system for dynamic detection of likely invariants. Sci. Comput. Program. 69(1), 35–45 (2007)
Flanagan, C., Leino, K.R.M.: Houdini, an annotation assistant for ESC/Java. In: Oliveira, J.N., Zave, P. (eds.) FME 2001: Formal Methods for Increasing Software Productivity. LNCS, vol. 2021, pp. 500–517. Springer, Heidelberg (2001)
Ghorbal, K., Ivančić, F., Balakrishnan, G., Maeda, N., Gupta, A.: Donut domains: efficient non-convex domains for abstract interpretation. In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI 2012. LNCS, vol. 7148, pp. 235–250. Springer, Heidelberg (2012)
Godefroid, P., Levin, M.Y., Molnar, D.: Sage: whitebox fuzzing for security testing. Queue 10(1), 20 (2012)
Granger, P.: Static analysis of arithmetical congruences. International Journal of Computer Mathematics 30(3–4), 165–190 (1989)
Gulavani, B.S., Chakraborty, S., Nori, A.V., Rajamani, S.K.: Automatically Refining Abstract Interpretations. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 443–458. Springer, Heidelberg (2008)
Gupta, A., McMillan, K.L., Fu, Z.: Automated Assumption Generation for Compositional Verification. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 420–432. Springer, Heidelberg (2007)
Gurfinkel, A., and Chaki, S. Boxes: A symbolic abstract domain of boxes. In Static Analysis. Springer, 2010, pp. 287–303
Lopes, N.P., Monteiro, J.: Weakest Precondition Synthesis for Compiler Optimizations. In: McMillan, K.L., Rival, X. (eds.) VMCAI 2014. LNCS, vol. 8318, pp. 203–221. Springer, Heidelberg (2014)
Manago, M., and Blythe, J. Learning disjunctive concepts. In Knowledge representation and organization in machine learning. Springer, 1989, pp. 211–230
Mauborgne, L., and Rival, X. Trace partitioning in abstract interpretation based static analyzers. In Programming Languages and Systems. Springer, 2005, pp. 5–20
Miné, A.: The octagon abstract domain. Higher-Order and Symbolic Computation 19(1), 31–100 (2006)
Mitchell, T. Machine Learning. McGraw-Hill international editions - computer science series. McGraw-Hill Education, 1997, ch. 2, pp. 20–51
Mitchell, T. M. Version spaces: an approach to concept learning. PhD thesis, Stanford University, Dec 1978
Murray, K. S. Multiple convergence: An approach to disjunctive concept acquisition. In IJCAI (1987), Citeseer, pp. 297–300
Partush, N., Yahav, E.: Abstract Semantic Differencing for Numerical Programs. In: Logozzo, F., Fähndrich, M. (eds.) Static Analysis. LNCS, vol. 7935, pp. 238–258. Springer, Heidelberg (2013)
Partush, N., and Yahav, E. Abstract semantic differencing via speculative correlation. In Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & #38; Applications (New York, NY, USA, 2014), OOPSLA ’14, ACM, pp. 811–828
Sankaranarayanan, S., Ivančić, F., Shlyakhter, I., Gupta, A.: Static Analysis in Disjunctive Numerical Domains. In: Yi, K. (ed.) SAS 2006. LNCS, vol. 4134, pp. 3–17. Springer, Heidelberg (2006)
Sebag, M. Delaying the choice of bias: A disjunctive version space approach. In ICML (1996), Citeseer, pp. 444–452
Seghir, M. N., and Kroening, D. Counterexample-guided precondition inference. In Programming Languages and Systems. Springer, 2013, pp. 451–471
Sen, K., Agha, G.: CUTE and jCUTE: Concolic Unit Testing and Explicit Path Model-Checking Tools. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 419–423. Springer, Heidelberg (2006)
Sharma, R., Aiken, A.: From Invariant Checking to Invariant Inference Using Randomized Search. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 88–105. Springer, Heidelberg (2014)
Sharma, R., Schkufza, E., Churchill, B. R., and Aiken, A. Data-driven equivalence checking. In OOPSLA (2013), pp. 391–406
Srivastava, S., and Gulwani, S. Program verification using templates over predicate abstraction. In ACM Sigplan Notices (2009), vol. 44, ACM, pp. 223–234
Thakur, A., Elder, M., Reps, T.: Bilateral Algorithms for Symbolic Abstraction. In: Miné, A., Schmidt, D. (eds.) SAS 2012. LNCS, vol. 7460, pp. 111–128. Springer, Heidelberg (2012)
Acknowledgment
The research leading to these results has received funding from the European Union’s - Seventh Framework Programme (FP7) under grant agreement no. 615688 - ERC-COG-PRIME and under ERC grant agreement no. 321174-VSSC, and from the BSF grant no. 2012259.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Peleg, H., Shoham, S., Yahav, E. (2016). \(D^3\): Data-Driven Disjunctive Abstraction. In: Jobstmann, B., Leino, K. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2016. Lecture Notes in Computer Science(), vol 9583. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49122-5_9
Download citation
DOI: https://doi.org/10.1007/978-3-662-49122-5_9
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-49121-8
Online ISBN: 978-3-662-49122-5
eBook Packages: Computer ScienceComputer Science (R0)