Automatic Rollback Suggestions for Incremental Datalog Evaluation

  • Conference paper
Practical Aspects of Declarative Languages (PADL 2023)

Abstract

Advances in incremental Datalog evaluation strategies have made Datalog popular among use cases with constantly evolving inputs such as static analysis in continuous integration and deployment pipelines. As a result, new logic programming debugging techniques are needed to support these emerging use cases.

This paper introduces an incremental debugging technique for Datalog, which determines the failing changes for a rollback in an incremental setup. Our debugging technique leverages a novel incremental provenance method. We have implemented our technique using an incremental version of the Soufflé Datalog engine and evaluated its effectiveness on the DaCapo Java program benchmarks analyzed by the Doop static analysis library. Compared to state-of-the-art techniques, we can localize faults and suggest rollbacks with an overall speedup of over 26.9× while providing higher quality results.
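The idea of suggesting a rollback from provenance can be illustrated with a toy example. The sketch below is not the paper's algorithm and does not use Soufflé: it assumes a single hard-coded transitive-closure program, records the input facts behind one proof of each derived tuple, and greedily rolls back the newly inserted facts that appear in a proof of an unwanted tuple. The names evaluate and suggest_rollback, and the example facts, are hypothetical.

```python
def evaluate(edges):
    """Naively evaluate the Datalog program
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    Returns a dict mapping each path tuple to the set of edge facts
    used in one (the first discovered) proof of that tuple."""
    support = {e: frozenset([e]) for e in edges}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so new tuples can be added safely.
        for (x, y), used in list(support.items()):
            for (y2, z) in edges:
                if y2 == y and (x, z) not in support:
                    support[(x, z)] = used | {(y2, z)}
                    changed = True
    return support


def suggest_rollback(base, inserted, fault):
    """Greedy toy heuristic (an assumption, not the paper's method):
    repeatedly find one proof of the unwanted tuple `fault` and roll back
    every inserted fact that the proof uses, until `fault` disappears."""
    rollback = set()
    while True:
        support = evaluate(base | (inserted - rollback))
        if fault not in support:
            return rollback
        culprits = support[fault] & inserted
        if not culprits:
            raise ValueError("fault is derivable from the base facts alone")
        rollback |= culprits


# Hypothetical example: a commit inserts two edges, after which the
# unwanted tuple path(a, d) appears in the output.
base = {("a", "b"), ("b", "c")}
inserted = {("c", "d"), ("x", "y")}
suggestion = suggest_rollback(base, inserted, ("a", "d"))
print(suggestion)
```

In this example only the inserted edge from c to d takes part in a proof of the fault, so the unrelated inserted edge from x to y is left in place. The sketch re-evaluates the program from scratch on every iteration and may roll back more facts than necessary, which is the kind of cost an incremental approach is meant to avoid.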

Notes

  1. We use an Intel Xeon Gold 6130 with 192 GB RAM, GCC 10.3.1, and Python 3.8.10.

  2. Available at github.com/davidwzhao/souffle-fault-localization.

  3. We say “over” because we bound timeouts to 7200 s.

Acknowledgments

M.R. was funded by U.S. NSF grants CCF-2146518, CCF-2124431, and CCF-2107261.

Author information

Corresponding author

Correspondence to David Zhao.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, D., Subotić, P., Raghothaman, M., Scholz, B. (2023). Automatic Rollback Suggestions for Incremental Datalog Evaluation. In: Hanus, M., Inclezan, D. (eds) Practical Aspects of Declarative Languages. PADL 2023. Lecture Notes in Computer Science, vol 13880. Springer, Cham. https://doi.org/10.1007/978-3-031-24841-2_19

  • DOI: https://doi.org/10.1007/978-3-031-24841-2_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24840-5

  • Online ISBN: 978-3-031-24841-2

  • eBook Packages: Computer Science (R0)
