Abstract
Advances in incremental Datalog evaluation strategies have made Datalog popular for use cases with constantly evolving inputs, such as static analysis in continuous integration and deployment pipelines. As a result, new logic programming debugging techniques are needed to support these emerging use cases.
This paper introduces an incremental debugging technique for Datalog that identifies the failing changes to roll back in an incremental setup. Our debugging technique leverages a novel incremental provenance method. We have implemented our technique using an incremental version of the Soufflé Datalog engine and evaluated its effectiveness on the DaCapo Java program benchmarks analyzed by the Doop static analysis library. Compared to state-of-the-art techniques, we can localize faults and suggest rollbacks with an overall speedup of over 26.9\(\times\) while providing higher-quality results.
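To make the idea of a provenance-guided rollback suggestion concrete, the following is a minimal sketch in Python. It is not the Soufflé-based implementation described above. It assumes a toy transitive-closure program (`path(x,y) :- edge(x,y).` and `path(x,z) :- path(x,y), edge(y,z).`), tracks for every derived tuple the minimal sets of input facts that support it, and then brute-forces a smallest set of newly inserted facts whose removal breaks every proof of an unwanted tuple. The names `evaluate` and `suggest_rollback`, and the example facts, are hypothetical.

```python
from itertools import combinations


def evaluate(edges):
    """Naive fixpoint for the toy program
         path(x, y) :- edge(x, y).
         path(x, z) :- path(x, y), edge(y, z).
    Returns a dict mapping each path tuple to its minimal supports,
    where a support is a frozenset of edge facts sufficient to derive it."""
    supports = {e: {frozenset([e])} for e in edges}
    changed = True
    while changed:
        changed = False
        for (x, y), sups in list(supports.items()):
            for (y2, z) in edges:
                if y2 != y:
                    continue
                for s in list(sups):
                    new = s | {(y, z)}
                    target = supports.setdefault((x, z), set())
                    # Skip supports that are not minimal.
                    if any(old <= new for old in target):
                        continue
                    # Drop existing supports that the new one makes redundant.
                    target.difference_update({old for old in target if new < old})
                    target.add(new)
                    changed = True
    return supports


def suggest_rollback(old_edges, inserted, fault):
    """Return a smallest subset of `inserted` whose removal makes `fault`
    underivable, an empty set if `fault` is not derived at all, or None if
    some proof of `fault` uses only old facts (no rollback can fix it)."""
    inserted = set(inserted)
    proofs = evaluate(set(old_edges) | inserted).get(fault)
    if not proofs:
        return set()
    if any(not (p & inserted) for p in proofs):
        return None
    candidates = sorted(inserted)
    # Brute-force minimum hitting set over the proofs (fine for toy inputs).
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            chosen = set(subset)
            if all(p & chosen for p in proofs):
                return chosen
    return None


# Hypothetical example: three edges are inserted, and path(a, d) is unwanted.
old_edges = {("a", "b"), ("b", "c")}
inserted = {("c", "d"), ("a", "d"), ("x", "y")}
fault = ("a", "d")
rollback = suggest_rollback(old_edges, inserted, fault)
print(sorted(rollback))  # [('a', 'd'), ('c', 'd')]
```

In this example the unwanted tuple has two proofs, one through the direct edge and one through the chain `a, b, c, d`, so both `edge(a, d)` and `edge(c, d)` are suggested for rollback while the unrelated insertion `edge(x, y)` is kept. The exhaustive search is exponential in the number of inserted facts and only serves to illustrate the problem statement.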
Notes
- 1. We use an Intel Xeon Gold 6130 with 192 GB RAM, GCC 10.3.1, and Python 3.8.10.
- 2. Available at github.com/davidwzhao/souffle-fault-localization.
- 3. We say “over” because we bound timeouts to 7200 s.
Acknowledgments
M.R. was funded by U.S. NSF grants CCF-2146518, CCF-2124431, and CCF-2107261.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, D., Subotić, P., Raghothaman, M., Scholz, B. (2023). Automatic Rollback Suggestions for Incremental Datalog Evaluation. In: Hanus, M., Inclezan, D. (eds) Practical Aspects of Declarative Languages. PADL 2023. Lecture Notes in Computer Science, vol 13880. Springer, Cham. https://doi.org/10.1007/978-3-031-24841-2_19
DOI: https://doi.org/10.1007/978-3-031-24841-2_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24840-5
Online ISBN: 978-3-031-24841-2
eBook Packages: Computer Science, Computer Science (R0)