ABSTRACT
Incremental computations react to input changes by updating their outputs. Compared to a non-incremental rerun, incremental computations can provide order-of-magnitude speedups, because small input changes often trigger only small output changes. One popular means of implementing incremental computations is to encode the computation in Datalog, for which efficient incremental solvers exist. However, Datalog is very restrictive in the data types it can process: only atomic data organized in relations. While tree- and graph-shaped structured data can be encoded in relations, a naive encoding inhibits incrementality. In this paper, we present an encoding of structured data in Datalog that makes small input changes expressible and thereby supports efficient incrementality. We explain how to implement this encoding efficiently and integrate it into an existing incremental Datalog engine, and we show how tree diffing algorithms can be used to change the encoded data.
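To make the point about naive encodings concrete, the following Python sketch compares two hypothetical ways of flattening a tree into relational facts. It is an illustration under stated assumptions, not the encoding proposed in the paper: the `node` and `child` relations, the `(uid, label, children)` tree format, and all function names are invented for this example. The naive variant numbers nodes by preorder position, so one local edit renumbers every later node; the stable variant reuses identifiers that survive edits.

```python
from itertools import count

# Hypothetical tree format: (uid, label, [children]). Not taken from the paper.

def encode_preorder(tree):
    """Naive encoding: node ids are preorder positions, recomputed on each run."""
    facts = set()
    counter = count()

    def visit(node):
        _uid, label, children = node      # the stable uid is deliberately ignored
        nid = next(counter)
        facts.add(("node", nid, label))
        for index, child in enumerate(children):
            facts.add(("child", nid, index, visit(child)))
        return nid

    visit(tree)
    return facts

def encode_stable(tree):
    """Alternative encoding: node ids are uids that persist across edits."""
    facts = set()

    def visit(node):
        uid, label, children = node
        facts.add(("node", uid, label))
        for index, child in enumerate(children):
            facts.add(("child", uid, index, visit(child)))
        return uid

    visit(tree)
    return facts

def delta(old_facts, new_facts):
    """Return (deleted, inserted) facts, i.e. the input change a solver would see."""
    return old_facts - new_facts, new_facts - old_facts

# One local edit: node "a" gains two children; nothing else changes.
old_tree = ("r", "Block", [("a", "A", []), ("b", "B", []), ("c", "C", [])])
new_tree = ("r", "Block", [
    ("a", "A", [("x", "X", []), ("y", "Y", [])]),
    ("b", "B", []),
    ("c", "C", []),
])

naive_deleted, naive_inserted = delta(encode_preorder(old_tree), encode_preorder(new_tree))
stable_deleted, stable_inserted = delta(encode_stable(old_tree), encode_stable(new_tree))

print("naive encoding: ", len(naive_deleted), "deleted,", len(naive_inserted), "inserted")
print("stable encoding:", len(stable_deleted), "deleted,", len(stable_inserted), "inserted")
```

In this sketch the same edit produces a larger fact delta under preorder numbering than under stable identifiers, because the unchanged siblings `b` and `c` receive new ids. An incremental solver that consumes fact deltas has correspondingly more work to do, even though the tree itself changed only locally.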
Index Terms
- Incremental Processing of Structured Data in Datalog