Research article. DOI: 10.1145/3564719.3568686

Incremental Processing of Structured Data in Datalog

Published: 1 December 2022

ABSTRACT

Incremental computations react to input changes by updating their outputs. Compared to a non-incremental rerun, incremental computations can provide order-of-magnitude speedups, since small input changes often trigger only small output changes. One popular means of implementing incremental computations is to encode the computation in Datalog, for which efficient incremental solvers exist. However, Datalog is very restrictive in terms of the data types it can process: atomic data organized in relations. While tree-shaped and graph-shaped structured data can be encoded in relations, a naive encoding inhibits incrementality. In this paper, we present an encoding of structured data in Datalog that supports efficient incrementality such that small input changes are expressible. We explain how to efficiently implement and integrate this encoding into an existing incremental Datalog engine, and we show how tree diffing algorithms can be used to change the encoded data.
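To illustrate why a naive relational encoding of a tree can inhibit incrementality, here is a minimal Python sketch. It is not the encoding proposed in the paper; the `Node` type, the `node`/`child` fact shapes, and the `uid` field are assumptions made only for this illustration. The sketch encodes the same small tree edit twice, once with node ids taken from pre-order positions and once with ids that stay fixed across edits, and compares how many facts an incremental solver would see inserted or deleted.

```python
from typing import NamedTuple


class Node(NamedTuple):
    uid: str           # hypothetical stable identity, assigned once per node
    label: str
    kids: tuple = ()


def encode_preorder(root):
    """Naive encoding: a node's id is its position in a pre-order traversal."""
    facts = set()
    counter = 0

    def visit(n):
        nonlocal counter
        my_id = counter
        counter += 1
        facts.add(("node", my_id, n.label))
        for i, kid in enumerate(n.kids):
            facts.add(("child", my_id, i, visit(kid)))
        return my_id

    visit(root)
    return facts


def encode_stable(root):
    """Alternative encoding: a node's id is its stable uid."""
    facts = set()

    def visit(n):
        facts.add(("node", n.uid, n.label))
        for i, kid in enumerate(n.kids):
            facts.add(("child", n.uid, i, kid.uid))
            visit(kid)

    visit(root)
    return facts


def delta_size(old_facts, new_facts):
    """Number of facts inserted plus number of facts deleted."""
    return len(new_facts - old_facts) + len(old_facts - new_facts)


# Edit: wrap the first operand of Add in a new Neg node.
mul = Node("n2", "Mul", (Node("n3", "Lit2"), Node("n4", "Lit3")))
old = Node("n0", "Add", (Node("n1", "Lit1"), mul))
new = Node("n0", "Add", (Node("n5", "Neg", (Node("n1", "Lit1"),)), mul))

naive_delta = delta_size(encode_preorder(old), encode_preorder(new))
stable_delta = delta_size(encode_stable(old), encode_stable(new))
print(naive_delta, stable_delta)  # prints "16 4"
```

In this example, inserting one node shifts every later pre-order id, so almost every fact is retracted and re-asserted, whereas with stable ids only the facts around the edit change. A tree diffing algorithm could be one way to decide which nodes keep their identity between two versions of a tree.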


Published in

GPCE 2022: Proceedings of the 21st ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences
November 2022, 186 pages
ISBN: 9781450399203
DOI: 10.1145/3564719

Copyright © 2022 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
