Limit Datalog: A Declarative Query Language for Data Analysis

Published: 25 February 2020
Abstract

Motivated by applications in declarative data analysis, we study Datalog$_\mathbb{Z}$, an extension of Datalog with stratified negation and arithmetics over integers. Reasoning in this language is undecidable, so we present a fragment, called limit Datalog$_\mathbb{Z}$, that is powerful enough to naturally capture many important data analysis tasks. In limit Datalog$_\mathbb{Z}$, all intensional predicates with a numeric argument are limit predicates that keep only the maximal or minimal bounds on numeric values. Reasoning in limit Datalog$_\mathbb{Z}$ is decidable if multiplication is used in a way that satisfies our linearity condition. Moreover, fact entailment in limit-linear Datalog$_\mathbb{Z}$ is $\Delta_2^{\mathrm{EXP}}$-complete in combined and $\Delta_2^{\mathrm{P}}$-complete in data complexity, and it drops to coNEXP and coNP, respectively, if only (semi-)positive programs are considered. We also propose an additional stability requirement, for which the complexity drops to EXP and P, matching the bounds for usual Datalog. Limit Datalog$_\mathbb{Z}$ thus provides us with a unified logical framework for declarative data analysis and can be used as a basis for understanding the expressive power of the key data analysis constructs.
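
To make the idea of a limit predicate concrete, the following is a minimal Python sketch, not taken from the article. It assumes a hypothetical min-limit predicate `dist(node, bound)` defined by two illustrative rules (a base fact for the source and a rule that extends a bound along a weighted edge), and it assumes non-negative integer weights so that the naive fixpoint loop terminates. The function name `min_limit_fixpoint` and the `EDGES` data are invented for illustration. The point is only that, for each node, a single minimal bound is stored instead of every derivable numeric value.

```python
from typing import Dict, List, Tuple

# (source, target, weight); weights are assumed to be non-negative integers.
Edge = Tuple[str, str, int]


def min_limit_fixpoint(edges: List[Edge], source: str) -> Dict[str, int]:
    """Naive fixpoint for a hypothetical min-limit predicate dist(node, bound).

    Illustrative rules being imitated (an assumption, not quoted from the paper):
        dist(source, 0).
        dist(v, d + w) <- dist(u, d), edge(u, v, w).
    Because dist is treated as a min-limit predicate, only the smallest
    bound derived so far is kept for each node.
    """
    dist: Dict[str, int] = {source: 0}
    changed = True
    while changed:
        changed = False
        for u, v, w in edges:
            if u not in dist:
                continue  # the rule body is not satisfied yet
            candidate = dist[u] + w
            # Store the new fact only if it tightens the current minimal bound.
            if v not in dist or candidate < dist[v]:
                dist[v] = candidate
                changed = True
    return dist


# Hypothetical example graph.
EDGES: List[Edge] = [("a", "b", 4), ("a", "c", 1), ("c", "b", 2), ("b", "d", 5)]

if __name__ == "__main__":
    print(min_limit_fixpoint(EDGES, "a"))
```

In this sketch the fact `dist(b, 4)` is derived first and then replaced by the tighter bound `dist(b, 3)`, which mirrors how a limit predicate keeps only the minimal (or, symmetrically, maximal) value.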

  • Published in

    ACM SIGMOD Record, Volume 48, Issue 4
    December 2019
    52 pages
    ISSN: 0163-5808
    DOI: 10.1145/3385658

    Copyright © 2020 Copyright is held by the owner/author(s)

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
