skip to main content
10.1145/3357384.3357841acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

PODIUM: Probabilistic Datalog Analysis via Contribution Maximization

Published: 03 November 2019 Publication History

Abstract

The use of probabilistic datalog programs has been advocated for applications that involve recursive computation and uncertainty. While using such programs allows for a flexible knowledge derivation, it makes the analysis of query results a challenging task. Particularly, given a set O of output tuples and a number k, one would like to understand which k-size subset of the input tuples has affected the most the derivation of O. This is useful for multiple tasks, such as identifying critical sources of errors and understanding surprising results. To this end, we formalize the Contribution Maximization problem and present an efficient algorithm to solve it. Our algorithm injects a refined variant of the classic Magic Sets technique, integrated with a sampling method, into top-performing algorithms for the well-studied Influence Maximization problem. We propose to demonstrate our solution in a system called PODIUM. We will demonstrate the usefulness of PODIUM using real-life data and programs, and illustrate the effectiveness of our algorithm.

References

[1]
IRIS Reasoner. http://www.iris-reasoner.org. (????).
[2]
S. Abiteboul, R. Hull, and V. Vianu. 1995. Foundations of Databases .Addison-Wesley.
[3]
Christian Borgs, Michael Brautbar, Jennifer Chayes, and Brendan Lucier. 2014. Maximizing Social Influence in Nearly Optimal Time. In SODA .
[4]
Daniel Deutch, Amir Gilad, and Yuval Moskovitch. 2015. Selective Provenance for Datalog Programs Using Top-K Queries. PVLDB (2015).
[5]
Luis Antonio Galárraga, Christina Teflioudi, Katja Hose, and Fabian M. Suchanek. 2013. AMIE: association rule mining under incomplete evidence in ontological knowledge bases. In WWW.
[6]
Bhargav Kanagal, Jian Li, and Amol Deshpande. 2011. Sensitivity Analysis and Explanations for Robust Query Evaluation in Probabilistic Databases. In SIGMOD.
[7]
David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the Spread of Influence Through a Social Network. In SIGKDD.
[8]
Seokki Lee, Bertram Lud"a scher, and Boris Glavic. 2019. PUG: a framework and practical implementation for why and why-not provenance. VLDB J., Vol. 28, 1 (2019).
[9]
Alexandra Meliou, Wolfgang Gatterbauer, Katherine F Moore, and Dan Suciu. 2010. The complexity of causality and responsibility for query answers and non-answers. PVLDB Endowment (2010).
[10]
Sudeepa Roy and Dan Suciu. 2014. A formal approach to finding explanations for database queries. In SIGMOD.
[11]
Alexander Shkapsky, Mohan Yang, Matteo Interlandi, Hsuan Chiu, Tyson Condie, and Carlo Zaniolo. 2016. Big data analytics with datalog queries on spark. SIGMOD. ACM.
[12]
Fabian M Suchanek, Gjergji Kasneci, and Gerhard Weikum. 2007. Yago: a core of semantic knowledge. In WWW. ACM.
[13]
Youze Tang, Yanchen Shi, and Xiaokui Xiao. 2015. Influence Maximization in Near-Linear Time: A Martingale Approach. In SIGMOD .
[14]
Milo Tova, Moskovitch Yuval, and Youngmann Brit. 2019. Technical Report. (2019).

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019
3373 pages
ISBN:9781450369763
DOI:10.1145/3357384
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 November 2019

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. probabilistic datalog
  2. results explanations

Qualifiers

  • Research-article

Conference

CIKM '19
Sponsor:

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Upcoming Conference

CIKM '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 118
    Total Downloads
  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)1
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media