research-article

Query Games in Databases

Published: 17 June 2021

Abstract

Database tuples can be seen as players in the game of jointly realizing the answer to a query. Some tuples may contribute more than others to the outcome, which can be a binary value in the case of a Boolean query, a number for a numerical aggregate query, and so on. To quantify the contributions of tuples, we use the Shapley value that was introduced in cooperative game theory and has found applications in a plethora of domains. Specifically, the Shapley value of an individual tuple quantifies its contribution to the query. We investigate the applicability of the Shapley value in this setting, as well as the computational aspects of its calculation in terms of complexity, algorithms, and approximation.
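
To make the notion concrete, the following is a minimal sketch of how the Shapley value of a tuple could be computed for a Boolean query by brute-force enumeration of coalitions. It is an illustration under stated assumptions, not the algorithm studied in the paper: every tuple is treated as a player, the toy relations `R` and `S`, the query `q`, and the helper `shapley_value` are all hypothetical names, and the enumeration takes exponential time, so it is only usable on very small databases.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley_value(facts, query, target):
    """Exact Shapley value of `target`, enumerating every coalition of the other facts.

    Assumption: all facts are players, and `query` maps a set of facts to a number.
    """
    others = [f for f in facts if f != target]
    n = len(facts)
    total = Fraction(0)
    for k in range(len(others) + 1):
        # Standard Shapley weight of a coalition of size k: k! (n - k - 1)! / n!
        weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
        for coalition in combinations(others, k):
            without = frozenset(coalition)
            with_target = without | {target}
            # Marginal contribution of `target` to this coalition.
            total += weight * (query(with_target) - query(without))
    return total


def q(db):
    """Hypothetical Boolean query: is there an x with R(x) and S(x, y) for some y?"""
    r_values = {f[1] for f in db if f[0] == "R"}
    return int(any(f[0] == "S" and f[1] in r_values for f in db))


# Toy database: one R-fact and two S-facts that join with it.
facts = [("R", 1), ("S", 1, 2), ("S", 1, 3)]
values = {f: shapley_value(facts, q, f) for f in facts}

for fact, value in values.items():
    print(fact, value)
```

In this toy instance the fact `R(1)` receives 2/3 and each `S` fact receives 1/6: the query cannot hold without `R(1)`, whereas either `S` fact can stand in for the other. The three values sum to 1, the value of the query on the full database.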



Published In

ACM SIGMOD Record, Volume 50, Issue 1
March 2021
90 pages
ISSN: 0163-5808
DOI: 10.1145/3471485
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


