Abstract
Database tuples can be seen as players in the game of jointly realizing the answer to a query. Some tuples may contribute more than others to the outcome, which can be a binary value in the case of a Boolean query, a number for a numerical aggregate query, and so on. To quantify the contributions of tuples, we use the Shapley value that was introduced in cooperative game theory and has found applications in a plethora of domains. Specifically, the Shapley value of an individual tuple quantifies its contribution to the query. We investigate the applicability of the Shapley value in this setting, as well as the computational aspects of its calculation in terms of complexity, algorithms, and approximation.
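To make the notion of a tuple's contribution concrete, here is a minimal sketch of an exact Shapley computation over a tiny database. It is an illustration under stated assumptions, not the algorithm studied in the paper: the function `shapley_values`, the tuple encoding, and the example query `q` are all hypothetical names introduced here, and every tuple is treated as a player. The value of a set of tuples is taken to be the result of the Boolean query on that set (1 if true, 0 if false), and a tuple's Shapley value is its average marginal contribution over all orderings of the tuples.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(tuples, query):
    """Exact Shapley value of each tuple, by enumerating all orderings.

    `query` maps a frozenset of tuples to a number (0/1 for a Boolean query).
    The running time is factorial in the number of tuples, so this is only
    usable on toy inputs.
    """
    players = list(tuples)
    totals = {t: Fraction(0) for t in players}
    count = 0
    for order in permutations(players):
        count += 1
        present = set()
        before = query(frozenset(present))
        for t in order:
            present.add(t)
            after = query(frozenset(present))
            totals[t] += after - before  # marginal contribution of t
            before = after
    return {t: total / count for t, total in totals.items()}


def q(present):
    # Hypothetical Boolean query q() :- R(x), S(x, y):
    # true iff some x appears in both an R tuple and an S tuple.
    r = {t[1] for t in present if t[0] == "R"}
    s = {t[1] for t in present if t[0] == "S"}
    return int(bool(r & s))


# Hypothetical database: one R tuple and two S tuples that join with it.
db = [("R", 1), ("S", 1, "a"), ("S", 1, "b")]
values = shapley_values(db, q)
for t, v in values.items():
    print(t, v)
```

In this toy database the R tuple is needed for every derivation of the answer, while either S tuple can stand in for the other, so the R tuple receives the larger share (2/3 versus 1/6 each). The shares sum to the query's value on the full database, which is 1. The factorial cost of the enumeration is one way to see why the complexity and approximation questions raised in the abstract matter.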
SIGMOD Record, March 2021 (Vol. 50, No. 1)
Index Terms
- Query Games in Databases