DOI: 10.1145/3034786.3056107
research-article

Counting and Enumerating (Preferred) Database Repairs

Published: 09 May 2017

ABSTRACT

In the traditional sense, a subset repair of an inconsistent database refers to a consistent subset of facts (tuples) that is maximal under set containment. Preferences between pairs of facts make it possible to distinguish a set of preferred repairs based on the relative reliability (source credibility, extraction quality, recency, etc.) of data items. Previous studies explored the problem of categoricity, where one aims to determine whether preferences suffice to repair the database unambiguously, or in other words, whether there is precisely one preferred repair. In this paper we study the ability to quantify ambiguity, by investigating two classes of problems. The first is that of counting the number of subset repairs, both preferred (under various common semantics) and traditional. We establish dichotomies in data complexity for the entire space of (sets of) functional dependencies. The second class of problems is that of enumerating (i.e., generating) the preferred repairs. We devise enumeration algorithms with efficiency guarantees on the delay between generated repairs, even for constraints represented as general conflict graphs or hypergraphs.
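
To make the notion of a subset repair concrete, the following is a minimal brute-force sketch and not the algorithm of the paper. It assumes a hypothetical toy relation of (person, city) facts under the single functional dependency person → city; the names `FACTS`, `conflicts`, `is_consistent`, and `subset_repairs` are illustrative only. It enumerates every consistent subset and keeps those that are maximal under set containment, so it runs in exponential time and is useful only for tiny inputs.

```python
from itertools import combinations

# Hypothetical toy relation: (person, city) facts with the FD person -> city.
FACTS = [
    ("alice", "Paris"),
    ("alice", "Rome"),
    ("bob", "Oslo"),
    ("bob", "Lima"),
    ("bob", "Cairo"),
    ("carol", "Kyoto"),
]


def conflicts(f, g):
    """Two facts conflict if they agree on person but differ on city."""
    return f[0] == g[0] and f[1] != g[1]


def is_consistent(subset):
    """A set of facts is consistent if no two of its facts conflict."""
    return not any(conflicts(f, g) for f, g in combinations(subset, 2))


def subset_repairs(facts):
    """Brute force: every consistent subset that is maximal under set containment."""
    facts = list(facts)
    consistent = [
        frozenset(s)
        for r in range(len(facts) + 1)
        for s in combinations(facts, r)
        if is_consistent(s)
    ]
    # Keep only subsets not strictly contained in another consistent subset.
    return [s for s in consistent if not any(s < t for t in consistent)]


repairs = subset_repairs(FACTS)
print(len(repairs))  # 6
```

In this toy instance each repair keeps exactly one fact per person, so the count is the product of the group sizes, 2 × 3 × 1 = 6. Preferred repairs, which the abstract also covers, would additionally need a priority relation between conflicting facts and are not modeled here.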


Published in

PODS '17: Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
May 2017
458 pages
ISBN: 9781450341981
DOI: 10.1145/3034786

Copyright © 2017 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

PODS '17 paper acceptance rate: 29 of 101 submissions, 29%
Overall acceptance rate: 642 of 2,707 submissions, 24%
