Abstract
Matching Dependencies (MDs) are a recent proposal for declarative entity resolution. They are rules that specify, on the basis of similarities satisfied by values in a database, which values should be considered duplicates and therefore have to be matched. A chase-like procedure for MD enforcement yields clean (duplicate-free) instances, called resolved instances, of which there may be several. The resolved answers to a query are those that are invariant across the class of resolved instances. Previous work identified classes of queries and sets of MDs for which resolved query answering is tractable, with special emphasis on cyclic sets of MDs. In this work we further investigate the complexity of this problem, identifying intractable cases and exploring the frontier between tractability and intractability. We concentrate mostly on acyclic sets of MDs. For a special case we obtain a dichotomy result relative to NP-hardness.
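The notions of MD enforcement, resolved instances, and resolved answers can be illustrated with a small Python sketch. It is a simplification under stated assumptions and does not reproduce the formal semantics of the paper: the relation `PEOPLE`, the similarity test `similar`, and the single rule "tuples with similar names must agree on phone" are hypothetical, and the common value for each group of matched tuples is picked only from values already present in that group. Because the rule changes only an attribute that is not used in the similarity test, one enforcement pass suffices here.

```python
from itertools import product

# Hypothetical toy relation of (name, phone) tuples.
PEOPLE = [
    ("John Doe", "555-0101"),
    ("J. Doe", "555-0199"),
    ("Alice Ray", "555-0300"),
]

def similar(a: str, b: str) -> bool:
    """Toy similarity (an assumption): same last token, ignoring case."""
    return a.split()[-1].lower() == b.split()[-1].lower()

def similarity_groups(rows):
    """Group row indices connected by name similarity (union-find)."""
    parent = list(range(len(rows)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            if similar(rows[i][0], rows[j][0]):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(rows)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

def resolved_instances(rows):
    """Enumerate instances in which every group shares one phone value.

    Simplification: the common value is chosen among the phones that
    already occur in the group.
    """
    groups = similarity_groups(rows)
    choices = [sorted({rows[i][1] for i in g}) for g in groups]
    instances = []
    for picked in product(*choices):
        new_rows = list(rows)
        for group, phone in zip(groups, picked):
            for i in group:
                new_rows[i] = (rows[i][0], phone)
        instances.append(new_rows)
    return instances

def resolved_answers(rows, query):
    """Return the answers that the query yields in every enumerated instance."""
    answer_sets = [set(query(inst)) for inst in resolved_instances(rows)]
    return set.intersection(*answer_sets)

if __name__ == "__main__":
    phones = resolved_answers(PEOPLE, lambda inst: [p for _, p in inst])
    print(phones)
```

In this toy setting the two "Doe" tuples can be resolved in two ways, so a query asking for phone numbers keeps only the number that appears no matter which resolution is chosen. Enumerating all instances is exponential in the number of groups, which is why the tractability of resolved query answering is a question at all.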
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Bertossi, L., Gardezi, J. (2014). Tractable vs. Intractable Cases of Query Answering under Matching Dependencies. In: Straccia, U., Calì, A. (eds) Scalable Uncertainty Management. SUM 2014. Lecture Notes in Computer Science, vol 8720. Springer, Cham. https://doi.org/10.1007/978-3-319-11508-5_5
DOI: https://doi.org/10.1007/978-3-319-11508-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11507-8
Online ISBN: 978-3-319-11508-5
eBook Packages: Computer Science (R0)