Abstract
We present an overview of the literature on querying semiring-annotated data, a notion we introduced five years ago in a paper with Val Tannen. First, we show that positive relational algebra calculations for various forms of annotated relations, as well as provenance models for such queries, are particular cases of the same general algorithm involving commutative semirings. Motivated by this observation, we present a formal framework for answering queries on data with annotations from commutative semirings, and propose a comprehensive provenance representation based on semirings of polynomials. We then extend these considerations to XQuery views over annotated, unordered XML data, and show that the semiring framework suffices for a large positive fragment of XQuery applied to such data. We conclude with a brief overview of the large body of work that builds upon these results, including both extensions to the theoretical foundations and uses in practical applications.
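To make the central claim concrete, the following is a minimal Python sketch of positive relational algebra evaluated over relations whose tuples carry annotations from a commutative semiring. It is an illustration under stated assumptions, not the paper's formal definitions: the names `Semiring`, `union`, `project`, `join`, and `two_step`, the dictionary encoding of relations, and the encoding of polynomials as maps from monomials to coefficients are all hypothetical choices made here. The same query code is run with three semirings (Booleans for set semantics, natural numbers for bag semantics, and polynomials for provenance), and only the semiring argument changes.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class Semiring:
    """A commutative semiring (K, plus, times, zero, one). Hypothetical helper type."""
    zero: Any
    one: Any
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]


# Set semantics: a tuple is either present (True) or absent (False).
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
# Bag semantics: the annotation is the multiplicity of the tuple.
NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)


# Polynomials with natural-number coefficients, encoded (as an assumption of
# this sketch) as {monomial: coefficient}, a monomial being a sorted tuple of
# variable names.
def poly_plus(p, q):
    out = dict(p)
    for mono, coeff in q.items():
        out[mono] = out.get(mono, 0) + coeff
    return out


def poly_times(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(sorted(m1 + m2))
            out[mono] = out.get(mono, 0) + c1 * c2
    return out


POLY = Semiring({}, {(): 1}, poly_plus, poly_times)

# A K-relation maps each tuple to its annotation; absent tuples are annotated zero.
Relation = Dict[Tuple, Any]


def union(K: Semiring, r: Relation, s: Relation) -> Relation:
    """Union adds the annotations of matching tuples."""
    out = dict(r)
    for t, a in s.items():
        out[t] = K.plus(out.get(t, K.zero), a)
    return out


def project(K: Semiring, r: Relation, cols: Tuple[int, ...]) -> Relation:
    """Projection adds the annotations of tuples that collapse together."""
    out: Relation = {}
    for t, a in r.items():
        key = tuple(t[i] for i in cols)
        out[key] = K.plus(out.get(key, K.zero), a)
    return out


def join(K: Semiring, r: Relation, s: Relation, r_col: int, s_col: int) -> Relation:
    """Equi-join multiplies the annotations of the joined tuples."""
    out: Relation = {}
    for t, a in r.items():
        for u, b in s.items():
            if t[r_col] == u[s_col]:
                key = t + u
                out[key] = K.plus(out.get(key, K.zero), K.times(a, b))
    return out


def two_step(K: Semiring, r: Relation) -> Relation:
    """Pairs (x, z) such that r contains (x, y) and (y, z) for some y."""
    return project(K, join(K, r, r, 1, 0), (0, 3))


edges = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c")]
as_set = {e: True for e in edges}
as_bag = dict(zip(edges, [2, 3, 1, 5]))
as_prov = dict(zip(edges, [{("x",): 1}, {("y",): 1}, {("z",): 1}, {("w",): 1}]))

set_result = two_step(BOOL, as_set)
bag_result = two_step(NAT, as_bag)
prov_result = two_step(POLY, as_prov)
```

In this sketch, `prov_result` annotates the output tuple `("a", "c")` with the polynomial xy + wz, which records that the tuple can be derived in two ways, each using two input tuples jointly. Replacing the variables by multiplicities 2, 3, 1, 5 and evaluating in the natural numbers gives 2·3 + 5·1 = 11, the annotation that the bag computation produces directly.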