column

Semiring-annotated data: queries and provenance

Published: 5 October 2012

Abstract

We present an overview of the literature on querying semiring-annotated data, a notion we introduced five years ago in a paper with Val Tannen. First, we show that positive relational algebra calculations for various forms of annotated relations, as well as provenance models for such queries, are particular cases of the same general algorithm involving commutative semirings. For this reason, we present a formal framework for answering queries on data with annotations from commutative semirings, and propose a comprehensive provenance representation based on semirings of polynomials. We extend these considerations to XQuery views over annotated, unordered XML data, and show that the semiring framework suffices for a large positive fragment of XQuery applied to such data. Finally, we conclude with a brief overview of the large body of work that builds upon these results, including both extensions to the theoretical foundations and uses in practical applications.
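The central idea of the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes relations are stored as dictionaries from tuples to annotations, with union and projection combining annotations by semiring addition and join combining them by semiring multiplication. All names here (`Semiring`, `union`, `project`, `join`, `query`, `evaluate`, and the variables `x1`, `x2`, `y1`) are hypothetical, chosen for illustration only. Polynomials are represented as dictionaries from monomials (sorted tuples of variable names) to natural-number coefficients.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Semiring:
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]

# Set semantics (Booleans) and bag semantics (natural numbers).
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

# Provenance polynomials: {monomial: coefficient}, monomial = sorted tuple of variables.
def poly_var(x):
    return {(x,): 1}

def poly_add(p, q):
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + c
    return out

def poly_mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(sorted(m1 + m2))
            out[m] = out.get(m, 0) + c1 * c2
    return out

POLY = Semiring({}, {(): 1}, poly_add, poly_mul)

# Positive relational algebra on annotated relations (dict: tuple -> annotation).
def union(K, r, s):
    out = dict(r)
    for t, a in s.items():
        out[t] = K.add(out.get(t, K.zero), a)
    return out

def project(K, r, cols):
    out = {}
    for t, a in r.items():
        key = tuple(t[i] for i in cols)
        out[key] = K.add(out.get(key, K.zero), a)
    return out

def join(K, r, s, r_col, s_col):
    # Equi-join on one column of each input; annotations are multiplied.
    return {t + u: K.mul(a, b)
            for t, a in r.items() for u, b in s.items()
            if t[r_col] == u[s_col]}

def query(K, R, S):
    # Hypothetical example query: first column of (R join S), unioned with
    # the first column of R.
    return union(K, project(K, join(K, R, S, 1, 0), [0]), project(K, R, [0]))

def evaluate(K, p, valuation):
    # Interpret a polynomial in semiring K by substituting values for variables.
    total = K.zero
    for monomial, coeff in p.items():
        term = K.one
        for x in monomial:
            term = K.mul(term, valuation[x])
        for _ in range(coeff):
            total = K.add(total, term)
    return total

# Tag each input tuple with its own variable and run the query once.
R_poly = {(1, 2): poly_var("x1"), (1, 3): poly_var("x2")}
S_poly = {(2,): poly_var("y1")}
prov = query(POLY, R_poly, S_poly)[(1,)]  # the polynomial x1*y1 + x1 + x2

# Specialise to bag semantics by assigning multiplicities to the variables.
nat_valuation = {"x1": 2, "x2": 1, "y1": 3}
R_nat = {t: evaluate(NAT, p, nat_valuation) for t, p in R_poly.items()}
S_nat = {t: evaluate(NAT, p, nat_valuation) for t, p in S_poly.items()}
```

In this sketch, evaluating the polynomial `prov` under `nat_valuation` gives the same multiplicity as running `query` directly over the bag-annotated inputs `R_nat` and `S_nat`. This is the sense in which a polynomial annotation can stand in for the other annotation semantics.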



Published in

ACM SIGMOD Record, Volume 41, Issue 3
September 2012
49 pages
ISSN: 0163-5808
DOI: 10.1145/2380776

Copyright © 2012 Authors

Publisher

Association for Computing Machinery
New York, NY, United States
