On the Difficulty of Finding Optimal Relational Decompositions for XML Workloads: A Complexity Theoretic Perspective

  • Conference paper
Database Theory — ICDT 2003 (ICDT 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2572)

Abstract

A key problem that arises in the context of storing XML documents in relational databases is that of finding an optimal relational decomposition for a given set of XML documents and a given set of XML queries over those documents. While there have been a number of ad hoc solutions proposed for this problem, to our knowledge this paper represents a first step toward formalizing the problem and studying its complexity. It turns out that to even define what one means by an optimal decomposition, one first needs to specify an algorithm to translate XML queries to relational queries, and a cost model to evaluate the quality of the resulting relational queries. By examining an interesting problem embedded in choosing a relational decomposition, we show that choices of different translation algorithms and cost models result in very different complexities for the resulting optimization problems. Our results suggest that, contrary to the trend in previous work, the eventual development of practical algorithms for finding relational decompositions for XML workloads will require judicious choices of cost models and translation algorithms, rather than an exclusive focus on the decomposition problem in isolation.

Research supported in part by NSF grants CSA-9623632, ITR-0086002, CCR-9634665, and CCR-0208013.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Krishnamurthy, R., Chakaravarthy, V.T., Naughton, J.F. (2003). On the Difficulty of Finding Optimal Relational Decompositions for XML Workloads: A Complexity Theoretic Perspective. In: Calvanese, D., Lenzerini, M., Motwani, R. (eds) Database Theory — ICDT 2003. ICDT 2003. Lecture Notes in Computer Science, vol 2572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36285-1_18

  • DOI: https://doi.org/10.1007/3-540-36285-1_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00323-6

  • Online ISBN: 978-3-540-36285-2

  • eBook Packages: Springer Book Archive
