Improving XML Processing Using Adapted Data Structures

  • Conference paper
Web, Web-Services, and Database Systems (NODe 2002)

Abstract

From its origins in document processing, XML has developed into a medium for communicating all kinds of data between applications. More recently, interest has focused on the concept of native XML databases. This paradigm requires that database queries can be resolved by direct searching of XML data structures. Relational databases can be compressed without the loss of direct addressability. A similar approach can be applied to XML data structures. Compression in the relational paradigm is associated with improved performance. We review this approach and show results from the implementation of a prototype compressed DOM. Our research indicates that it is possible to optimise queries over compact XML structures by choosing appropriate physical representations.
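The idea of compressing data without losing direct addressability can be illustrated with a small sketch. The code below is not the authors' prototype compressed DOM; it is a minimal illustration that assumes a simple dictionary (token) encoding, in which every distinct element name and text value is replaced by a small integer and the tree is held in parallel arrays indexed by node number. The names `Dictionary`, `CompactDoc`, and `select` are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET


class Dictionary:
    """Maps each distinct string to a small integer token and back."""

    def __init__(self):
        self.token_of = {}   # string -> token
        self.strings = []    # token -> string

    def encode(self, s):
        if s not in self.token_of:
            self.token_of[s] = len(self.strings)
            self.strings.append(s)
        return self.token_of[s]

    def decode(self, token):
        return self.strings[token]


class CompactDoc:
    """Hypothetical compact tree: one slot per node in three parallel arrays."""

    def __init__(self, xml_text):
        self.names = Dictionary()    # element names
        self.values = Dictionary()   # text values
        self.tag = []      # name token of each node
        self.parent = []   # index of the parent node, -1 for the root
        self.text = []     # value token of each node, -1 if it has no text
        self._add(ET.fromstring(xml_text), -1)

    def _add(self, elem, parent_idx):
        # Nodes are numbered in document (pre-order) sequence.
        idx = len(self.tag)
        self.tag.append(self.names.encode(elem.tag))
        self.parent.append(parent_idx)
        content = (elem.text or "").strip()
        self.text.append(self.values.encode(content) if content else -1)
        for child in elem:
            self._add(child, idx)

    def select(self, tag_name, text_value):
        """Return indices of nodes with the given name and text.

        The query constants are translated to tokens once; the scan then
        compares integers and never decompresses the stored data.
        """
        tag_tok = self.names.token_of.get(tag_name)
        val_tok = self.values.token_of.get(text_value)
        if tag_tok is None or val_tok is None:
            return []  # a string absent from the dictionary cannot match
        return [i for i in range(len(self.tag))
                if self.tag[i] == tag_tok and self.text[i] == val_tok]


doc = CompactDoc(
    "<lib><book><title>A</title><lang>en</lang></book>"
    "<book><title>B</title><lang>en</lang></book></lib>"
)
hits = doc.select("lang", "en")
# Each hit is a node index, so its parent is reachable by direct lookup.
print(hits, [doc.names.decode(doc.tag[doc.parent[i]]) for i in hits])
```

In this sketch a repeated value such as `en` is stored once in the dictionary, and any node can still be reached directly by its index, which is the property the abstract describes for compressed relational data.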

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Neumüller, M., Wilson, J.N. (2003). Improving XML Processing Using Adapted Data Structures. In: Chaudhri, A.B., Jeckle, M., Rahm, E., Unland, R. (eds) Web, Web-Services, and Database Systems. NODe 2002. Lecture Notes in Computer Science, vol 2593. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36560-5_16

  • DOI: https://doi.org/10.1007/3-540-36560-5_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00745-6

  • Online ISBN: 978-3-540-36560-0

  • eBook Packages: Springer Book Archive
