Article (DOI: 10.1145/1242572.1242717)

Querying and maintaining a compact XML storage

Published: 08 May 2007

Abstract

As XML database sizes grow, the amount of space used for storing the data and auxiliary data structures becomes a major factor in query and update performance. This paper presents a new storage scheme for XML data that supports all navigational operations in near constant time. In addition to supporting efficient queries, the space requirement of the proposed scheme is within a constant factor of the information-theoretic minimum, while insertions and deletions can be performed in near constant time as well. As a result, the proposed structure features a small memory footprint that increases cache locality, whilst still efficiently supporting standard APIs, such as DOM, and necessary database operations, such as queries and updates. Analysis and experiments show that the proposed structure is both space- and time-efficient.
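
The abstract does not describe the encoding itself, so the following is only an illustrative sketch of the general idea, not the paper's scheme. It assumes a generic balanced-parentheses encoding of the element tree: each node contributes one opening and one closing parenthesis, so the structure of an n-node tree takes 2n bits, which is close to the information-theoretic minimum of roughly 2n bits for ordered trees. The function names (`encode`, `first_child`, `next_sibling`, `parent`, `tag_of`) and the nested-tuple input format are hypothetical. Navigation here uses plain linear scans for clarity; a real succinct structure would add small auxiliary directories to bring these operations down to near constant time.

```python
from typing import List, Optional, Tuple

Node = Tuple[str, list]  # (tag, children); a hypothetical input format


def encode(tree: Node) -> Tuple[str, List[str]]:
    """Return (balanced-parentheses string, tags in document order)."""
    bits: List[str] = []
    tags: List[str] = []

    def visit(node: Node) -> None:
        tag, children = node
        bits.append("(")   # opening parenthesis marks the start of the node
        tags.append(tag)
        for child in children:
            visit(child)
        bits.append(")")   # closing parenthesis marks the end of its subtree

    visit(tree)
    return "".join(bits), tags


def find_close(bp: str, i: int) -> int:
    """Position of the ')' matching the '(' at position i (linear scan)."""
    depth = 0
    for j in range(i, len(bp)):
        depth += 1 if bp[j] == "(" else -1
        if depth == 0:
            return j
    raise ValueError("unbalanced parentheses")


def first_child(bp: str, i: int) -> Optional[int]:
    """A node has a child exactly when the next symbol is another '('."""
    return i + 1 if bp[i + 1] == "(" else None


def next_sibling(bp: str, i: int) -> Optional[int]:
    """The next sibling, if any, starts right after this node's ')'."""
    j = find_close(bp, i) + 1
    return j if j < len(bp) and bp[j] == "(" else None


def parent(bp: str, i: int) -> Optional[int]:
    """Scan left for the nearest unmatched '(' (linear scan)."""
    depth = 0
    for j in range(i - 1, -1, -1):
        if bp[j] == ")":
            depth += 1
        elif depth == 0:
            return j
        else:
            depth -= 1
    return None


def tag_of(bp: str, tags: List[str], i: int) -> str:
    """Tag index = number of '(' strictly before position i (a rank query)."""
    return tags[bp.count("(", 0, i)]


if __name__ == "__main__":
    doc = ("book", [("title", []),
                    ("chapter", [("section", []), ("section", [])])])
    bp, tags = encode(doc)
    chapter = next_sibling(bp, first_child(bp, 0))
    print(bp, tag_of(bp, tags, chapter))
```

A node is identified by the position of its opening parenthesis, so DOM-style calls such as "first child" or "next sibling" reduce to simple position arithmetic on the bit string, while tag names live in a separate array indexed by document order.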

Published In

WWW '07: Proceedings of the 16th international conference on World Wide Web
May 2007
1382 pages
ISBN: 9781595936547
DOI: 10.1145/1242572
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. XML
  2. compact storage
  3. query processing
  4. storage optimization

Qualifiers

  • Article

Conference

WWW'07: 16th International World Wide Web Conference
May 8 - 12, 2007
Banff, Alberta, Canada

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

  • (2022) An Efficient Prefix-Based Labeling Scheme for XML Dynamic Updates Using Hexagonal Pattern. IEEE Access, 10, pp. 57107-57123. DOI: 10.1109/ACCESS.2022.3178438. Online publication date: 2022.
  • (2020) NTSOAP: A Robust Approach for Non-redundancy Tags of SOAP Messages. Artificial Intelligence and Renewables Towards an Energy Transition, pp. 782-789. DOI: 10.1007/978-3-030-63846-7_75. Online publication date: 18-Dec-2020.
  • (2015) QRFXFreeze: Queryable Compressor for RFX. The Scientific World Journal, 2015, pp. 1-8. DOI: 10.1155/2015/864750. Online publication date: 2015.
  • (2014) XXS. ACM Transactions on Information Systems, 32:3, pp. 1-37. DOI: 10.1145/2629554. Online publication date: 8-Jul-2014.
  • (2014) Designing a high-performance mobile cloud web browser. Proceedings of the 23rd International Conference on World Wide Web, pp. 735-736. DOI: 10.1145/2567948.2579365. Online publication date: 7-Apr-2014.
  • (2013) On fly search approach for compact XML. 2013 International Conference on Recent Trends in Information Technology (ICRTIT), pp. 347-351. DOI: 10.1109/ICRTIT.2013.6844228. Online publication date: Jul-2013.
  • (2013) Feasibility and a case study on content optimization services on cloud. Information Systems Frontiers, 15:4, pp. 525-532. DOI: 10.1007/s10796-012-9379-4. Online publication date: 1-Sep-2013.
  • (2013) Data Value Storage for Compressed Semi-structured Data. Database and Expert Systems Applications, pp. 174-188. DOI: 10.1007/978-3-642-40173-2_16. Online publication date: 2013.
  • (2012) The Economics of Content Optimization Services. Proceedings of the 2012 IEEE First International Conference on Services Economics, pp. 9-15. DOI: 10.1109/SE.2012.13. Online publication date: 24-Jun-2012.
  • (2012) Dynamizing succinct tree representations. Proceedings of the 11th international conference on Experimental Algorithms, pp. 224-235. DOI: 10.1007/978-3-642-30850-5_20. Online publication date: 7-Jun-2012.
