Skip to main content

A Resource Efficient Hybrid Data Structure for Twig Queries

  • Conference paper
Database and XML Technologies (XSym 2006)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4156))

Included in the following conference series:

Abstract

Designing data structures for use in mobile devices requires attention on optimising data volumes with associated benefits for data transmission, storage space and battery use. For semistructured data, tree summarisation techniques can be used to reduce the volume of structured elements while dictionary compression can efficiently deal with value-based predicates. This paper introduces an integration of the two approaches using numbering schemes to connect the separate elements, the key strength of this hybrid technique is that both structural and value predicates can be resolved in one graph, while further allowing for compression of the resulting data structure. Performance measures that show advantages of using this hybrid structure are presented, together with an analysis of query resolution using a number of different index granularities. As the current trend is towards the requirement for working with larger semi-structured data sets this work allows for the utilisation of these data sets whilst reducing both the bandwidth and storage space necessary.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Amato, G., Debole, F., Zezula, P., Rabitti, F.: YAPI: Yet another path index for XML searching. In: Proc of ECDL, pp. 176–187 (2003)

    Google Scholar 

  2. Arion, A., Bonifati, A., Costa, G., D’Aguanno, S., Manolescu, I., Pugliese, A.: Xquec: Pushing queries to compressed XML data. In: VLDB, pp. 1065–1068 (2003)

    Google Scholar 

  3. Bruno, N., Koudas, N., Srivastava, D.: Holistic twig joins: Optimal XML pattern matching. In: Proc. of SIGMOD, pp. 310–321 (2002)

    Google Scholar 

  4. Buneman, P., Grohe, M., Koch, C.: Path queries on compressed XML. In: Proc. of VLDB, pp. 141–152 (2003)

    Google Scholar 

  5. Buneman, P., Choi, B., Fan, W., Hutchison, R., Mann, R., Viglas, S.D.: Vectorizing and querying large XML repositories. In: ICDE 2005: Proceedings of the 21st International Conference on Data Engineering, pp. 261–272. IEEE Computer Society, Los Alamitos (2005)

    Google Scholar 

  6. Buneman, P., Davidson, S.B., Fernandez, M.F., Suciu, D.: Adding structure to unstructured data. In: Afrati, F.N., Kolaitis, P.G. (eds.) ICDT 1997. LNCS, vol. 1186, pp. 336–350. Springer, Heidelberg (1996)

    Google Scholar 

  7. Chen, Q., Lim, A., Ong, K.W.: D(k)-index: an adaptive structural summary for graph-structured data. In: SIGMOD 2003: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pp. 134–144. ACM Press, New York (2003)

    Chapter  Google Scholar 

  8. Cheney, J.: Compressing XML with multiplexed hierarchical ppm models. In: Data Compression Conference, p. 163 (2001)

    Google Scholar 

  9. Cheney, J.: Tradeoffs in XML database compression. In: DCC, pp. 392–401 (2006)

    Google Scholar 

  10. Cheng, J., Ng, W.: Xqzip: Querying compressed XML using structural indexing. In: Bertino, E., Christodoulakis, S., Plexousakis, D., Christophides, V., Koubarakis, M., Böhm, K., Ferrari, E. (eds.) EDBT 2004. LNCS, vol. 2992, pp. 219–236. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  11. Chien, S.Y., Vagena, Z., et al.: Efficient structural joins on indexed XML documents. In: Proc. of VLDB, pp. 263–274 (2002)

    Google Scholar 

  12. Choi, B., Buneman, P.: XML vectorization: A column-based XML storage model. Technical Report MS-CIS-03-17, University of Pennsylvania (2003)

    Google Scholar 

  13. Dietz, P.F.: Maintaining order in a linked list. In: Proc. of STOCS, pp. 122–127 (1982)

    Google Scholar 

  14. Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Compressing and searching XML data via two zips. In: Proc. World Wide Web Conference 2006 (WWW 2006) (2006)

    Google Scholar 

  15. Goldman, R., Widom, J.: DataGuides: Enabling query formulation and optimization in semistructured databases. In: Proc. of VLDB, pp. 436–445 (1997)

    Google Scholar 

  16. Grust, T.: Accelerating XPath location steps. In: Proc. of SIGMOD, pp. 109–120 (2002)

    Google Scholar 

  17. Halverson, A., Burger, J., et al.: Mixed mode XML query processing. In: Proc of VLDB, pp. 225–236 (2003)

    Google Scholar 

  18. Kaushik, R., Bohannon, P., et al.: Covering indexes for branching path queries. In: Proc. of SIGMOD, pp. 133–144 (2002)

    Google Scholar 

  19. Kaushik, R., Krishnamurthy, R., et al.: On the integration of structure indexes and inverted lists. In: Proc. of SIGMOD, pp. 779–790 (2004)

    Google Scholar 

  20. Kaushik, R., Shenoy, P., et al.: Exploiting local similarity for indexing paths in graph-structured data. In: Proc. of ICDE, pp. 129–140 (2002)

    Google Scholar 

  21. Kha, D.D., Yoshikawa, M., Uemura, S.: A structural numbering scheme for XML data. In: Proc. of EDBT, pp. 279–289 (2001)

    Google Scholar 

  22. Lee, J., Lee, Y., Kang, S., Jin, H., Lee, S., Kim, B., Song, J.: Bmq-index: Shared and incremental processing of border monitoring queries over data streams. In: The 7th International Conference on Mobile Data Management (MDM 2006) (2006)

    Google Scholar 

  23. Li, Q., Moon, B.: Indexing and querying XML data for regular path expressions. In: Proc. of VLDB, pp. 361–370 (2001)

    Google Scholar 

  24. Liefke, H., Suciu, D.: XMill: An efficient compressor for XML data. In: Proc. of SIGMOD, pp. 153–164 (2000)

    Google Scholar 

  25. McHugh, J., Widom, J.: Query optimization for XML. In: Proc. of VLDB, pp. 315–326 (1999)

    Google Scholar 

  26. Min, J.-K., Park, M.-J., Chung, C.-W.: Xpress: A queriable compression for xml data. In: SIGMOD Conference, pp. 122–133 (2003)

    Google Scholar 

  27. Naughton, J.F., DeWitt, D.J., et al.: The Niagara internet query system. IEEE Data Eng. Bull. 24(2), 27–33 (2001)

    Google Scholar 

  28. Neumüller, M., Wilson, J.N.: Improving XML processing using adapted data structures. In: Proc. of WEBDB, pp. 63–77 (2002)

    Google Scholar 

  29. Schmidt, A.R., Waas, F., et al.: XMark: A benchmark for XML data management. In: Proc. of VLDB, pp. 27–32 (2002)

    Google Scholar 

  30. Tolani, P.M., Haritsa, J.R.: Xgrind: A query-friendly XML compressor. In: ICDE, pp. 225–234 (2002)

    Google Scholar 

  31. Wang, W., Jiang, H., Wang, H., Lin, X., Lu, H., Li, J.: Efficient processing of XML path queries using the disk-based F&B Index. In: Proc. of VLDB, pp. 145–156 (2005)

    Google Scholar 

  32. Yan, X., Yu, P.S., Han, J.: Graph indexing: a frequent structure-based approach. In: SIGMOD 2004: Proceedings of the 2004 ACM SIGMOD international conference on Management of data, pp. 335–346. ACM Press, New York (2004)

    Chapter  Google Scholar 

  33. Zezula, P., Amato, G., et al.: Tree Signatures for XML Querying and Navigation. In: Bellahsène, Z., Chaudhri, A.B., Rahm, E., Rys, M., Unland, R. (eds.) XSym 2003. LNCS, vol. 2824, pp. 149–163. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  34. Zhang, C., Naughton, J.F., et al.: On supporting containment queries in relational database management systems. In: Proc. of SIGMOD, pp. 425–436 (2001)

    Google Scholar 

  35. Zhang, N., Ozsu, M.T., Ilyas, I.F., Aboulnaga, A., Cheriton, D.R.: Fix: Feature-based indexing technique for XML documents. Technical Report CS-2006-07, School of Computer Science, University of Waterloo (2006)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wilson, J., Gourlay, R., Japp, R., Neumüller, M. (2006). A Resource Efficient Hybrid Data Structure for Twig Queries. In: Amer-Yahia, S., Bellahsène, Z., Hunt, E., Unland, R., Yu, J.X. (eds) Database and XML Technologies. XSym 2006. Lecture Notes in Computer Science, vol 4156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11841920_6

Download citation

  • DOI: https://doi.org/10.1007/11841920_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-38877-7

  • Online ISBN: 978-3-540-38879-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics