Abstract
In this paper, we demonstrate how semantic information in XML data, such as values, properties, object classes, and relationships between object classes, impacts XML query processing. We show that failing to use these semantics causes problems in value management and content search in existing approaches. Motivated by these problems, we propose a semantic approach to XML twig pattern query processing. In particular, we design the TwigTable algorithm to incorporate property and value information into query processing. This information can be correctly discovered in any XML data. In addition, we propose three object-based optimization techniques for TwigTable. If more semantics of object classes are known in an XML document, we can process queries more efficiently with these semantic optimizations. Finally, we demonstrate the benefits of our approach through a comprehensive experimental study.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Wu, H., Ling, T.W., Chen, B., Xu, L. (2011). TwigTable: Using Semantics in XML Twig Pattern Query Processing. In: Spaccapietra, S. (eds) Journal on Data Semantics XV. Lecture Notes in Computer Science, vol 6720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22630-4_4
DOI: https://doi.org/10.1007/978-3-642-22630-4_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22629-8
Online ISBN: 978-3-642-22630-4
eBook Packages: Computer Science (R0)