Related Axis: The Extension to XPath Towards Effective XML Search

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

We investigate the limitations of existing XML search methods and propose a new semantics, the related relationship, which captures meaningful relationships between data elements in XML data when no structural constraints are given. We then extend XPath with a new axis, the related axis, which specifies the related relationship between query nodes and thereby makes XPath more flexible. To reduce the cost of computing the related relationship, we propose a new schema summary that summarizes the related relationship from the original schema without any loss. Based on this schema summary, we introduce two indices to improve query processing performance. With our algorithm, the evaluation of most queries can be equivalently transformed into just a few selection and value join operations, thus avoiding costly structural join operations. Experimental results show that our method is both effective and efficient: we compare the effectiveness of the related relationship with existing keyword search semantics, and the efficiency of our evaluation methods with existing query engines.
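The abstract does not define the related relationship or describe the two indices, so the following is only a generic sketch of the contrast it draws between structural joins and value joins, not the paper's algorithm. It assumes a toy document with region-encoded nodes and a hypothetical precomputed `group` key (here, the start position of the enclosing `book` element) standing in for whatever an index built from the schema summary might store. All names (`Node`, `group`, `related_by_value_join`) are illustrative.

```python
from collections import defaultdict
from typing import NamedTuple


class Node(NamedTuple):
    tag: str
    value: str
    start: int  # region encoding: position where the element opens
    end: int    # region encoding: position where the element closes
    group: int  # hypothetical precomputed key: start of the enclosing <book>


# Toy document: two <book> elements, each with one title and one author.
BOOKS = [Node("book", "", 1, 6, 1), Node("book", "", 7, 12, 7)]
TITLES = [Node("title", "XML", 2, 3, 1), Node("title", "SQL", 8, 9, 7)]
AUTHORS = [Node("author", "Ann", 4, 5, 1), Node("author", "Bob", 10, 11, 7)]


def structural_join(ancestors, descendants):
    """Naive containment join on region encodings."""
    return [(a, d) for a in ancestors for d in descendants
            if a.start < d.start and d.end < a.end]


def related_by_structure(books, titles, authors):
    """Pair titles and authors under the same <book> using structural joins."""
    pairs = []
    for book in books:
        ts = [d for _, d in structural_join([book], titles)]
        as_ = [d for _, d in structural_join([book], authors)]
        pairs += [(t.value, a.value) for t in ts for a in as_]
    return sorted(pairs)


def related_by_value_join(titles, authors):
    """Produce the same pairs with one hash join on the precomputed key."""
    by_group = defaultdict(list)
    for t in titles:
        by_group[t.group].append(t)
    return sorted((t.value, a.value)
                  for a in authors for t in by_group[a.group])
```

In this sketch the value join never inspects `start` or `end`; the containment test is paid once, when the `group` key is computed, rather than at query time. Whether the paper's indices work this way is an assumption.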

Author information

Corresponding author

Correspondence to Jun-Feng Zhou.

Additional information

This research was partially supported by the National Science and Technology Major Project of China under Grant No. 2010ZX01042-002-003, the National Natural Science Foundation of China under Grant Nos. 61073060, 61040023, 61070055, 91024032, the Fundamental Research Funds for the Central Universities of China, and the Research Funds of Renmin University of China under Grant No. 10XNI018.

About this article

Cite this article

Zhou, JF., Ling, T.W., Bao, ZF. et al. Related Axis: The Extension to XPath Towards Effective XML Search. J. Comput. Sci. Technol. 27, 195–212 (2012). https://doi.org/10.1007/s11390-012-1217-0

  • DOI: https://doi.org/10.1007/s11390-012-1217-0
