Exploiting Semantic Tags in XML Retrieval

Wang, Qiuyue; Li, Qiushi; Wang, Shan; Du, Xiaoyong

doi:10.1007/978-3-642-14556-8_15

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6203)

Abstract

With the new semantically annotated Wikipedia XML corpus, we investigate two research questions. First, do the structural constraints in CAS (content-and-structure) queries help retrieval over an XML document collection containing semantically rich tags? Second, how can semantic tag information be exploited to improve CO (content-only) queries, given that most users prefer to express queries in the simplest form? In this paper, we describe and analyze our comparison of CO and CAS queries over the document collection of the INEX 2009 ad hoc track, and we propose a method to improve the effectiveness of CO queries by enriching element content representations with semantic tags. Our results show that enriching XML element representations with semantic tags is effective in improving early precision, while in terms of average precision, strict interpretation of CAS queries is generally superior.
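The abstract does not spell out how element representations are enriched. As a rough illustration only, the sketch below assumes the simplest possible reading: the tag names of an element and its descendants are added as extra terms to the element's bag-of-words representation. The function `enriched_terms`, the parameter `tag_weight`, and the sample document are all hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def enriched_terms(element, tag_weight=1):
    """Bag of words for `element`: its text terms plus semantic tag names.

    Assumption (not from the paper): each tag name inside the element,
    including the element's own tag, counts as `tag_weight` extra terms.
    """
    terms = Counter()
    for node in element.iter():  # the element itself, then all descendants
        terms[node.tag.lower()] += tag_weight
        if node.text:
            terms.update(node.text.lower().split())
        # A node's tail text belongs to its parent, so the tail of the
        # root element lies outside the element and is skipped.
        if node is not element and node.tail:
            terms.update(node.tail.lower().split())
    return terms


# Hypothetical document with semantically rich tags.
doc = ET.fromstring(
    "<article><physicist>Albert Einstein</physicist> developed "
    "<theory>relativity</theory></article>"
)
bag = enriched_terms(doc)
```

Under this assumption, a content-only query such as "physicist relativity" can match the element even though the word "physicist" appears only as a tag and never in the text.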

Author information

Authors and Affiliations

School of Information and Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Renmin University of China, Beijing, 100872, P.R. China
Qiuyue Wang, Qiushi Li, Shan Wang & Xiaoyong Du

Editor information

Editors and Affiliations

Faculty of Science and Technology, Queensland University of Technology, GPO Box 2434, 4001, Brisbane, Qld, Australia
Shlomo Geva
Archives and Information Studies/Humanities, University of Amsterdam, Turfdraagsterpad 9, 1012 XT, Amsterdam, The Netherlands
Jaap Kamps
Department of Computer Science, University of Otago, P.O. Box 56, 9054, Dunedin, New Zealand
Andrew Trotman

About this paper

Cite this paper

Wang, Q., Li, Q., Wang, S., Du, X. (2010). Exploiting Semantic Tags in XML Retrieval. In: Geva, S., Kamps, J., Trotman, A. (eds) Focused Retrieval and Evaluation. INEX 2009. Lecture Notes in Computer Science, vol 6203. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14556-8_15

DOI: https://doi.org/10.1007/978-3-642-14556-8_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14555-1
Online ISBN: 978-3-642-14556-8
eBook Packages: Computer Science, Computer Science (R0)
