Effective Use of Semantic Structure in XML Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4425)

Abstract

The objective of XML retrieval is to return relevant XML document fragments that answer a given user information need by exploiting the document structure. This article focuses on automatically deriving semantic XML structure and using it to enhance the retrieval performance of XML retrieval systems. Based on a naive approach to named entity detection, we discuss how the structure of an XML document can be enriched, using the Reuters-21578 news collection.

In a retrieval performance experiment, we study the effect of the additional semantic structure on the retrieval performance of XSee, our search engine for XML documents. The experiment provides some initial evidence that an XML retrieval system benefits significantly from having meaningful XML structure.
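
As an illustration of the enrichment idea in the abstract, the sketch below wraps detected entity names in new XML elements. It is a minimal sketch and not the authors' method: it assumes a simple dictionary lookup stands in for named entity detection, and the `GAZETTEER` entries, the tag names `location` and `organization`, and the function `enrich` are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical gazetteer (not from the paper): surface form -> semantic tag name.
GAZETTEER = {
    "Utrecht": "location",
    "Reuters": "organization",
}

# One alternation over all known names, longest first so longer names win.
_NAMES = sorted(GAZETTEER, key=len, reverse=True)
_PATTERN = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in _NAMES) + r")\b")


def enrich(elem):
    """Wrap gazetteer matches found in elem's leading text in new child elements.

    Only elem.text is processed; text in the tails of existing children is
    left untouched to keep the sketch short.
    """
    text = elem.text or ""
    matches = list(_PATTERN.finditer(text))
    if not matches:
        return elem
    # Text before the first match stays as the element's own text.
    elem.text = text[: matches[0].start()]
    for i, m in enumerate(matches):
        child = ET.Element(GAZETTEER[m.group()])
        child.text = m.group()
        # Text between this match and the next one becomes the child's tail.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        child.tail = text[m.end() : end]
        # New children precede any existing ones, preserving document order.
        elem.insert(i, child)
    return elem


paragraph = ET.fromstring("<p>Reuters reported from Utrecht today.</p>")
print(ET.tostring(enrich(paragraph), encoding="unicode"))
```

With the added elements in place, a structure-aware query could restrict matching to, say, `location` elements, which is the kind of benefit the experiment in the abstract sets out to measure.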

Editor information

Giambattista Amati, Claudio Carpineto, Giovanni Romano

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

van Zwol, R., van Loosbroek, T. (2007). Effective Use of Semantic Structure in XML Retrieval. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_59

  • DOI: https://doi.org/10.1007/978-3-540-71496-5_59

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71494-1

  • Online ISBN: 978-3-540-71496-5

  • eBook Packages: Computer Science, Computer Science (R0)
