Combining Strategies for XML Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6932)

Abstract

This paper describes Peking University's approaches to the Ad Hoc, Data Centric and Relevance Feedback tracks. In the Ad Hoc track, results were submitted for four tasks: Efficiency, Restricted Focused, Relevance in Context and Restricted Relevance in Context. To evaluate the relevance between documents and a given query, multiple strategies, namely Two-Step retrieval, MAXLCA query results, BM25, distribution measurements and a learn-to-optimize method, are combined to form a more effective search engine. In the Data Centric track, to obtain a set of closely related nodes that are collectively relevant to a given keyword query, we promote three factors: correlation, explicitness and distinctiveness. In the Relevance Feedback track, to extract useful information from feedback, our implementation employs two techniques: a revised Rocchio algorithm and criterion weight adjustment.
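For orientation, the Rocchio algorithm mentioned in the abstract can be sketched in its standard textbook form. This is not the revised variant used in the paper, whose details the abstract does not give. The function name `rocchio`, the sparse dictionary representation, and the default weights `alpha`, `beta` and `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Textbook Rocchio query update (assumed form, not the paper's revision).

    query        -- dict mapping term -> weight
    relevant     -- list of dicts (term -> weight) judged relevant
    non_relevant -- list of dicts (term -> weight) judged non-relevant
    """
    updated = defaultdict(float)

    # Keep the original query, scaled by alpha.
    for term, weight in query.items():
        updated[term] += alpha * weight

    # Move toward the centroid of the relevant documents.
    for doc in relevant:
        for term, weight in doc.items():
            updated[term] += beta * weight / len(relevant)

    # Move away from the centroid of the non-relevant documents.
    for doc in non_relevant:
        for term, weight in doc.items():
            updated[term] -= gamma * weight / len(non_relevant)

    # Terms whose weight is not positive are dropped, a common convention.
    return {term: weight for term, weight in updated.items() if weight > 0}


if __name__ == "__main__":
    new_query = rocchio(
        {"xml": 1.0},
        relevant=[{"xml": 1.0, "retrieval": 2.0}],
        non_relevant=[{"database": 1.0}],
    )
    print(new_query)
```

In this sketch the feedback adds the term "retrieval" to the query and strengthens "xml", while "database", which occurs only in the non-relevant document, receives a negative weight and is discarded.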

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gao, N., Deng, ZH., Jiang, JJ., Lv, SL., Yu, H. (2011). Combining Strategies for XML Retrieval. In: Geva, S., Kamps, J., Schenkel, R., Trotman, A. (eds) Comparative Evaluation of Focused Retrieval. INEX 2010. Lecture Notes in Computer Science, vol 6932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23577-1_30

  • DOI: https://doi.org/10.1007/978-3-642-23577-1_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23576-4

  • Online ISBN: 978-3-642-23577-1

  • eBook Packages: Computer Science, Computer Science (R0)
