Query Expansion Using External Evidence

  • Conference paper
Advances in Information Retrieval (ECIR 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5478)

Abstract

Automatic query expansion may be used in document retrieval to improve search effectiveness. Traditional query expansion methods are based on the document collection itself. For example, pseudo-relevance feedback (PRF) assumes that the top retrieved documents are relevant, and uses terms extracted from those documents for query expansion. However, there are other sources of evidence that can be used for expansion, some of which may give better search results with greater efficiency at query time. In this paper, we use external evidence, in particular hints obtained from external web search engines, to expand the original query. We explore six methods that draw on search engine query logs, snippets, and search result documents. We conduct extensive experiments, with state-of-the-art PRF baselines and careful parameter tuning, on three TREC collections: AP, WT10g, and GOV2. Log-based methods do not show consistent significant gains, despite being very efficient at query time. Snippet-based expansion, using the summaries provided by an external search engine, provides significant effectiveness gains with good efficiency at query time.
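To make the feedback idea in the abstract concrete, the following is a minimal sketch of term-frequency-based expansion. It is an illustration only and not the authors' method: the function names, the tiny stopword list, the raw-count term scoring, and the sample texts are all assumptions introduced here. The feedback texts stand in for whatever evidence is available, such as top-ranked documents (as in PRF) or snippets returned by an external search engine.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = frozenset(
    {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on", "from"}
)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_query(query, feedback_texts, num_terms=3):
    """Append the most frequent non-query, non-stopword feedback terms.

    feedback_texts are assumed to be relevant (the pseudo-relevance
    assumption); raw counts are used as a stand-in for a real term
    weighting scheme.
    """
    query_terms = tokenize(query)
    seen = set(query_terms)
    counts = Counter()
    for text in feedback_texts:
        for term in tokenize(text):
            if term not in seen and term not in STOPWORDS:
                counts[term] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return query_terms + [term for term, _ in ranked[:num_terms]]


# Made-up feedback texts, e.g. snippets for the query "jaguar".
feedback = [
    "Jaguar cars are luxury vehicles",
    "The jaguar car maker unveiled a luxury sedan",
    "Luxury cars from Jaguar",
]
expanded = expand_query("jaguar", feedback, num_terms=2)
```

In this sketch the only difference between collection-based and external expansion is where `feedback_texts` comes from; the paper's actual term selection and weighting are not described in the abstract and are not reproduced here.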




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yin, Z., Shokouhi, M., Craswell, N. (2009). Query Expansion Using External Evidence. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds) Advances in Information Retrieval. ECIR 2009. Lecture Notes in Computer Science, vol 5478. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00958-7_33

  • DOI: https://doi.org/10.1007/978-3-642-00958-7_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00957-0

  • Online ISBN: 978-3-642-00958-7

  • eBook Packages: Computer Science, Computer Science (R0)
