
Selecting Effective Terms for Query Formulation

  • Conference paper
Information Retrieval Technology (AIRS 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5839)


Abstract

It is difficult for users to formulate appropriate queries for search. In this paper, we propose an approach to query term selection that measures the effectiveness of a query term in IR systems based on its linguistic and statistical properties in document collections. Two query formulation algorithms are presented for improving IR performance. Experiments on the NTCIR-4 and NTCIR-5 ad-hoc IR tasks demonstrate that the algorithms significantly improve retrieval performance, by an average of 9.2%, compared to the original queries given in the benchmarks.
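As a rough illustration of the statistical side of such term selection, the sketch below ranks a query's candidate terms by a simple collection statistic (smoothed IDF) and keeps the top-scoring ones as the formulated query. This is a minimal sketch under assumptions, not the authors' algorithms: the scoring heuristic, function names, and toy data are illustrative only, and the paper additionally exploits linguistic properties, which are not modeled here.

    # Minimal sketch (not the authors' method): rank candidate query terms by a
    # smoothed inverse document frequency over a toy collection and keep the
    # top-k terms as the formulated query.
    import math
    from collections import Counter

    def idf_scores(terms, documents):
        """Smoothed IDF for each candidate term.

        documents: list of token lists standing in for the document collection.
        """
        n_docs = len(documents)
        df = Counter()
        for doc in documents:
            for t in set(doc):
                if t in terms:
                    df[t] += 1
        # Terms unseen in the sample receive the maximum score.
        return {t: math.log((n_docs + 1) / (df[t] + 1)) for t in terms}

    def formulate_query(candidate_terms, documents, k=5):
        """Keep the k candidate terms with the highest IDF as the new query."""
        candidates = set(candidate_terms)
        scores = idf_scores(candidates, documents)
        ranked = sorted(candidates, key=scores.get, reverse=True)
        return ranked[:k]

    if __name__ == "__main__":
        docs = [["query", "term", "selection"],
                ["retrieval", "term", "weighting"],
                ["query", "performance", "prediction"]]
        print(formulate_query(["query", "term", "selection", "retrieval"], docs, k=2))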






Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, CJ., Lin, YC., Chen, RC., Cheng, PJ. (2009). Selecting Effective Terms for Query Formulation. In: Lee, G.G., et al. Information Retrieval Technology. AIRS 2009. Lecture Notes in Computer Science, vol 5839. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04769-5_15


  • DOI: https://doi.org/10.1007/978-3-642-04769-5_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04768-8

  • Online ISBN: 978-3-642-04769-5

  • eBook Packages: Computer Science, Computer Science (R0)
