
Selecting Effective Terms for Query Formulation

  • Conference paper
Information Retrieval Technology (AIRS 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5839)


Abstract

It is difficult for users to formulate appropriate queries for search. In this paper, we propose an approach to query term selection that measures the effectiveness of a query term in IR systems based on its linguistic and statistical properties in document collections. Two query formulation algorithms are presented for improving IR performance. Experiments on the NTCIR-4 and NTCIR-5 ad-hoc IR tasks demonstrate that the algorithms significantly improve retrieval performance, by an average of 9.2%, compared to the original queries given in the benchmarks.
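As a rough illustration of the statistical side of such term selection, the sketch below ranks a query's candidate terms by a simple collection statistic (smoothed IDF) and keeps the top-scoring ones as the formulated query. This is a minimal sketch under assumptions, not the authors' algorithms: the scoring heuristic, function names, and toy data are illustrative only, and the paper additionally exploits linguistic properties, which are not modeled here.

    # Minimal sketch (not the authors' method): rank candidate query terms by a
    # smoothed inverse document frequency over a toy collection and keep the
    # top-k terms as the formulated query.
    import math
    from collections import Counter

    def idf_scores(terms, documents):
        """Smoothed IDF for each candidate term.

        documents: list of token lists standing in for the document collection.
        """
        n_docs = len(documents)
        df = Counter()
        for doc in documents:
            for t in set(doc):
                if t in terms:
                    df[t] += 1
        # Terms unseen in the sample receive the maximum score.
        return {t: math.log((n_docs + 1) / (df[t] + 1)) for t in terms}

    def formulate_query(candidate_terms, documents, k=5):
        """Keep the k candidate terms with the highest IDF as the new query."""
        candidates = set(candidate_terms)
        scores = idf_scores(candidates, documents)
        ranked = sorted(candidates, key=scores.get, reverse=True)
        return ranked[:k]

    if __name__ == "__main__":
        docs = [["query", "term", "selection"],
                ["retrieval", "term", "weighting"],
                ["query", "performance", "prediction"]]
        print(formulate_query(["query", "term", "selection", "retrieval"], docs, k=2))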






Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, CJ., Lin, YC., Chen, RC., Cheng, PJ. (2009). Selecting Effective Terms for Query Formulation. In: Lee, G.G., et al. Information Retrieval Technology. AIRS 2009. Lecture Notes in Computer Science, vol 5839. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04769-5_15


  • DOI: https://doi.org/10.1007/978-3-642-04769-5_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04768-8

  • Online ISBN: 978-3-642-04769-5

  • eBook Packages: Computer Science, Computer Science (R0)
