Abstract
We explore the use of phrase and proximity terms in web retrieval, which differs from traditional ad-hoc retrieval in both document structure and query characteristics. We show that for this type of task, using both phrase and proximity terms is highly beneficial for early precision as well as for overall retrieval effectiveness. We also analyze why phrase and proximity terms are far more effective for web retrieval than for ad-hoc retrieval.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mishne, G., de Rijke, M. (2005). Boosting Web Retrieval Through Query Operations. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_36
DOI: https://doi.org/10.1007/978-3-540-31865-1_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25295-5
Online ISBN: 978-3-540-31865-1
eBook Packages: Computer Science (R0)