ABSTRACT
The query-performance prediction task is to estimate the effectiveness of a search performed in response to a query when no relevance judgments are available. Although many effective prediction methods exist, they differ substantially in their basic principles and rely on diverse hypotheses about the characteristics of effective retrieval. We present a novel, fundamental probabilistic prediction framework. Using the framework, we derive and explain various previously proposed prediction methods that might seem completely different but turn out to share the same formal basis. The derivations provide new perspectives on several predictors (e.g., Clarity). The framework is also used to devise new prediction approaches that outperform the state of the art.
Index Terms
- Back to the roots: a probabilistic framework for query-performance prediction