research-article

Document Score Distribution Models for Query Performance Inference and Prediction

Author: Ronan Cummins, University of Greenwich

ACM Transactions on Information Systems, Volume 32, Issue 1, Article No. 2, pp. 1–28. https://doi.org/10.1145/2559170

Published: 01 January 2014

Abstract

Modelling the distribution of document scores returned from an information retrieval (IR) system in response to a query is of both theoretical and practical importance. One of the goals of modelling document scores in this manner is the inference of document relevance. There has been renewed interest of late in modelling document scores using parameterised distributions. Consequently, a number of hypotheses have been proposed to constrain the mixture distribution from which document scores could be drawn.

In this article, we show how a standard performance measure (i.e., average precision) can be inferred from a document score distribution using labelled data. We use the accuracy of the inference of average precision as a measure for determining the usefulness of a particular model of document scores. We provide a comprehensive study which shows that certain mixtures of distributions are able to infer average precision more accurately than others. Furthermore, we analyse a number of mixture distributions with regard to the recall-fallout convexity hypothesis and show that the convexity hypothesis is practically useful.
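The idea of inferring average precision from a score distribution can be illustrated with a minimal sketch. It assumes, purely for illustration, that relevant and non-relevant scores each follow a normal distribution and that `lam` is the proportion of relevant documents; the article compares several mixtures, and this is not presented as the model it favours. The function and parameter names are hypothetical. Average precision is approximated as the area under the precision-recall curve implied by the mixture as a score threshold sweeps from high to low.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def inferred_average_precision(mean_rel, std_rel, mean_non, std_non, lam, steps=20000):
    """Approximate average precision implied by a two-component score mixture.

    Assumption (illustrative only): relevant scores ~ N(mean_rel, std_rel),
    non-relevant scores ~ N(mean_non, std_non), and lam is the fraction of
    relevant documents in the scored set.
    """
    # Sweep the threshold over a range that covers both components.
    lo = min(mean_rel - 8 * std_rel, mean_non - 8 * std_non)
    hi = max(mean_rel + 8 * std_rel, mean_non + 8 * std_non)

    ap = 0.0
    prev_recall = 0.0
    prev_precision = None
    for i in range(steps + 1):
        t = hi - (hi - lo) * i / steps  # threshold moves from high to low
        recall = 1.0 - normal_cdf(t, mean_rel, std_rel)    # relevant docs above t
        fallout = 1.0 - normal_cdf(t, mean_non, std_non)   # non-relevant docs above t
        retrieved = lam * recall + (1.0 - lam) * fallout
        if retrieved <= 0.0:
            continue  # nothing retrieved yet at this threshold
        precision = lam * recall / retrieved
        if prev_precision is not None:
            # Trapezoidal rule over recall: area under the precision-recall curve.
            ap += 0.5 * (precision + prev_precision) * (recall - prev_recall)
        prev_recall, prev_precision = recall, precision
    return ap


if __name__ == "__main__":
    print(inferred_average_precision(2.0, 1.0, 0.0, 1.0, lam=0.1))
```

When the two components are identical, the scores carry no relevance signal and the inferred value is approximately `lam`; as the relevant component moves further above the non-relevant one, the inferred value approaches 1.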

Consequently, based on one of the best-performing score-distribution models, we develop some techniques for query-performance prediction (QPP) by automatically estimating the parameters of the document score-distribution model when relevance information is unknown. We present experimental results that outline the benefits of this approach to query-performance prediction.

Index Terms

1. Information systems
  1. Information retrieval

Published in
ACM Transactions on Information Systems Volume 32, Issue 1
January 2014
123 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/2576772
Editor: Jamie Callan, Carnegie Mellon University, USA
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher: Association for Computing Machinery, New York, NY, United States
Publication History
- Published: 1 January 2014
- Accepted: 1 September 2013
- Revised: 1 August 2013
- Received: 1 October 2012
Author Tags
Score distributions
query performance
Qualifiers
- research-article
- Research
- Refereed