
Document Score Distribution Models for Query Performance Inference and Prediction

Published: 01 January 2014

Abstract

Modelling the distribution of document scores returned from an information retrieval (IR) system in response to a query is of both theoretical and practical importance. One of the goals of modelling document scores in this manner is the inference of document relevance. There has been renewed interest of late in modelling document scores using parameterised distributions. Consequently, a number of hypotheses have been proposed to constrain the mixture distribution from which document scores could be drawn.
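To make the modelling setting concrete, the sketch below fits one such parameterised mixture, an exponential component for non-relevant scores and a normal component for relevant scores (a common pairing in this literature), to a list of retrieval scores by expectation-maximisation. This is a minimal sketch under assumed component choices; the function name fit_normal_exponential_em and its initialisation heuristics are illustrative and not taken from the article.

```python
import numpy as np
from scipy.stats import norm, expon

def fit_normal_exponential_em(scores, n_iter=200, tol=1e-8):
    """Fit p(s) = lam * N(s; mu, sigma) + (1 - lam) * Exp(s; scale=beta) by EM.

    Assumed component choices: relevant scores ~ normal, non-relevant
    scores ~ exponential. Scores are assumed non-negative.
    """
    s = np.asarray(scores, dtype=float)
    # Crude initialisation: treat the top decile of scores as "relevant-like".
    lam = 0.1
    top = s >= np.quantile(s, 0.9)
    mu, sigma = s[top].mean(), max(s[top].std(), 1e-6)
    beta = max(s[~top].mean(), 1e-6)            # exponential scale (mean)

    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of the normal (relevant) component.
        p_rel = lam * norm.pdf(s, mu, sigma)
        p_non = (1.0 - lam) * expon.pdf(s, scale=beta)
        total = p_rel + p_non + 1e-300
        r = p_rel / total

        # M-step: weighted parameter updates.
        lam = r.mean()
        mu = np.average(s, weights=r + 1e-12)
        sigma = max(np.sqrt(np.average((s - mu) ** 2, weights=r + 1e-12)), 1e-6)
        beta = max(np.average(s, weights=(1.0 - r) + 1e-12), 1e-6)

        ll = np.log(total).sum()
        if ll - prev_ll < tol:                  # log-likelihood has converged
            break
        prev_ll = ll
    return {"lambda": lam, "mu": mu, "sigma": sigma, "beta": beta}
```

A call such as fit_normal_exponential_em(scores_for_one_query) returns the estimated mixing proportion and component parameters; the later sketches reuse this output.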

In this article, we show how a standard performance measure (i.e., average precision) can be inferred from a document score distribution using labelled data. We use the accuracy of the inference of average precision as a measure for determining the usefulness of a particular model of document scores. We provide a comprehensive study which shows that certain mixtures of distributions are able to infer average precision more accurately than others. Furthermore, we analyse a number of mixture distributions with regard to the recall-fallout convexity hypothesis and show that the convexity hypothesis is practically useful.
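One way to make the inference step concrete is Monte Carlo simulation from a fitted mixture: relevant and non-relevant scores are drawn from their respective components, merged into a single ranking, and average precision is computed over repeated trials. The normal/exponential pairing, the helper names, and the assumption that the number of relevant documents is known from the labelled data are illustrative assumptions, not the article's exact procedure.

```python
import numpy as np

def average_precision(labels):
    """AP of a ranked list of binary relevance labels (1 = relevant)."""
    labels = np.asarray(labels, dtype=float)
    if labels.sum() == 0:
        return 0.0
    ranks = np.arange(1, len(labels) + 1)
    precision_at_k = np.cumsum(labels) / ranks
    return float((precision_at_k * labels).sum() / labels.sum())

def simulate_average_precision(params, n_rel, n_nonrel, trials=1000, seed=0):
    """Expected AP under a normal (relevant) / exponential (non-relevant) mixture."""
    rng = np.random.default_rng(seed)
    aps = []
    for _ in range(trials):
        rel = rng.normal(params["mu"], params["sigma"], size=n_rel)
        non = rng.exponential(params["beta"], size=n_nonrel)
        scores = np.concatenate([rel, non])
        labels = np.concatenate([np.ones(n_rel), np.zeros(n_nonrel)])
        order = np.argsort(-scores)             # rank by descending score
        aps.append(average_precision(labels[order]))
    return float(np.mean(aps))
```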

Consequently, based on one of the best-performing score-distribution models, we develop some techniques for query-performance prediction (QPP) by automatically estimating the parameters of the document score-distribution model when relevance information is unknown. We present experimental results that outline the benefits of this approach to query-performance prediction.
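As a rough sketch of the prediction setting, reusing the hypothetical fit_normal_exponential_em and simulate_average_precision helpers from the sketches above: the mixture is fitted to the retrieved scores without relevance labels (EM needs none), the estimated mixing proportion is converted to a guess at the number of relevant documents, and the simulated average precision is used as the query-performance predictor. The conversion heuristic below is an assumption for illustration only, not the article's estimator.

```python
def predict_query_performance(retrieved_scores, trials=500):
    # Fit the mixture with no relevance labels (EM is unsupervised).
    params = fit_normal_exponential_em(retrieved_scores)
    n = len(retrieved_scores)
    # Assumed heuristic: the mixing proportion estimates the share of
    # relevant documents among those retrieved.
    n_rel = max(1, int(round(params["lambda"] * n)))
    return simulate_average_precision(params, n_rel, n - n_rel, trials=trials)
```

Across a set of queries, such a predictor would typically be evaluated by its rank correlation (e.g., Kendall's tau) with average precision computed from actual relevance judgements.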



    • Published in

      ACM Transactions on Information Systems, Volume 32, Issue 1
      January 2014, 123 pages
      ISSN: 1046-8188
      EISSN: 1558-2868
      DOI: 10.1145/2576772

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 January 2014
      • Accepted: 1 September 2013
      • Revised: 1 August 2013
      • Received: 1 October 2012
      Published in TOIS Volume 32, Issue 1


      Qualifiers

      • research-article
      • Research
      • Refereed
