ABSTRACT
This work describes an estimator that yields unbiased measurements of precision, rank-biased precision, and cumulative gain from a uniform or non-uniform sample of relevance assessments. Adversarial testing supports the theory that our estimator yields unbiased low-variance measurements from sparse samples, even when used to measure results that are qualitatively different from those returned by known information retrieval methods. Our results suggest that test collections using sampling to select documents for relevance assessment yield more accurate measurements than test collections using pooling, especially for the results of retrieval methods not contributing to the pool.
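The abstract does not spell out the estimator itself. As a rough illustration of how precision can be estimated without bias from a non-uniform sample, the sketch below uses standard inverse-probability (Horvitz-Thompson-style) weighting: each assessed relevant document in the top k counts as 1 divided by its inclusion probability. This is an assumption made for illustration and is not necessarily the estimator the paper proposes. The names `ht_precision_at_k` and `expected_estimate`, the data layout, and the toy numbers are all hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple


def ht_precision_at_k(
    ranking: List[str],
    assessments: Dict[str, Tuple[bool, float]],
    k: int,
) -> float:
    """Inverse-probability-weighted estimate of precision at rank k.

    `assessments` maps each sampled document ID to a pair
    (is_relevant, inclusion_probability). Documents that were not sampled
    are simply absent from the mapping. Hypothetical layout, for illustration.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    weighted_relevant = 0.0
    for doc in ranking[:k]:
        if doc not in assessments:
            continue  # unsampled documents contribute nothing
        relevant, prob = assessments[doc]
        if not 0.0 < prob <= 1.0:
            raise ValueError("inclusion probability must be in (0, 1]")
        if relevant:
            weighted_relevant += 1.0 / prob
    return weighted_relevant / k


def expected_estimate(
    ranking: List[str],
    truth: Dict[str, bool],
    probs: Dict[str, float],
    k: int,
) -> float:
    """Exact expectation of the estimate, assuming each document is
    sampled independently with its own inclusion probability."""
    docs = list(truth)
    expectation = 0.0
    for included in product([False, True], repeat=len(docs)):
        weight = 1.0
        sample = {}
        for doc, is_in in zip(docs, included):
            weight *= probs[doc] if is_in else 1.0 - probs[doc]
            if is_in:
                sample[doc] = (truth[doc], probs[doc])
        if weight > 0.0:
            expectation += weight * ht_precision_at_k(ranking, sample, k)
    return expectation


# Toy data (made up): four ranked documents, two of which are truly relevant.
ranking = ["d1", "d2", "d3", "d4"]
truth = {"d1": True, "d2": False, "d3": True, "d4": False}
probs = {"d1": 1.0, "d2": 0.5, "d3": 0.5, "d4": 0.25}

true_precision = sum(truth[d] for d in ranking) / len(ranking)
print(true_precision, expected_estimate(ranking, truth, probs, k=4))
```

In this toy setting the expectation over all possible samples matches the true precision, which is the sense in which such an estimator is unbiased. The sketch says nothing about variance, which depends on how the inclusion probabilities are chosen.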
Index Terms
- Unbiased Low-Variance Estimators for Precision and Related Information Retrieval Effectiveness Measures