On Modeling Rank-Independent Risk in Estimating Probability of Relevance

Zhang, Peng; Song, Dawei; Wang, Jun; Zhao, Xiaozhao; Hou, Yuexian

doi:10.1007/978-3-642-25631-8_2

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7097)

Included in the following conference series:

Asia Information Retrieval Symposium

Abstract

Estimating the probability of relevance for a document is fundamental in information retrieval. From a theoretical point of view, risk exists in the estimation process, in the sense that the estimated probabilities may not match the actual ones exactly. The estimation risk is often considered to be dependent on the rank. For example, the probability ranking principle assumes that ranking documents in order of decreasing probability of relevance optimizes ranking effectiveness. This implies that a precise estimation yields an optimal rank. However, an optimal (or even ideal) rank does not guarantee that the estimated probabilities are precise, which means that part of the estimation risk is rank-independent. This imposes practical risks in applications such as pseudo-relevance feedback, where different estimated probabilities of relevance in the first-round retrieval make a difference even when two ranks are identical. In this paper, we explore the effect and the modeling of such rank-independent risk. A risk management method is proposed to adaptively adjust the rank-independent risk. Experimental results on several TREC collections demonstrate the effectiveness of the proposed models for both pseudo-relevance feedback and relevance feedback.
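
The point that two identical ranks can still lead to different feedback outcomes can be illustrated with a small sketch. The code below is not the method proposed in the paper; it assumes a simple relevance-model-style mixture in which each feedback document contributes its term distribution weighted by its estimated probability of relevance. The documents, the probability values, and the names `feedback_model` and `ranking` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def feedback_model(docs, doc_probs):
    """Mix per-document term distributions, weighting each document by its
    normalised estimated probability of relevance (an assumed toy model)."""
    total = sum(doc_probs)
    model = Counter()
    for tokens, prob in zip(docs, doc_probs):
        counts = Counter(tokens)
        for term, count in counts.items():
            # Document weight times the term's relative frequency in the document.
            model[term] += (prob / total) * (count / len(tokens))
    return dict(model)


def ranking(doc_probs):
    """Return document indices ordered by decreasing estimated probability."""
    return sorted(range(len(doc_probs)), key=lambda i: doc_probs[i], reverse=True)


# Hypothetical first-round results: three tiny tokenised documents.
docs = [
    ["risk", "risk", "rank"],
    ["rank", "feedback", "feedback"],
    ["feedback", "query", "query"],
]

# Two hypothetical sets of estimates that induce the same rank.
peaked = [0.90, 0.09, 0.01]
flat = [0.40, 0.35, 0.25]

model_peaked = feedback_model(docs, peaked)
model_flat = feedback_model(docs, flat)

print(ranking(peaked) == ranking(flat))            # same rank in both cases
print(max(model_peaked, key=model_peaked.get))     # dominant feedback term (peaked)
print(max(model_flat, key=model_flat.get))         # dominant feedback term (flat)
```

Under these assumed numbers the two estimates order the documents identically, yet the dominant term of the resulting feedback model differs, which is the kind of rank-independent effect the abstract describes.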

Author information

Authors and Affiliations

School of Computing, Robert Gordon University, UK
Peng Zhang & Dawei Song
Department of Computer Science, University College London, UK
Jun Wang
School of Computer Sci & Tec, Tianjin University, China
Xiaozhao Zhao & Yuexian Hou

Editor information

Editors and Affiliations

Faculty of Computer Science and Engineering, University of Wollongong, Dubai Knowledge Village, P.O. Box 20182, Dubai, United Arab Emirates
Mohamed Vall Mohamed Salem
Faculty of Engineering and IT, Dubai International Academic City, Block 11, 1st and 2nd Floor, P.O. Box 345015, Dubai, United Arab Emirates
Khaled Shaalan
Faculty of Computer Science and Engineering, University of Wollongong, Dubai Knowledge Village, P.O. Box 20183, Dubai, United Arab Emirates
Farhad Oroumchian
Department of Electrical and Computer Engineering, University of Tehran, Faculty of Engineering, North Kargar Street, P.O. Box 14395-515, Tehran, Iran
Azadeh Shakery
Faculty of Computer Science and Engineering, University of Wollongong, Dubai Knowledge Village, P.O. Box 20183, Dubai, United Arab Emirates
Halim Khelalfa

About this paper

Cite this paper

Zhang, P., Song, D., Wang, J., Zhao, X., Hou, Y. (2011). On Modeling Rank-Independent Risk in Estimating Probability of Relevance. In: Salem, M.V.M., Shaalan, K., Oroumchian, F., Shakery, A., Khelalfa, H. (eds) Information Retrieval Technology. AIRS 2011. Lecture Notes in Computer Science, vol 7097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25631-8_2

DOI: https://doi.org/10.1007/978-3-642-25631-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25630-1
Online ISBN: 978-3-642-25631-8
eBook Packages: Computer Science, Computer Science (R0)
