Adaptive Effort for Search Evaluation Metrics

Jiang, Jiepu; Allan, James

doi:10.1007/978-3-319-30671-1_14

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9626)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

We explain a wide range of search evaluation metrics as the ratio of users’ gain to effort when interacting with a ranked list of results. According to this explanation, many existing metrics measure users’ effort as linear in the (expected) number of examined results. This implicitly assumes that users spend the same effort examining every result. We adapt current metrics to account for different effort on relevant and non-relevant documents. Results show that such adaptive effort metrics better correlate with and predict user perceptions of search quality.
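
As a rough illustration of the gain-to-effort reading above, the sketch below computes the ratio for a ranked list with binary relevance labels, cut off at rank k. It is not the authors' implementation (their code is in the repository listed under Notes). The function name `gain_effort_ratio` and the parameters `effort_rel` and `effort_nonrel` are hypothetical, and the effort values in the usage lines are made up only to show how unequal effort changes the score. With both effort values set to 1, the ratio reduces to precision over the examined results.

```python
from typing import Sequence


def gain_effort_ratio(
    rels: Sequence[int],
    k: int,
    effort_rel: float = 1.0,     # assumed effort to examine a relevant result
    effort_nonrel: float = 1.0,  # assumed effort to examine a non-relevant result
) -> float:
    """Return total gain divided by total effort over the top-k results.

    rels holds binary relevance labels (1 = relevant, 0 = non-relevant)
    in ranked order. Each relevant result contributes a gain of 1.
    """
    examined = rels[:k]
    gain = sum(1 for r in examined if r > 0)
    effort = sum(effort_rel if r > 0 else effort_nonrel for r in examined)
    return gain / effort if effort > 0 else 0.0


ranking = [1, 0, 1, 0]

# Uniform effort: every examined result costs 1, so this is precision at 4.
print(gain_effort_ratio(ranking, k=4))  # 0.5

# Adaptive effort: relevant results are assumed to cost twice as much to examine.
print(gain_effort_ratio(ranking, k=4, effort_rel=2.0, effort_nonrel=1.0))
```

The uniform case corresponds to the assumption that every examined result costs the same effort; the adaptive case only swaps in different per-result costs and leaves the gain untouched.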

Notes

1. The dataset and source code for replicating our experiments can be accessed at https://github.com/jiepujiang/ir_metrics/.

Acknowledgment

This work was supported in part by the Center for Intelligent Information Retrieval and in part by NSF grant #IIS-0910884. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsor.

Author information

Authors and Affiliations

Center for Intelligent Information Retrieval, College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, USA
Jiepu Jiang & James Allan

Corresponding author

Correspondence to James Allan.

Editor information

Editors and Affiliations

Department of Information Engineering, University of Padua, Padova, Italy
Nicola Ferro
Faculty of Informatics, University of Lugano (USI), Lugano, Switzerland
Fabio Crestani
Department of Computer Science, Katholieke Universiteit Leuven, Heverlee, Belgium
Marie-Francine Moens
Systèmes d’informations, Big Data et Recherche d’Information, Institut de Recherche en Informatique de Toulouse IRIT/équipe SIG, Toulouse Cedex 04, France
Josiane Mothe
Yahoo! Labs London, London, UK
Fabrizio Silvestri
Department of Information Engineering, University of Padua, Padova, Italy
Giorgio Maria Di Nunzio
TU Delft - EWI/ST/WIS, Delft, The Netherlands
Claudia Hauff
Department of Information Engineering, University of Padua, Padova, Italy
Gianmaria Silvello

Cite this paper

Jiang, J., Allan, J. (2016). Adaptive Effort for Search Evaluation Metrics. In: Ferro, N., et al. Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science, vol 9626. Springer, Cham. https://doi.org/10.1007/978-3-319-30671-1_14

DOI: https://doi.org/10.1007/978-3-319-30671-1_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30670-4
Online ISBN: 978-3-319-30671-1
eBook Packages: Computer Science, Computer Science (R0)
