Abstract
Cranfield-style evaluations standardised Information Retrieval (IR) evaluation practices, enabling the creation of programmes such as TREC, CLEF, and INEX, and the long-term comparability of IR systems. However, the methodology does not translate well to the Interactive IR (IIR) domain, where the inclusion of the user in the search process and the repeated interaction between user and system create more variability than Cranfield-style evaluations can support. As a result, IIR evaluations of different systems have tended to be non-comparable, not because the systems vary, but because the methodologies used to evaluate them are non-comparable. In this paper we describe a standardised IIR evaluation framework that ensures IIR evaluations share a standardised baseline methodology, in much the same way that TREC, CLEF, and INEX imposed a common process on IR evaluation. The framework provides a common baseline, derived by integrating existing, validated evaluation measures, that enables inter-study comparison, yet is flexible enough to support most kinds of IIR studies. This is achieved through a “pluggable” design, into which any web-based IIR interface can be embedded. The framework has been implemented, and the software will be made available to reduce the resource commitment required for IIR studies.
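The last two sentences describe the core design: a fixed, validated baseline workflow (consent, questionnaires, logging) into which the interface under study is plugged as an ordinary web-based step. The sketch below is purely illustrative and assumes a hypothetical configuration; the step names, URLs, and the `baseline_workflow` helper are invented for illustration and are not the authors' actual software.

```python
# Hypothetical sketch (not the paper's released software): declaring a
# standardised IIR study workflow in which the system under test is
# "plugged in" by URL between the shared baseline components.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One stage of the study workflow shown to every participant."""
    name: str
    url: str                                   # page served to the participant
    log_events: List[str] = field(default_factory=list)


def baseline_workflow(interface_url: str) -> List[Step]:
    """Wrap an arbitrary web-based IIR interface in the common baseline,
    so that pre- and post-task measures stay comparable across studies."""
    return [
        Step("consent", "/static/consent"),
        Step("pre-task-questionnaire", "/questionnaire/pre"),
        Step("search-task", interface_url, log_events=["query", "click", "view"]),
        Step("post-task-questionnaire", "/questionnaire/post"),
        Step("engagement-scale", "/questionnaire/engagement"),
    ]


if __name__ == "__main__":
    # Any web-based interface can be embedded by pointing at its URL.
    for step in baseline_workflow("https://example.org/my-iir-interface"):
        print(step.name, "->", step.url)
```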
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Hall, M.M., Toms, E. (2013). Building a Common Framework for IIR Evaluation. In: Forner, P., Müller, H., Paredes, R., Rosso, P., Stein, B. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visualization. CLEF 2013. Lecture Notes in Computer Science, vol 8138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40802-1_3