A Component-Level Analysis of an Academic Search Test Collection.

Part I: System and Collection Configurations

  • Conference paper
  • In: Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10456)

Abstract

This study analyzes search performance in an academic search test collection. In a component-level evaluation setting, 3,276 system configurations were tested over 100 topics, varying queries, documents, and system components and yielding 327,600 data points. Additional analyses of the recall base and of the semantic heterogeneity of queries and documents are presented in a parallel paper [6]. The study finds that the structure of the documents and topics, as well as the IR components used, significantly impacts overall performance, whereas more content in either documents or topics does not necessarily improve a search. While the component-level analysis achieved overall performance improvements, it did not find a component that would identify or improve badly performing queries.
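The number of data points reported above is simply the Cartesian product of the configuration grid and the topic set. As a minimal sketch (in Python, with hypothetical component lists standing in for the paper's actual factors, which are described in the full text), such a grid can be enumerated as follows:

    from itertools import product

    # Minimal sketch of a component-level evaluation grid.
    # The component lists below are hypothetical placeholders, NOT the
    # paper's actual configuration space (which totals 3,276 configurations).
    query_fields = ["T", "TD", "TDN"]          # topic fields used as the query
    doc_fields = ["title", "title+abstract"]   # document representations
    stemmers = ["none", "porter", "krovetz"]   # stemming components
    ranking_models = ["tfidf", "okapi", "lm"]  # retrieval models

    configurations = list(product(query_fields, doc_fields, stemmers, ranking_models))
    topics = range(1, 101)  # 100 topics, as in the study

    # Every (configuration, topic) pair yields one scored retrieval run,
    # so the number of data points is the product of the two set sizes.
    print(len(configurations), "configurations")             # 54 in this sketch
    print(len(configurations) * len(topics), "data points")  # 5,400 in this sketch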

Notes

  1. https://www.lemurproject.org/lemur.php, last accessed: 04-30-2017.

  2. http://trec.nist.gov/trec_eval, last accessed: 04-30-2017.

  3. T: Life Satisfaction; D: Find documents which analyze people’s level of satisfaction with their lives; N: Relevant documents report on people’s living conditions with respect to their subjective feeling of satisfaction with their personal life. Documents in which only single areas of everyday life are discussed with respect to satisfaction are also relevant.
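The topic in note 3 follows the usual CLEF/TREC structure of title (T), description (D), and narrative (N) fields. As a minimal sketch of how such fields are commonly combined into query variants of increasing length (the T/TD/TDN scheme is standard practice and assumed here, not quoted from the paper), consider:

    # Hypothetical sketch: building query variants from a CLEF-style topic.
    # The topic text is taken from note 3 above (narrative abridged).
    topic = {
        "T": "Life Satisfaction",
        "D": "Find documents which analyze people's level of satisfaction "
             "with their lives.",
        "N": "Relevant documents report on people's living conditions with "
             "respect to their subjective feeling of satisfaction with their "
             "personal life.",
    }

    # Each variant concatenates progressively more topic fields.
    variants = {
        "T": topic["T"],
        "TD": " ".join([topic["T"], topic["D"]]),
        "TDN": " ".join([topic["T"], topic["D"], topic["N"]]),
    }

    for name, query in variants.items():
        print(f"{name}: {query}")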

References

  1. Behnert, C., Lewandowski, D.: Ranking search results in library information systems - considering ranking approaches adapted from web search engines. J. Acad. Librariansh. 41(6), 725–735 (2015)

  2. Carmel, D., Yom-Tov, E.: Estimating the query difficulty for information retrieval. Synth. Lect. Inf. Concepts Retr. Serv. 2(1), 1–89 (2010)

  3. Chowdhury, G.: Introduction to Modern Information Retrieval. Facet, London (2010)

  4. Cleverdon, C.: The Cranfield tests on index language devices. In: Aslib Proceedings, vol. 19, pp. 173–194. MCB UP Ltd. (1967)

  5. De Loupy, C., Bellot, P.: Evaluation of document retrieval systems and query difficulty. In: LREC 2000, Athens, pp. 32–39 (2000)

  6. Dietz, F., Petras, V.: A component-level analysis of an academic search test collection. Part II: query analysis. In: CLEF 2017 (2017). doi:10.1007/978-3-319-65813-1_3

  7. Ferro, N., Harman, D.: CLEF 2009: Grid@CLEF pilot track overview. In: Peters, C., Nunzio, G.M., Kurimo, M., Mandl, T., Mostefa, D., Peñas, A., Roda, G. (eds.) CLEF 2009. LNCS, vol. 6241, pp. 552–565. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15754-7_68

  8. Ferro, N., Silvello, G.: A general linear mixed models approach to study system component effects. In: SIGIR 2016, pp. 25–34. ACM (2016)

  9. Grivolla, J., Jourlin, P., de Mori, R.: Automatic classification of queries by expected retrieval performance. In: Predicting Query Difficulty Workshop. SIGIR 2005 (2005)

  10. Han, H., Jeong, W., Wolfram, D.: Log analysis of an academic digital library: user query patterns. In: iConference 2014. iSchools (2014)

  11. Hanbury, A., Müller, H.: Automated component–level evaluation: present and future. In: Agosti, M., Ferro, N., Peters, C., de Rijke, M., Smeaton, A. (eds.) CLEF 2010. LNCS, vol. 6360, pp. 124–135. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15998-5_14

  12. Harman, D., Buckley, C.: Overview of the reliable information access workshop. Inf. Retr. 12(6), 615–641 (2009)

  13. Khabsa, M., Wu, Z., Giles, C.L.: Towards better understanding of academic search. In: JCDL 2016, pp. 111–114. ACM (2016)

  14. Kluck, M., Gey, F.C.: The domain-specific task of CLEF - specific evaluation strategies in cross-language information retrieval. In: Peters, C. (ed.) CLEF 2000. LNCS, vol. 2069, pp. 48–56. Springer, Heidelberg (2001). doi:10.1007/3-540-44645-1_5

  15. Kluck, M., Stempfhuber, M.: Domain-specific track CLEF 2005: overview of results and approaches, remarks on the assessment analysis. In: Peters, C., Gey, F.C., Gonzalo, J., Müller, H., Jones, G.J.F., Kluck, M., Magnini, B., Rijke, M. (eds.) CLEF 2005. LNCS, vol. 4022, pp. 212–221. Springer, Heidelberg (2006). doi:10.1007/11878773_25

  16. Kürsten, J.: A generic approach to component-level evaluation in information retrieval. Ph.D. thesis, Technical University Chemnitz, Germany (2012)

  17. Li, X., Schijvenaars, B.J., de Rijke, M.: Investigating queries and search failures in academic search. Inf. Process. Manag. 53(3), 666–683 (2017)

  18. Mayr, P., Scharnhorst, A., Larsen, B., Schaer, P., Mutschke, P.: Bibliometric-enhanced information retrieval. In: de Rijke, M., Kenter, T., Vries, A.P., Zhai, C.X., Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 798–801. Springer, Cham (2014). doi:10.1007/978-3-319-06028-6_99

  19. McCarn, D.B., Leiter, J.: On-line services in medicine and beyond. Science 181(4097), 318–324 (1973)

  20. Scholer, F., Garcia, S.: A case for improved evaluation of query difficulty prediction. In: SIGIR 2009, pp. 640–641. ACM (2009)

  21. Vanopstal, K., Buysschaert, J., Laureys, G., Stichele, R.V.: Lost in PubMed: factors influencing the success of medical information retrieval. Expert Syst. Appl. 40(10), 4106–4114 (2013)

  22. Verberne, S., Sappelli, M., Kraaij, W.: Query term suggestion in academic search. In: de Rijke, M., Kenter, T., Vries, A.P., Zhai, C.X., Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 560–566. Springer, Cham (2014). doi:10.1007/978-3-319-06028-6_57

  23. Voorhees, E.M.: The TREC robust retrieval track. ACM SIGIR Forum 39, 11–20 (2005)

  24. Ware, M., Mabe, M.: The STM report: an overview of scientific and scholarly journal publishing (2015). http://www.stm-assoc.org/2015_02_20_STM_Report_2015.pdf

  25. Web of Science: Journal Citation Report. Thomson Reuters (2015)

Author information

Corresponding author

Correspondence to Vivien Petras.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Dietz, F., Petras, V. (2017). A Component-Level Analysis of an Academic Search Test Collection. Part I: System and Collection Configurations. In: Jones, G., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2017. Lecture Notes in Computer Science, vol 10456. Springer, Cham. https://doi.org/10.1007/978-3-319-65813-1_2

  • DOI: https://doi.org/10.1007/978-3-319-65813-1_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65812-4

  • Online ISBN: 978-3-319-65813-1

  • eBook Packages: Computer Science, Computer Science (R0)
