Abstract
This study analyzes search performance in an academic search test collection. In a component-level evaluation setting, 3,276 configurations were tested over 100 topics, varying queries, documents, and system components, and resulting in 327,600 data points. Additional analyses of the recall base and the semantic heterogeneity of queries and documents are presented in a parallel paper. The study finds that the structure of the documents and topics as well as the IR components significantly impact overall performance, while more content in either documents or topics does not necessarily improve a search. Although overall performance improvements were achieved, the component-level analysis did not find a component that would identify or improve badly performing queries.
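To make the scale of the evaluation grid concrete, the short Python sketch below reproduces the data-point arithmetic from the abstract. The individual factor counts are illustrative assumptions; only the totals of 3,276 configurations, 100 topics, and 327,600 data points come from the paper.

```python
# Illustrative sketch of a component-level evaluation grid.
# The factor counts below are assumptions for illustration; only the
# totals (3,276 configurations, 100 topics, 327,600 data points) are
# taken from the abstract.
from itertools import product

query_variants = ["T", "TD", "TDN", "D"]           # assumed query-field combinations
document_variants = ["title", "abstract", "full"]  # assumed document-field combinations
component_variants = range(273)                    # assumed stemmer/ranker/etc. combinations

configurations = list(product(query_variants, document_variants, component_variants))
topics = 100

assert len(configurations) == 3276
print(len(configurations) * topics)  # 327600 data points (one per configuration-topic pair)
```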
Notes
- 1. https://www.lemurproject.org/lemur.php, last accessed: 04-30-2017.
- 2. http://trec.nist.gov/trec_eval, last accessed: 04-30-2017.
- 3. T: Life Satisfaction; D: Find documents which analyze people’s level of satisfaction with their lives.; N: Relevant documents report on people’s living conditions with respect to their subjective feeling of satisfaction with their personal life. Documents are also relevant in which only single areas of everyday life are discussed with respect to satisfaction.
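As a concrete illustration of how the topic fields shown in note 3 (T = title, D = description, N = narrative) can be combined into query variants of increasing length, here is a minimal Python sketch. The specific field combinations are assumptions for illustration, not necessarily the exact variants evaluated in the paper.

```python
# Topic from note 3, split into its T/D/N fields.
topic = {
    "title": "Life Satisfaction",
    "description": ("Find documents which analyze people's level of "
                    "satisfaction with their lives."),
    "narrative": ("Relevant documents report on people's living conditions "
                  "with respect to their subjective feeling of satisfaction "
                  "with their personal life."),
}

# Assumed query variants built from increasingly long field combinations.
query_variants = {
    "T": topic["title"],
    "TD": " ".join([topic["title"], topic["description"]]),
    "TDN": " ".join([topic["title"], topic["description"], topic["narrative"]]),
}

for name, query in query_variants.items():
    print(f"{name}: {query}")
```

Such query variants, indexed against different document-field combinations and system components, are what the configuration grid described in the abstract varies.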
References
Behnert, C., Lewandowski, D.: Ranking search results in library information systems - considering ranking approaches adapted from web search engines. J. Acad. Librariansh. 41(6), 725–735 (2015)
Carmel, D., Yom-Tov, E.: Estimating the query difficulty for information retrieval. Synth. Lect. Inf. Concepts Retr. Serv. 2(1), 1–89 (2010)
Chowdhury, G.: Introduction to Modern Information Retrieval. Facet, London (2010)
Cleverdon, C.: The Cranfield tests on index language devices. In: Aslib Proceedings, vol. 19, pp. 173–194. MCB UP Ltd. (1967)
De Loupy, C., Bellot, P.: Evaluation of document retrieval systems and query difficulty. In: LREC 2000, Athens, pp. 32–39 (2000)
Dietz, F., Petras, V.: A component-level analysis of an academic search test collection. Part II: query analysis. In: CLEF 2017 (2017). doi:10.1007/978-3-319-65813-1_3
Ferro, N., Harman, D.: CLEF 2009: Grid@CLEF pilot track overview. In: Peters, C., Di Nunzio, G.M., Kurimo, M., Mandl, T., Mostefa, D., Peñas, A., Roda, G. (eds.) CLEF 2009. LNCS, vol. 6241, pp. 552–565. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15754-7_68
Ferro, N., Silvello, G.: A general linear mixed models approach to study system component effects. In: SIGIR 2016, pp. 25–34. ACM (2016)
Grivolla, J., Jourlin, P., de Mori, R.: Automatic classification of queries by expected retrieval performance. In: Predicting Query Difficulty Workshop. SIGIR 2005 (2005)
Han, H., Jeong, W., Wolfram, D.: Log analysis of an academic digital library: user query patterns. In: iConference 2014. iSchools (2014)
Hanbury, A., Müller, H.: Automated component-level evaluation: present and future. In: Agosti, M., Ferro, N., Peters, C., de Rijke, M., Smeaton, A. (eds.) CLEF 2010. LNCS, vol. 6360, pp. 124–135. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15998-5_14
Harman, D., Buckley, C.: Overview of the reliable information access workshop. Inf. Retr. 12(6), 615–641 (2009)
Khabsa, M., Wu, Z., Giles, C.L.: Towards better understanding of academic search. In: JCDL 2016, pp. 111–114. ACM (2016)
Kluck, M., Gey, F.C.: The domain-specific task of CLEF - specific evaluation strategies in cross-language information retrieval. In: Peters, C. (ed.) CLEF 2000. LNCS, vol. 2069, pp. 48–56. Springer, Heidelberg (2001). doi:10.1007/3-540-44645-1_5
Kluck, M., Stempfhuber, M.: Domain-specific track CLEF 2005: overview of results and approaches, remarks on the assessment analysis. In: Peters, C., Gey, F.C., Gonzalo, J., Müller, H., Jones, G.J.F., Kluck, M., Magnini, B., de Rijke, M. (eds.) CLEF 2005. LNCS, vol. 4022, pp. 212–221. Springer, Heidelberg (2006). doi:10.1007/11878773_25
Kürsten, J.: A generic approach to component-level evaluation in information retrieval. Ph.D. thesis, Technical University Chemnitz, Germany (2012)
Li, X., Schijvenaars, B.J., de Rijke, M.: Investigating queries and search failures in academic search. Inf. Process. Manag. 53(3), 666–683 (2017)
Mayr, P., Scharnhorst, A., Larsen, B., Schaer, P., Mutschke, P.: Bibliometric-enhanced information retrieval. In: de Rijke, M., Kenter, T., de Vries, A.P., Zhai, C.X., de Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 798–801. Springer, Cham (2014). doi:10.1007/978-3-319-06028-6_99
McCarn, D.B., Leiter, J.: On-line services in medicine and beyond. Science 181(4097), 318–324 (1973)
Scholer, F., Garcia, S.: A case for improved evaluation of query difficulty prediction. In: SIGIR 2009, pp. 640–641. ACM (2009)
Vanopstal, K., Buysschaert, J., Laureys, G., Stichele, R.V.: Lost in PubMed. Factors influencing the success of medical information retrieval. Expert Syst. Appl. 40(10), 4106–4114 (2013)
Verberne, S., Sappelli, M., Kraaij, W.: Query term suggestion in academic search. In: de Rijke, M., Kenter, T., de Vries, A.P., Zhai, C.X., de Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 560–566. Springer, Cham (2014). doi:10.1007/978-3-319-06028-6_57
Voorhees, E.M.: The TREC robust retrieval track. ACM SIGIR Forum 39, 11–20 (2005)
Ware, M., Mabe, M.: The STM report: an overview of scientific and scholarly journal publishing (2015). http://www.stm-assoc.org/2015_02_20_STM_Report_2015.pdf
Web of Science: Journal Citation Report. Thomson Reuters (2015)
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Dietz, F., Petras, V. (2017). A Component-Level Analysis of an Academic Search Test Collection. In: Jones, G., et al. (eds.) Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2017. Lecture Notes in Computer Science, vol. 10456. Springer, Cham. https://doi.org/10.1007/978-3-319-65813-1_2
DOI: https://doi.org/10.1007/978-3-319-65813-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65812-4
Online ISBN: 978-3-319-65813-1
eBook Packages: Computer Science (R0)