A Component-Level Analysis of an Academic Search Test Collection.

Part I: System and Collection Configurations

  • Conference paper
  • In: Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10456)

Abstract

This study analyzes search performance in an academic search test collection. In a component-level evaluation setting, 3,276 system configurations were tested over 100 topics, varying queries, documents, and system components and yielding 327,600 data points. Additional analyses of the recall base and of the semantic heterogeneity of queries and documents are presented in a parallel paper [6]. The study finds that the structure of the documents and topics, as well as the IR components used, significantly impacts overall performance, whereas more content in either documents or topics does not necessarily improve a search. While the component-level analysis achieved overall performance improvements, it did not find a component that would identify or improve badly performing queries.
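The number of data points reported above is simply the Cartesian product of the configuration grid and the topic set. As a minimal sketch (in Python, with hypothetical component lists standing in for the paper's actual factors, which are described in the full text), such a grid can be enumerated as follows:

    from itertools import product

    # Minimal sketch of a component-level evaluation grid.
    # The component lists below are hypothetical placeholders, NOT the
    # paper's actual configuration space (which totals 3,276 configurations).
    query_fields = ["T", "TD", "TDN"]          # topic fields used as the query
    doc_fields = ["title", "title+abstract"]   # document representations
    stemmers = ["none", "porter", "krovetz"]   # stemming components
    ranking_models = ["tfidf", "okapi", "lm"]  # retrieval models

    configurations = list(product(query_fields, doc_fields, stemmers, ranking_models))
    topics = range(1, 101)  # 100 topics, as in the study

    # Every (configuration, topic) pair yields one scored retrieval run,
    # so the number of data points is the product of the two set sizes.
    print(len(configurations), "configurations")             # 54 in this sketch
    print(len(configurations) * len(topics), "data points")  # 5,400 in this sketch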

Notes

  1. https://www.lemurproject.org/lemur.php, last accessed: 04-30-2017.

  2. http://trec.nist.gov/trec_eval, last accessed: 04-30-2017.

  3. T: Life Satisfaction; D: Find documents which analyze people’s level of satisfaction with their lives; N: Relevant documents report on people’s living conditions with respect to their subjective feeling of satisfaction with their personal life. Documents in which only single areas of everyday life are discussed with respect to satisfaction are also relevant.
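The topic in note 3 follows the usual CLEF/TREC structure of title (T), description (D), and narrative (N) fields. As a minimal sketch of how such fields are commonly combined into query variants of increasing length (the T/TD/TDN scheme is standard practice and assumed here, not quoted from the paper), consider:

    # Hypothetical sketch: building query variants from a CLEF-style topic.
    # The topic text is taken from note 3 above (narrative abridged).
    topic = {
        "T": "Life Satisfaction",
        "D": "Find documents which analyze people's level of satisfaction "
             "with their lives.",
        "N": "Relevant documents report on people's living conditions with "
             "respect to their subjective feeling of satisfaction with their "
             "personal life.",
    }

    # Each variant concatenates progressively more topic fields.
    variants = {
        "T": topic["T"],
        "TD": " ".join([topic["T"], topic["D"]]),
        "TDN": " ".join([topic["T"], topic["D"], topic["N"]]),
    }

    for name, query in variants.items():
        print(f"{name}: {query}")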

References

  1. Behnert, C., Lewandowski, D.: Ranking search results in library information systems - considering ranking approaches adapted from web search engines. J. Acad. Librariansh. 41(6), 725–735 (2015)

  2. Carmel, D., Yom-Tov, E.: Estimating the query difficulty for information retrieval. Synth. Lect. Inf. Concepts Retr. Serv. 2(1), 1–89 (2010)

  3. Chowdhury, G.: Introduction to Modern Information Retrieval. Facet, London (2010)

  4. Cleverdon, C.: The Cranfield tests on index language devices. In: Aslib Proceedings, vol. 19, pp. 173–194. MCB UP Ltd. (1967)

  5. De Loupy, C., Bellot, P.: Evaluation of document retrieval systems and query difficulty. In: LREC 2000, Athens, pp. 32–39 (2000)

  6. Dietz, F., Petras, V.: A component-level analysis of an academic search test collection. Part II: query analysis. In: CLEF 2017 (2017). doi:10.1007/978-3-319-65813-1_3

  7. Ferro, N., Harman, D.: CLEF 2009: Grid@CLEF pilot track overview. In: Peters, C., Nunzio, G.M., Kurimo, M., Mandl, T., Mostefa, D., Peñas, A., Roda, G. (eds.) CLEF 2009. LNCS, vol. 6241, pp. 552–565. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15754-7_68

  8. Ferro, N., Silvello, G.: A general linear mixed models approach to study system component effects. In: SIGIR 2016, pp. 25–34. ACM (2016)

  9. Grivolla, J., Jourlin, P., de Mori, R.: Automatic classification of queries by expected retrieval performance. In: Predicting Query Difficulty Workshop. SIGIR 2005 (2005)

  10. Han, H., Jeong, W., Wolfram, D.: Log analysis of an academic digital library: user query patterns. In: iConference 2014. iSchools (2014)

  11. Hanbury, A., Müller, H.: Automated component–level evaluation: present and future. In: Agosti, M., Ferro, N., Peters, C., de Rijke, M., Smeaton, A. (eds.) CLEF 2010. LNCS, vol. 6360, pp. 124–135. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15998-5_14

  12. Harman, D., Buckley, C.: Overview of the reliable information access workshop. Inf. Retr. 12(6), 615–641 (2009)

  13. Khabsa, M., Wu, Z., Giles, C.L.: Towards better understanding of academic search. In: JCDL 2016, pp. 111–114. ACM (2016)

  14. Kluck, M., Gey, F.C.: The domain-specific task of CLEF - specific evaluation strategies in cross-language information retrieval. In: Peters, C. (ed.) CLEF 2000. LNCS, vol. 2069, pp. 48–56. Springer, Heidelberg (2001). doi:10.1007/3-540-44645-1_5

  15. Kluck, M., Stempfhuber, M.: Domain-specific track CLEF 2005: overview of results and approaches, remarks on the assessment analysis. In: Peters, C., Gey, F.C., Gonzalo, J., Müller, H., Jones, G.J.F., Kluck, M., Magnini, B., Rijke, M. (eds.) CLEF 2005. LNCS, vol. 4022, pp. 212–221. Springer, Heidelberg (2006). doi:10.1007/11878773_25

  16. Kürsten, J.: A generic approach to component-level evaluation in information retrieval. Ph.D. thesis, Technical University Chemnitz, Germany (2012)

  17. Li, X., Schijvenaars, B.J., de Rijke, M.: Investigating queries and search failures in academic search. Inf. Process. Manag. 53(3), 666–683 (2017)

  18. Mayr, P., Scharnhorst, A., Larsen, B., Schaer, P., Mutschke, P.: Bibliometric-enhanced information retrieval. In: de Rijke, M., Kenter, T., Vries, A.P., Zhai, C.X., Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 798–801. Springer, Cham (2014). doi:10.1007/978-3-319-06028-6_99

  19. McCarn, D.B., Leiter, J.: On-line services in medicine and beyond. Science 181(4097), 318–324 (1973)

  20. Scholer, F., Garcia, S.: A case for improved evaluation of query difficulty prediction. In: SIGIR 2009, pp. 640–641. ACM (2009)

  21. Vanopstal, K., Buysschaert, J., Laureys, G., Stichele, R.V.: Lost in PubMed: factors influencing the success of medical information retrieval. Expert Syst. Appl. 40(10), 4106–4114 (2013)

  22. Verberne, S., Sappelli, M., Kraaij, W.: Query term suggestion in academic search. In: de Rijke, M., Kenter, T., Vries, A.P., Zhai, C.X., Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 560–566. Springer, Cham (2014). doi:10.1007/978-3-319-06028-6_57

  23. Voorhees, E.M.: The TREC robust retrieval track. ACM SIGIR Forum 39, 11–20 (2005)

  24. Ware, M., Mabe, M.: The STM report: an overview of scientific and scholarly journal publishing (2015). http://www.stm-assoc.org/2015_02_20_STM_Report_2015.pdf

  25. Web of Science: Journal Citation Report. Thomson Reuters (2015)

Author information

Corresponding author

Correspondence to Vivien Petras.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Dietz, F., Petras, V. (2017). A Component-Level Analysis of an Academic Search Test Collection. Part I: System and Collection Configurations. In: Jones, G., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2017. Lecture Notes in Computer Science, vol 10456. Springer, Cham. https://doi.org/10.1007/978-3-319-65813-1_2

  • DOI: https://doi.org/10.1007/978-3-319-65813-1_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65812-4

  • Online ISBN: 978-3-319-65813-1

  • eBook Packages: Computer Science, Computer Science (R0)
