Abstract
Pseudo test collections are automatically generated to provide training material for learning to rank methods. We propose a method for generating pseudo test collections in the domain of digital libraries, where data is relatively sparse but comes with rich annotations. Our intuition is that documents are annotated to make them easier to find for certain information needs. We use these annotations and the associated documents as a source of pairs of queries and relevant documents. We investigate how learning to rank performance varies when we use different methods for sampling annotations, and show how our pseudo test collection ranks systems compared to editorial topics with editorial judgments. Our results demonstrate that it is possible to train a learning to rank algorithm on generated pseudo judgments. In some cases, performance is on par with learning on manually obtained ground truth.
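As a rough illustration of the core idea, the sketch below builds pseudo query-relevance pairs by inverting an annotation-document relation and sampling annotations as pseudo queries. This is a minimal sketch under assumed data structures (documents as `(doc_id, annotations)` pairs) and a simple uniform sampling strategy; the paper compares several sampling methods, so this should not be read as the authors' actual procedure.

```python
import random
from collections import defaultdict


def build_pseudo_test_collection(documents, num_queries, seed=42):
    """Sample annotations as pseudo queries; the documents carrying an
    annotation become its pseudo-relevant documents.

    `documents` is assumed to be an iterable of (doc_id, annotations) pairs,
    where `annotations` is a list of annotation strings. This structure is
    illustrative, not the format used in the paper.
    """
    rng = random.Random(seed)

    # Invert the annotation-document relation.
    docs_per_annotation = defaultdict(set)
    for doc_id, annotations in documents:
        for annotation in annotations:
            docs_per_annotation[annotation].add(doc_id)

    # Sample annotations uniformly at random as pseudo queries.
    # (Uniform sampling is only a stand-in for the sampling methods
    # the paper actually investigates.)
    candidates = list(docs_per_annotation)
    sampled = rng.sample(candidates, min(num_queries, len(candidates)))

    # Each sampled annotation yields a pseudo query with pseudo-relevant docs,
    # which can then serve as training judgments for a learning to rank method.
    return {annotation: docs_per_annotation[annotation] for annotation in sampled}


if __name__ == "__main__":
    docs = [
        ("d1", ["labour market", "unemployment"]),
        ("d2", ["unemployment", "social policy"]),
        ("d3", ["social policy"]),
    ]
    for query, relevant in build_pseudo_test_collection(docs, num_queries=2).items():
        print(query, "->", sorted(relevant))
```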
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Berendsen, R., Tsagkias, M., de Rijke, M., Meij, E. (2012). Generating Pseudo Test Collections for Learning to Rank Scientific Articles. In: Catarci, T., Forner, P., Hiemstra, D., Peñas, A., Santucci, G. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics. CLEF 2012. Lecture Notes in Computer Science, vol 7488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33247-0_6
DOI: https://doi.org/10.1007/978-3-642-33247-0_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33246-3
Online ISBN: 978-3-642-33247-0