Generating Pseudo Test Collections for Learning to Rank Scientific Articles

  • Conference paper
Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics (CLEF 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7488)

Abstract

Pseudo test collections are automatically generated to provide training material for learning to rank methods. We propose a method for generating pseudo test collections in the domain of digital libraries, where data is relatively sparse but comes with rich annotations. Our intuition is that documents are annotated to make them easier to find for certain information needs. We use these annotations and the associated documents as a source of pairs of queries and relevant documents. We investigate how learning to rank performance varies when we use different methods for sampling annotations, and we show how our pseudo test collection ranks systems compared to editorial topics with editorial judgements. Our results demonstrate that it is possible to train a learning to rank algorithm on generated pseudo judgements. In some cases, performance is on par with learning on manually obtained ground truth.
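
To make the idea concrete, the sketch below illustrates one way such pseudo query-document pairs could be generated: each document's subject annotations are indexed, a subset of annotation terms is sampled (uniformly, or in proportion to how many documents carry them), and every sampled term becomes a pseudo query whose annotated documents are treated as relevant. This is a minimal illustration under assumed data layout and sampling strategies, not the pipeline described in the paper; the field names (e.g. "annotations") and the two strategies are hypothetical.

```python
# Minimal sketch (not the authors' pipeline) of turning document annotations
# into pseudo training data for learning to rank.
# Assumption: each document is a dict with an "id" and a set of "annotations"
# (e.g. controlled-vocabulary subject terms); field names are illustrative.

import random
from collections import defaultdict


def build_annotation_index(documents):
    """Map each annotation term to the ids of documents that carry it."""
    index = defaultdict(set)
    for doc in documents:
        for term in doc["annotations"]:
            index[term].add(doc["id"])
    return index


def sample_annotations(index, k, strategy="uniform", rng=random):
    """Pick k annotation terms to act as pseudo queries.

    "uniform" draws terms with equal probability; "by_frequency" prefers
    terms attached to many documents. Both are hypothetical strategies,
    standing in for the sampling methods compared in the paper.
    """
    terms = list(index)
    if strategy == "uniform":
        return rng.sample(terms, min(k, len(terms)))
    if strategy == "by_frequency":
        weights = [len(index[t]) for t in terms]
        return rng.choices(terms, weights=weights, k=k)
    raise ValueError(f"unknown strategy: {strategy}")


def pseudo_judgements(index, sampled_terms):
    """Yield (pseudo query, document id, relevance label) triples.

    Documents annotated with the sampled term are labelled relevant (1).
    """
    for term in sampled_terms:
        for doc_id in index[term]:
            yield term, doc_id, 1


if __name__ == "__main__":
    docs = [
        {"id": "d1", "annotations": {"information retrieval", "evaluation"}},
        {"id": "d2", "annotations": {"evaluation", "digital libraries"}},
        {"id": "d3", "annotations": {"machine learning"}},
    ]
    idx = build_annotation_index(docs)
    queries = sample_annotations(idx, k=2, strategy="uniform")
    for triple in pseudo_judgements(idx, queries):
        print(triple)
```

In a pairwise learning to rank setup, non-relevant training examples would additionally be sampled from documents that do not carry the pseudo query's annotation.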

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Berendsen, R., Tsagkias, M., de Rijke, M., Meij, E. (2012). Generating Pseudo Test Collections for Learning to Rank Scientific Articles. In: Catarci, T., Forner, P., Hiemstra, D., Peñas, A., Santucci, G. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics. CLEF 2012. Lecture Notes in Computer Science, vol 7488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33247-0_6

  • DOI: https://doi.org/10.1007/978-3-642-33247-0_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33246-3

  • Online ISBN: 978-3-642-33247-0

  • eBook Packages: Computer Science (R0)
