LongEval: Longitudinal Evaluation of Model Performance at CLEF 2023

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13982)

Abstract

In this paper, we describe the plans for the first LongEval CLEF 2023 shared task dedicated to evaluating the temporal persistence of Information Retrieval (IR) systems and Text Classifiers. The task is motivated by recent research showing that the performance of these models drops as the test data becomes more distant, with respect to time, from the training data. LongEval differs from traditional shared IR and classification tasks by giving special consideration to evaluating models aiming to mitigate performance drop over time. We envisage that this task will draw attention from the IR community and NLP researchers to the problem of temporal persistence of models, what enables or prevents it, potential solutions and their limitations.
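As a rough illustration of the performance drop the abstract refers to, the sketch below compares a system's score on test data from the training period with its scores on later test data. The function name, the score values, and the use of a relative drop are all assumptions made for illustration; the abstract does not specify how the task measures the drop.

```python
def relative_drop(within_score: float, later_score: float) -> float:
    """Fraction of the within-period score that is lost on a later test set."""
    if within_score <= 0:
        raise ValueError("within_score must be positive")
    return (within_score - later_score) / within_score


# Hypothetical scores for one system, keyed by the gap (in months)
# between the training data and the test data. Not real results.
scores_by_gap = {0: 0.80, 3: 0.76, 12: 0.68}

baseline = scores_by_gap[0]
drops = {gap: relative_drop(baseline, score) for gap, score in scores_by_gap.items()}

for gap, drop in sorted(drops.items()):
    print(f"gap={gap:>2} months  relative drop={drop:.3f}")
```

A system that is more persistent over time would show smaller relative drops as the gap grows.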

Notes

  1. Qwant search engine: https://www.qwant.com/.

  2. https://figshare.com/articles/dataset/TM-Senti/16438281.

  3. https://huggingface.co/roberta-base.

Acknowledgements

This work is supported by the ANR Kodicare bi-lateral project, grant ANR-19-CE23-0029 of the French Agence Nationale de la Recherche, and by the Austrian Science Fund (FWF, grant I4471-N). This work is also supported by a UKRI/EPSRC Turing AI Fellowship to Maria Liakata (grant no. EP/V030302/1) and The Alan Turing Institute (grant no. EP/N510129/1) through project funding and its Enrichment PhD Scheme for Iman Bilal. This work used services provided by the LINDAT/CLARIAH-CZ Research Infrastructure (https://lindat.cz), supported by the Ministry of Education, Youth and Sports of the Czech Republic (Project No. LM2018101).

Author information

Corresponding author

Correspondence to Philippe Mulhem.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Alkhalifa, R. et al. (2023). LongEval: Longitudinal Evaluation of Model Performance at CLEF 2023. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13982. Springer, Cham. https://doi.org/10.1007/978-3-031-28241-6_58

  • DOI: https://doi.org/10.1007/978-3-031-28241-6_58

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-28240-9

  • Online ISBN: 978-3-031-28241-6

  • eBook Packages: Computer Science, Computer Science (R0)
