LongEval: Longitudinal Evaluation of Model Performance at CLEF 2024

Alkhalifa, Rabab; Borkakoty, Hsuvas; Deveaud, Romain; El-Ebshihy, Alaa; Espinosa-Anke, Luis; Fink, Tobias; Gonzalez-Saez, Gabriela; Galuščáková, Petra; Goeuriot, Lorraine; Iommi, David; Liakata, Maria; Madabushi, Harish Tayyar; Medina-Alias, Pablo; Mulhem, Philippe; Piroi, Florina; Popel, Martin; Servan, Christophe; Zubiaga, Arkaitz

doi:10.1007/978-3-031-56072-9_8

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14613)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

This paper introduces the planned second LongEval Lab, part of the CLEF 2024 conference. The aim of the lab’s two tasks is to give researchers test data for addressing temporal effectiveness persistence challenges in both information retrieval and text classification, motivated by the fact that model performance degrades as the test data becomes temporally distant from the training data. LongEval distinguishes itself from traditional IR and classification tasks by emphasizing the evaluation of models designed to mitigate performance drop over time using evolving data. The second LongEval edition will further engage the IR community and NLP researchers in addressing the crucial challenge of temporal persistence in models, exploring the factors that enable or hinder it, and identifying potential solutions along with their limitations.

Notes

1. Because Qwant is mostly used by French speakers, it is easier to gather data (user queries and documents) in French than in English.
2. Qwant search engine: https://www.qwant.com/.
3. https://www.kaggle.com/datasets/edqian/twitter-climate-change-sentiment-dataset.
4. https://huggingface.co/bert-base-uncased.
5. https://clef-longeval.github.io.


Acknowledgements

This work is supported by the ANR Kodicare bi-lateral project, grant ANR-19-CE23-0029 of the French Agence Nationale de la Recherche, and by the Austrian Science Fund (FWF, grant I4471-N). This work is also supported by a UKRI/EPSRC Turing AI Fellowship to Maria Liakata (grant no. EP/V030302/1). This work used services provided by the LINDAT/CLARIAH-CZ Research Infrastructure (https://lindat.cz), supported by the Ministry of Education, Youth and Sports of the Czech Republic (Project No. LM2023062).

Author information

Authors and Affiliations

Queen Mary University of London, London, UK
Rabab Alkhalifa, Maria Liakata & Arkaitz Zubiaga
Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
Rabab Alkhalifa
Cardiff University, Cardiff, UK
Hsuvas Borkakoty & Luis Espinosa-Anke
Qwant, Paris, France
Romain Deveaud & Christophe Servan
Research Studios Austria, Data Science Studio, Vienna, Austria
Alaa El-Ebshihy, Tobias Fink, David Iommi & Florina Piroi
TU Wien, Vienna, Austria
Alaa El-Ebshihy, Tobias Fink & Florina Piroi
AMPLYFI, Cardiff, UK
Luis Espinosa-Anke
Univ. Grenoble Alpes, CNRS, Grenoble INP, Institute of Engineering Univ. Grenoble Alpes, LIG, Grenoble, France
Gabriela Gonzalez-Saez, Lorraine Goeuriot & Philippe Mulhem
University of Stavanger, Stavanger, Norway
Petra Galuščáková
Alan Turing Institute, London, UK
Maria Liakata
University of Warwick, Coventry, UK
Maria Liakata
University of Bath, Bath, UK
Harish Tayyar Madabushi & Pablo Medina-Alias
Charles University, Prague, Czech Republic
Martin Popel
Paris-Saclay University, CNRS, LISN, Paris, France
Christophe Servan

Corresponding author

Correspondence to Philippe Mulhem.

Editor information

Editors and Affiliations

Georgetown University, Washington, DC, USA
Nazli Goharian
University of Pisa, Pisa, Italy
Nicola Tonellotto
King's College London, London, UK
Yulan He
University College London, London, UK
Aldo Lipani
University of Glasgow, Glasgow, UK
Graham McDonald
University of Glasgow, Glasgow, UK
Craig Macdonald
University of Glasgow, Glasgow, UK
Iadh Ounis

Cite this paper

Alkhalifa, R. et al. (2024). LongEval: Longitudinal Evaluation of Model Performance at CLEF 2024. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14613. Springer, Cham. https://doi.org/10.1007/978-3-031-56072-9_8

DOI: https://doi.org/10.1007/978-3-031-56072-9_8
Published: 20 March 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56071-2
Online ISBN: 978-3-031-56072-9
eBook Packages: Computer Science, Computer Science (R0)
