DOI:10.1145/3539618.3591825
short-paper

Exploratory Visualization Tool for the Continuous Evaluation of Information Retrieval Systems

Published: 18 July 2023

Abstract

This paper introduces a novel visualization tool that facilitates the exploratory analysis of continuous evaluation for information retrieval systems. We base our analysis on score standardization and meta-analysis techniques applied to information retrieval evaluation. We present three functionalities: evaluation overview, delta evaluation, and meta-analysis applied to three perspectives: evaluation rounds, queries, and systems. To illustrate the use of the tool, we provide an example using the TREC-COVID test collection.
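
To make the notion of score standardization across evaluation rounds more concrete, the following is a minimal sketch and not the tool's actual implementation, which this page does not show. It assumes the common z-score form of standardization, where each per-query score is centred and scaled by the mean and standard deviation of all systems on that query, and it treats a "delta" simply as the difference of a system's mean standardized score between two rounds. The function names, the use of the population standard deviation, and the system names and scores are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Z-score every per-query score against all systems on the same query.

    `scores` maps system name -> {query id -> raw metric value}.
    Assumption: population standard deviation; other variants exist.
    """
    queries = sorted({q for per_query in scores.values() for q in per_query})
    z = {system: {} for system in scores}
    for q in queries:
        values = [pq[q] for pq in scores.values() if q in pq]
        mu, sigma = mean(values), pstdev(values)
        for system, pq in scores.items():
            if q in pq:
                # If all systems tie on a query, it carries no signal: use 0.
                z[system][q] = 0.0 if sigma == 0 else (pq[q] - mu) / sigma
    return z


def mean_standardized(scores):
    """Average the standardized scores of each system over its queries."""
    return {s: mean(pq.values()) for s, pq in standardize(scores).items()}


def delta(round_a, round_b):
    """Change in mean standardized score for systems present in both rounds."""
    a, b = mean_standardized(round_a), mean_standardized(round_b)
    return {s: b[s] - a[s] for s in a.keys() & b.keys()}


# Hypothetical scores for two evaluation rounds with different query sets.
round_1 = {"bm25": {"q1": 0.2, "q2": 0.4}, "neural": {"q1": 0.4, "q2": 0.8}}
round_2 = {"bm25": {"q1": 0.5, "q3": 0.3}, "neural": {"q1": 0.3, "q3": 0.1}}

for system, change in sorted(delta(round_1, round_2).items()):
    print(f"{system}: {change:+.2f}")
```

Because each round is standardized against its own set of systems and queries, the deltas compare relative standing rather than raw metric values, which is one way to compare rounds whose queries or documents differ.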

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2023
3567 pages
ISBN:9781450394086
DOI:10.1145/3539618
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. continuous evaluation
  2. information retrieval
  3. visualization

Qualifiers

  • Short-paper

Conference

SIGIR '23

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
