
ranxhub: An Online Repository for Information Retrieval Runs

Published: 18 July 2023

Abstract

ranxhub is an online repository for sharing artifacts derived from the evaluation of Information Retrieval systems. Specifically, we provide a platform for sharing pre-computed runs: the ranked lists of documents retrieved for a specific set of queries by a retrieval model. We also extend ranx, a Python library for the evaluation and comparison of Information Retrieval runs, adding functionality to integrate ranxhub seamlessly and allowing the user to compare the results of multiple systems in just a few lines of code. In this paper, we first outline the many advantages and implications that an online repository for sharing runs can bring to the table. Then, we introduce ranxhub and its integration with ranx, showing its simple usage. Finally, we discuss some use cases for which ranxhub can be highly valuable for the research community.
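
The comparison workflow described in the abstract can be sketched with ranx's public evaluation API (Qrels, Run, compare). This is a minimal sketch: the Run.from_ranxhub call and the run identifiers used here are assumptions for illustration, and the real identifiers depend on which runs have actually been shared on ranxhub.

```python
from ranx import Qrels, Run, compare

# Relevance judgments for the chosen test collection (TREC-style qrels file).
qrels = Qrels.from_file("qrels.msmarco-passage.dev.txt", kind="trec")

# Pre-computed runs shared by other researchers, pulled from ranxhub.
# The identifiers below are hypothetical examples.
baseline = Run.from_ranxhub("msmarco-passage/trec-dl-2019/bm25")
neural = Run.from_ranxhub("msmarco-passage/trec-dl-2019/dense-retriever")

# A locally produced run to compare against the shared ones.
my_run = Run.from_file("my_system.run", kind="trec")

# Evaluate and compare all systems on the same metrics,
# including statistical significance testing.
report = compare(
    qrels=qrels,
    runs=[baseline, neural, my_run],
    metrics=["map@100", "mrr@10", "ndcg@10"],
    max_p=0.01,  # significance threshold
)
print(report)
```

The resulting report lists per-metric scores for every run together with significance annotations, and ranx can also export it in formats suitable for papers (e.g., a LaTeX table).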




    Published In

    SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2023
    3567 pages
    ISBN: 9781450394086
    DOI: 10.1145/3539618

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. artifacts sharing
    2. comparison
    3. evaluation
    4. information retrieval
    5. online platform
    6. pre-computed runs

    Qualifiers

    • Short-paper

    Conference

    SIGIR '23

    Acceptance Rates

    Overall Acceptance Rate: 792 of 3,983 submissions, 20%

    Cited By

    • (2024) Browsing and Searching Metadata of TREC. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 313-323. https://doi.org/10.1145/3626772.3657873. Online publication date: 10-Jul-2024.
    • (2024) Mission Reproducibility: An Investigation on Reproducibility Issues in Machine Learning and Information Retrieval Research. 2024 IEEE 20th International Conference on e-Science (e-Science), 1-9. https://doi.org/10.1109/e-Science62913.2024.10678657. Online publication date: 16-Sep-2024.
    • (2024) Implicit relational attention network for few-shot knowledge graph completion. Applied Intelligence 54(8), 6433-6443. https://doi.org/10.1007/s10489-024-05511-0. Online publication date: 14-May-2024.
    • (2024) DESIRE-ME: Domain-Enhanced Supervised Information Retrieval Using Mixture-of-Experts. Advances in Information Retrieval, 111-125. https://doi.org/10.1007/978-3-031-56060-6_8. Online publication date: 24-Mar-2024.
