Axiomatic Retrieval Experimentation with ir_axioms

Authors: Alexander Bondarenko, Jan Heinrich Reimer, Michael Völske, Matthias Hagen

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 3131 - 3140

https://doi.org/10.1145/3477495.3531743

Published: 07 July 2022

Abstract

Axiomatic approaches to information retrieval have played a key role in determining basic constraints that characterize good retrieval models. Beyond their importance in retrieval theory, axioms have been operationalized to improve an initial ranking, to "guide" retrieval, or to explain some model's rankings. However, recent open-source retrieval frameworks like PyTerrier and Pyserini, which made it easy to experiment with sparse and dense retrieval models, have not included any retrieval axiom support so far.

To fill this gap, we propose ir_axioms, an open-source Python framework that integrates retrieval axioms with common retrieval frameworks. We include reference implementations for 25 retrieval axioms, as well as components for preference aggregation, re-ranking, and evaluation. New axioms can easily be defined by implementing an abstract data type or by intuitively combining existing axioms with Python operators or regression. Integration with PyTerrier and ir_datasets makes standard retrieval models, corpora, topics, and relevance judgments (including those used at TREC) immediately accessible for axiomatic experimentation. Our experiments on the TREC Deep Learning tracks showcase some potential research questions that ir_axioms can help to address.
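To make the ideas of pairwise axioms, operator-based combination, and preference aggregation more concrete, the following is a minimal, self-contained sketch. It does not use the ir_axioms API: the names `Axiom`, `TermFrequency`, `ShorterDocument`, `Sum`, `Weighted`, and `rerank` are hypothetical, the two toy axioms are only loosely inspired by term-frequency and length-normalization constraints, and the aggregation step is a simple net-preference count chosen for brevity rather than the aggregation used by the framework.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str


def sign(x: float) -> int:
    """Return 1, 0, or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)


class Axiom(ABC):
    """Hypothetical pairwise axiom (not the ir_axioms interface).

    preference() is positive if d1 should rank above d2,
    negative for the opposite, and 0 for no preference.
    """

    @abstractmethod
    def preference(self, query: str, d1: Doc, d2: Doc) -> float: ...

    def __add__(self, other: Axiom) -> Axiom:
        return Sum(self, other)

    def __mul__(self, weight: float) -> Axiom:
        return Weighted(self, weight)


@dataclass(frozen=True)
class Sum(Axiom):
    left: Axiom
    right: Axiom

    def preference(self, query: str, d1: Doc, d2: Doc) -> float:
        return self.left.preference(query, d1, d2) + self.right.preference(query, d1, d2)


@dataclass(frozen=True)
class Weighted(Axiom):
    axiom: Axiom
    weight: float

    def preference(self, query: str, d1: Doc, d2: Doc) -> float:
        return self.weight * self.axiom.preference(query, d1, d2)


class TermFrequency(Axiom):
    """Toy axiom: prefer the document with more query term occurrences."""

    def preference(self, query: str, d1: Doc, d2: Doc) -> float:
        terms = query.lower().split()
        tf1 = sum(d1.text.lower().split().count(t) for t in terms)
        tf2 = sum(d2.text.lower().split().count(t) for t in terms)
        return sign(tf1 - tf2)


class ShorterDocument(Axiom):
    """Toy axiom: prefer the shorter document."""

    def preference(self, query: str, d1: Doc, d2: Doc) -> float:
        return sign(len(d2.text.split()) - len(d1.text.split()))


def rerank(axiom: Axiom, query: str, docs: list[Doc]) -> list[Doc]:
    """Aggregate pairwise preferences into a ranking by net preference score."""
    scores = {d.doc_id: 0.0 for d in docs}
    for i, d1 in enumerate(docs):
        for d2 in docs[i + 1:]:
            p = axiom.preference(query, d1, d2)
            scores[d1.doc_id] += p
            scores[d2.doc_id] -= p
    # sorted() is stable, so tied documents keep their initial order.
    return sorted(docs, key=lambda d: -scores[d.doc_id])


docs = [
    Doc("a", "cats sleep all day long in the sun"),
    Doc("b", "cats chase cats"),
    Doc("c", "dogs bark"),
]
combined = TermFrequency() * 2.0 + ShorterDocument()
ranking = rerank(combined, "cats", docs)
print([d.doc_id for d in ranking])
```

Overloading `+` and `*` lets a weighted combination read like an arithmetic expression, which is one plausible reading of "combining existing axioms with Python operators"; the actual operators and their semantics in ir_axioms may differ.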

Supplementary Material

MP4 File (SIGIR22-rp1813.mp4)

Presentation video showcasing axiomatic retrieval experimentation with ir_axioms

Cited By

Thakur N, Bonifacio L, Fröbe M, Bondarenko A, Kamalloo E, Potthast M, Hagen M, Lin J, Hui Yang G, Wang H, Han S, Hauff C, Zuccon G, Zhang Y (2024). Systematic Evaluation of Neural Retrieval Models on the Touché 2020 Argument Retrieval Subset of BEIR. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1420-1430. Online publication date: 10-Jul-2024. https://dl.acm.org/doi/10.1145/3626772.3657861

Anand A, Saha S, Sen P, Mitra M (2023). Explainability of Text Processing and Retrieval Methods. Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, 153-157. Online publication date: 15-Dec-2023. https://dl.acm.org/doi/10.1145/3632754.3632944

Anand A, Sen P, Saha S, Verma M, Mitra M, Chen H, Duh W, Huang H, Kato M, Mothe J, Poblete B (2023). Explainable Information Retrieval. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3448-3451. Online publication date: 19-Jul-2023. https://dl.acm.org/doi/10.1145/3539618.3594249

Index Terms

Axiomatic Retrieval Experimentation with ir_axioms
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2022

3569 pages

ISBN:9781450387323

DOI:10.1145/3477495

General Chairs: Enrique Amigo (UNED), Pablo Castells (UAM and Amazon), Julio Gonzalo (UNED)

Program Chairs: Ben Carterette (Spotify), J. Shane Culpepper (RMIT University), Gabriella Kazai (Waseda University)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Deutsche Forschungsgemeinschaft

Conference

SIGIR '22

Sponsor:

SIGIR

SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 11 - 15, 2022

Madrid, Spain

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics

Total Citations: 3
Total Downloads: 177
Downloads (Last 12 months): 54
Downloads (Last 6 weeks): 7

Reflects downloads up to 15 Feb 2025
