ABSTRACT
As information retrieval (IR) systems, such as search engines and conversational agents, become ubiquitous across domains, the need for transparent and explainable systems grows to ensure accountability, fairness, and unbiased results. Despite many recent advances in explainable AI and IR techniques, there is no consensus on what it means for a system to be explainable. Although a growing body of literature suggests that explainability comprises multiple subfactors [2, 5, 6], virtually all existing approaches treat it as a singular notion. Additionally, while neural retrieval models (NRMs) have become popular for their ability to achieve high performance [3, 4, 7, 8], the explainability of NRMs remained largely unexplored until recent years. Many questions remain open about how best to understand how these complex models arrive at their decisions, and about how well such methods serve both developers and end users.
This research aims to develop effective methods for evaluating and advancing explainable retrieval systems, in service of the broader research goal of making potential biases easier to identify. Specifically, I aim to investigate the following:
RQ1: How do we quantitatively measure explainability?
RQ2: How can we develop a set of inherently explainable NRMs using feature attributions that are robust across different retrieval domain contexts?
RQ3: How can we leverage knowledge about influential training instances to better understand NRMs and promote more efficient search practices?
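RQ1 connects to psychometric approaches to evaluation [1, 9], where explainability is measured through user questionnaires whose items are checked for internal consistency before being aggregated into subfactor scores. As a minimal sketch of one such building block, the function below computes Cronbach's alpha over a respondents-by-items rating matrix (the item names and data are illustrative, not from the proposal):

```python
import statistics

def cronbach_alpha(ratings):
    """Internal-consistency estimate for a set of questionnaire items.

    ratings: list of rows, one per respondent, each row holding that
    respondent's score on every item (respondents x items).
    """
    k = len(ratings[0])                 # number of items
    items = list(zip(*ratings))         # transpose to items x respondents
    item_var = sum(statistics.variance(col) for col in items)
    total_var = statistics.variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - item_var / total_var)

# Three respondents rating three hypothetical explainability items on a
# 1-5 scale; perfectly consistent answers across items yield alpha = 1.
ratings = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(round(cronbach_alpha(ratings), 3))  # -> 1.0
```

In a psychometric evaluation, an alpha near 1 suggests the items measure a common underlying construct and can reasonably be combined into a single subfactor score.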
In future work, I plan to address RQ2 and RQ3 by investigating two avenues of attribution methods, feature-based and instance-based, to develop a suite of explainable NRMs. While much work has been done on investigating the interpretability of deep neural network architectures in the general ML field, particularly in vision and language domains, creating inherently explainable neural architectures remains largely unexplored in IR. Thus, I intend to draw on previous work in the broader fields of NLP and ML to develop methods that offer deeper insights into the inner workings of NRMs and how ranking decisions are made.
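To illustrate the feature-based direction, one simple attribution scheme is occlusion: score each document token by how much the relevance score drops when that token is removed. The sketch below uses a toy term-overlap scorer as a stand-in for a trained NRM; all names and the scoring function are illustrative assumptions, not methods from the proposal:

```python
def occlusion_attributions(score_fn, query, doc_tokens):
    """Attribute the relevance score to each document token by measuring
    how much the score drops when that token is removed (occluded)."""
    base = score_fn(query, doc_tokens)
    return [base - score_fn(query, doc_tokens[:i] + doc_tokens[i + 1:])
            for i in range(len(doc_tokens))]

# Toy stand-in for a trained NRM: count document tokens that appear
# in the query (case-insensitive exact match).
def overlap_score(query, doc_tokens):
    q_terms = set(query.lower().split())
    return sum(tok.lower() in q_terms for tok in doc_tokens)

doc = ["Neural", "models", "rank", "documents"]
print(occlusion_attributions(overlap_score, "neural ranking", doc))
# -> [1, 0, 0, 0]: only "Neural" contributes to the match
```

With a real NRM, `score_fn` would wrap a forward pass of the model; the same interface also suggests why robustness across retrieval domains (RQ2) matters, since attributions inherit any domain-specific quirks of the underlying scorer.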
By developing explainable IR systems, we can facilitate users' comprehension of the intricate, non-linear mechanisms that link their search queries to highly ranked content. If applied correctly, this research has the potential to benefit society in a broad range of applications, such as disinformation detection and clinical decision support. Given their critical importance in modern society, these areas demand robust solutions to combat the escalating dissemination of false information. By enhancing the transparency and accountability of these systems, explainable systems can play a crucial role in curbing this trend.
[1] Catherine Chen and Carsten Eickhoff. 2022. Evaluating Search Explainability with Psychometrics and Crowdsourcing. arXiv preprint arXiv:2210.09430 (2022).
[2] Finale Doshi-Velez and Been Kim. 2017. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608 (2017).
[3] Jiafeng Guo, Yixing Fan, Qingyao Ai, and W Bruce Croft. 2016. A deep relevance matching model for ad-hoc retrieval. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. 55--64.
[4] Jiafeng Guo, Yixing Fan, Liang Pang, Liu Yang, Qingyao Ai, Hamed Zamani, Chen Wu, W Bruce Croft, and Xueqi Cheng. 2020. A deep look into neural ranking models for information retrieval. Information Processing & Management, Vol. 57, 6 (2020), 102067.
[5] Zachary C Lipton. 2018. The Mythos of Model Interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, Vol. 16, 3 (2018), 31--57.
[6] Meike Nauta, Jan Trienes, Shreyasi Pathak, Elisa Nguyen, Michelle Peters, Yasmin Schmitt, Jörg Schlötterer, Maurice van Keulen, and Christin Seifert. 2022. From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI. arXiv preprint arXiv:2201.08164 (2022).
[7] Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage Re-ranking with BERT. arXiv preprint arXiv:1901.04085 (2019).
[8] Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Shengxian Wan, and Xueqi Cheng. 2016. Text matching as image recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 30.
[9] Yinglong Zhang, Jin Zhang, Matthew Lease, and Jacek Gwizdka. 2014. Multidimensional relevance modeling via psychometrics and crowdsourcing. In Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval. 435--444.