Abstract
LambdaMART, a potent black-box Learning-to-Rank (LTR) model, has been shown to outperform neural network models on tabular ranking benchmarks. However, its lack of transparency hinders its application in many real-world domains. Local list-wise explanation techniques provide scores that quantify the importance of the features in a list of documents, associated with a query, to the prediction of a black-box LTR model. This study investigates which list-wise explanation techniques provide the most faithful explanations for LambdaMART models. Several local explanation techniques are evaluated for this purpose: Greedy Score, RankLIME, EXS, LIRME, LIME, and SHAP. In addition, a non-LTR explanation technique, Permutation Importance (PMI), is applied to obtain list-wise explanations of LambdaMART. The techniques are compared on eight evaluation metrics: Consistency, Completeness, Validity, Fidelity, ExplainNCDG@10, (In)fidelity, Ground Truth, and Feature Frequency similarity. The evaluation is performed on three benchmark datasets, Yahoo, Microsoft Bing Search (MSLR-WEB10K), and LETOR 4 (MQ2008), along with a synthetic dataset. The experimental results show that no single explanation technique is faithful across all datasets and evaluation metrics. Moreover, the techniques tend to be faithful with respect to different subsets of the metrics; for example, RankLIME outperforms the other techniques on Fidelity and ExplainNCDG@10, while PMI provides the most faithful explanations with respect to Validity and Completeness. Finally, we show that the explanation sample size and the normalization of feature importance scores can substantially affect the faithfulness of the explanation techniques across all datasets.
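To make the list-wise adaptation of PMI concrete, the following is a minimal sketch, not the authors' implementation: it trains a LambdaMART-style ranker using LightGBM's LGBMRanker on synthetic data and scores each feature by the drop in NDCG@10, measured against the model's own original ranking, when that feature's values are permuted within a single query's document list. All names (listwise_pmi, the data shapes, the use of the model's ranking as pseudo-relevance) are illustrative assumptions rather than details from the paper, and the exact evaluation protocol in the study may differ.

```python
# A minimal sketch (assumed, not the authors' code) of list-wise Permutation
# Importance (PMI) for a LambdaMART-style ranker. Assumptions: LightGBM's
# LGBMRanker as the black-box model, scikit-learn's ndcg_score as the rank
# similarity measure, and the model's own original ranking as the reference.
import numpy as np
from lightgbm import LGBMRanker
from sklearn.metrics import ndcg_score

rng = np.random.default_rng(0)

# Synthetic training data: 100 queries, 10 documents each, 5 features.
n_queries, list_size, n_features = 100, 10, 5
X = rng.normal(size=(n_queries * list_size, n_features))
y = rng.integers(0, 5, size=n_queries * list_size)  # graded relevance 0-4
groups = [list_size] * n_queries                    # documents per query

model = LGBMRanker(objective="lambdarank", n_estimators=50)
model.fit(X, y, group=groups)

def listwise_pmi(model, X_list, n_repeats=10, k=10):
    """Score each feature by the average drop in NDCG@k, relative to the
    model's original ranking of this query's list, after permuting that
    feature's values across the documents in the list."""
    base_scores = model.predict(X_list)
    # Use the model's own ranking as pseudo-relevance (label-free):
    # the top-ranked document receives the highest gain.
    pseudo_rel = base_scores.argsort().argsort().reshape(1, -1)
    importances = np.zeros(X_list.shape[1])
    for j in range(X_list.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X_list.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            perm_scores = model.predict(X_perm).reshape(1, -1)
            drops.append(1.0 - ndcg_score(pseudo_rel, perm_scores, k=k))
        importances[j] = np.mean(drops)
    return importances

# List-wise explanation for the first query's ten documents.
print(listwise_pmi(model, X[:list_size]))
```

Measuring the drop against the model's own ranking makes the scores label-free, so they explain the model's ranking behavior rather than its accuracy; replacing pseudo_rel with graded relevance labels would instead recover the classical, performance-based permutation importance in the style of Breiman and of Fisher et al.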
Notes
- 1. Other types of local explanations include rules, counterfactual explanations, and examples, which, however, fall outside the focus of our study; see [25] for more details on such types of explanation.
- 2.
- 3. For brevity, we refer to KernelSHAP as SHAP.
- 4.
- 5. Concordant pairs are the pairs of documents that maintain the same relative rank between two ranked lists; a short sketch of counting such pairs follows this list.
- 6. For a detailed description of the sampling algorithm, see [15].
- 7. For more details about the datasets, see the corresponding studies.
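As a concrete illustration of note 5, the following is a minimal sketch, with illustrative names only, of computing the fraction of concordant pairs between two ranked lists; the paper does not prescribe this exact implementation.

```python
# A minimal sketch of counting concordant pairs between two ranked lists.
# `rank_a` and `rank_b` give each document's rank position in the two lists
# (illustrative names, not taken from the paper).
from itertools import combinations

def concordant_fraction(rank_a, rank_b):
    """Fraction of document pairs whose relative order is the same in
    both rankings (ties are treated as discordant here)."""
    pairs = list(combinations(range(len(rank_a)), 2))
    concordant = sum(
        1 for i, j in pairs
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) > 0
    )
    return concordant / len(pairs)

# Two rankings of four documents, ranks 1 (top) to 4 (bottom):
# only the pair of documents 2 and 3 swaps order, so 5/6 pairs agree.
print(concordant_fraction([1, 2, 3, 4], [1, 3, 2, 4]))  # 0.8333...
```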
References
Agarwal, C., et al.: OpenXAI: towards a transparent evaluation of model explanations. Adv. Neural Inf. Process. Syst. 35, 15784–15799 (2022)
Akhavan Rahnama, A.H.: The blame problem in evaluating local explanations and how to tackle it. In: Nowaczyk, S., et al. (eds.) ECAI 2023, vol. 1947, pp. 66–86. Springer, Heidelberg (2023). https://doi.org/10.1007/978-3-031-50396-2_4
Alsulmi, M., Carterette, B.: Improving medical search tasks using learning to rank. In: 2018 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), pp. 1–8. IEEE (2018)
Arias-Duart, A., Parés, F., Garcia-Gasulla, D., Gimenez-Abalos, V.: Focus! rating xai methods and finding biases. In: 2022 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp. 1–8. IEEE (2022)
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
Burges, C.J.C.: From RankNet to LambdaRank to LambdaMART: an overview. Learning 11(23–581), 81 (2010)
Chapelle, O., Chang, Y.: Yahoo! learning to rank challenge overview. In: Proceedings of the Learning to Rank Challenge, pp. 1–24. PMLR (2011)
Chapelle, O., Chang, Y., Liu, T.Y.: Future directions in learning to rank. In: Proceedings of the Learning to Rank Challenge, pp. 91–100. PMLR (2011)
Chen, H., Zhang, H., Boning, D., Hsieh, C.J.: Robust decision trees against adversarial examples. In: International Conference on Machine Learning, pp. 1122–1131. PMLR (2019)
Chen, T., et al.: XGBoost: extreme gradient boosting. R package version 0.4-2, 1(4), 1–4 (2015)
Chen, W., Liu, T.Y., Lan, Y., Ma, Z.M., Li, H.: Ranking measures and loss functions in learning to rank. Adv. Neural Inf. Process. Syst. 22 (2009)
Chowdhury, T., Rahimi, R., Allan, J.: Rank-LIME: local model-agnostic feature attribution for learning to rank. arXiv preprint arXiv:2212.12722 (2022)
Fisher, A., Rudin, C., Dominici, F.: All models are wrong, but many are useful: learning a variable’s importance by studying an entire class of prediction models simultaneously. J. Mach. Learn. Res. 20(177), 1–81 (2019)
Freitas, A.A.: Comprehensible classification models: a position paper. ACM SIGKDD Explorat. Newsl. 15(1), 1–10 (2014)
Garreau, D., von Luxburg, U.: Looking deeper into tabular LIME. arXiv preprint arXiv:2008.11092 (2020)
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., Pedreschi, D.: A survey of methods for explaining black box models. ACM Comput. Surv. (CSUR) 51(5), 1–42 (2018)
Hedström, A., et al.: Quantus: an explainable AI toolkit for responsible evaluation of neural network explanations and beyond. J. Mach. Learn. Res. 24(34), 1–11 (2023)
Hsieh, C.Y., et al.: Evaluations and methods for explanation through robustness analysis. In: Proceedings of International Conference on Learning Representations (2020)
Jain, A., Ravula, M., Ghosh, J.: Biased models have biased explanations. arXiv preprint arXiv:2012.10986 (2020)
Ke, G., et al.: LightGBM: a highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 30 (2017)
Liu, T.Y.: Learning to Rank for Information Retrieval. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-14267-3
Liu, Y., Khandagale, S., White, C., Neiswanger, W.: Synthetic benchmarks for scientific research in explainable machine learning. arXiv preprint arXiv:2106.12543 (2021)
Lundberg, S.M., Lee, S.-I.: A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017)
Lyu, L., Anand, A.: Listwise explanations for ranking models using multiple explainers. In: Kamps, J., et al. (eds.) European Conference on Information Retrieval, vol. 13890, pp. 653–668. Springer, Heidelberg (2023). https://doi.org/10.1007/978-3-031-28244-7_41
Molnar, C.: Interpretable machine learning (2020). https://www.lulu.com/
Molnar, C., et al.: General pitfalls of model-agnostic interpretation methods for machine learning models. In: Holzinger, A., Goebel, R., Fong, R., Moon, T., Muller, K.R., Samek, W. (eds.) xxAI 2020. LNCS, vol. 13200, pp. 39–68. Springer, Heidelberg (2022). https://doi.org/10.1007/978-3-031-04083-2_4
Qin, T., Liu, T.-Y., Jun, X., Li, H.: LETOR: a benchmark collection for research on learning to rank for information retrieval. Inf. Retr. 13(4), 346–374 (2010)
Qin, Z., et al.: Are neural rankers still outperformed by gradient boosted decision trees? In: The International Conference on Learning Representations (ICLR) (2021)
Rahnama, A.H.A., Bütepage, J., Geurts, P., Boström, H.: Can local explanation techniques explain linear additive models? Data Min. Knowl. Disc. 38(1), 237–280 (2024)
Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?” Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144 (2016)
Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019)
Singh, J., Anand, A.: EXS: explainable search using local model agnostic interpretability. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 770–773 (2019)
Singh, J., Khosla, M., Wang, Z., Anand, A.: Extracting per query valid explanations for blackbox learning-to-rank models. In: Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, pp. 203–210 (2021)
ter Hoeve, M., Schuth, A., Odijk, D., de Rijke, M.: Faithfully explaining rankings in a news recommender system. arXiv preprint arXiv:1805.05447 (2018)
Verma, M., Ganguly, D.: LIRME: locally interpretable ranking model explanation. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1281–1284 (2019)
Yeh, C.K., Hsieh, C.Y., Suggala, A., Inouye, D.I., Ravikumar, P.K.: On the (in)fidelity and sensitivity of explanations. Adv. Neural Inf. Process. Syst. 32 (2019)
Yu, P., Rahimi, R., Allan, J.: Towards explainable search results: a listwise explanation generator. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 669–680 (2022)
Zehlike, M., Yang, K., Stoyanovich, J.: Fairness in ranking, part i: score-based ranking. ACM Comput. Surv. 55(6), 1–36 (2022)
Zhang, C., Zhang, H., Hsieh, C.-J.: An efficient adversarial attack for tree ensembles. Adv. Neural Inf. Process. Syst. 33, 16165–16176 (2020)
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Rahnama, A.H.A., Bütepage, J., Boström, H. (2024). Local List-Wise Explanations of LambdaMART. In: Longo, L., Lapuschkin, S., Seifert, C. (eds) Explainable Artificial Intelligence. xAI 2024. Communications in Computer and Information Science, vol 2154. Springer, Cham. https://doi.org/10.1007/978-3-031-63797-1_19
DOI: https://doi.org/10.1007/978-3-031-63797-1_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63796-4
Online ISBN: 978-3-031-63797-1