DOI: 10.1145/3477495.3532067

Towards Explainable Search Results: A Listwise Explanation Generator

Published: 07 July 2022


It has been shown that the interpretability of search results is enhanced when the query aspects covered by documents are explicitly provided. However, existing work on aspect-oriented explanation of search results explains each document independently, so the resulting explanations cannot describe the differences between documents. Existing models for query aspect generation share this limitation. Furthermore, these models provide a single query aspect for each document, even though documents often cover multiple query aspects. To overcome these limitations, we propose LiEGe, an approach that jointly explains all documents in a search result list. LiEGe provides semantic representations at two levels of granularity (documents and their tokens) using different interaction signals, including cross-document interactions. These representations allow listwise modeling of a search result list as well as the generation of coherent explanations for documents. To appropriately explain documents that cover multiple query aspects, we introduce two settings for search result explanation: comprehensive and novelty explanation generation. LiEGe is trained and evaluated for both settings. We evaluate LiEGe on datasets built from Wikipedia and real query logs of the Bing search engine. Our experimental results demonstrate that LiEGe outperforms all baselines, with improvements that are substantial and statistically significant.



Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2022
3569 pages
Association for Computing Machinery

New York, NY, United States

Author Tags

  1. explainable search
  2. novelty and diversity
  3. query aspects


  • Research-article

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

