ABSTRACT
Search result diversification aims to reduce redundancy and improve subtopic coverage in the results for a given query. Most existing approaches measure document diversity mainly from text or pre-trained representations. However, some underlying relationships between the query and the documents are difficult for a model to capture from content alone. Because a knowledge base offers well-defined entities and explicit relationships between them, we exploit this knowledge to model the relationship between the documents and the query, and propose a knowledge-enhanced search result diversification approach, KEDIV. Concretely, we build a query-specific relation graph that models the complicated query-document relationship from an entity view. A graph neural network and a node weight adjustment algorithm are then applied to the relation graph to obtain context-aware entity and document representations at each selection step. The diversity features are derived from the updated node representations of the relation graph. In this way, we take advantage of the abundant information carried by entities to model document diversity in search result diversification. Experimental results on commonly used datasets show that the proposed approach outperforms state-of-the-art methods.
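The selection loop described in the abstract can be illustrated with a small toy sketch. This is not the KEDIV implementation: the abstract gives no formulas, so every name and parameter here (`propagate`, `diversify`, `alpha`, `decay`, the example entities) is a hypothetical stand-in. One round of neighbour averaging replaces the graph neural network, a scalar weight per entity replaces the learned node representations, and a simple multiplicative decay plays the role of the node weight adjustment after each selected document.

```python
def propagate(weights, relations, alpha=0.5):
    """One round of neighbour averaging over entity-entity edges.

    Toy stand-in for a graph neural network layer (assumption, not KEDIV).
    """
    neighbours = {e: set() for e in weights}
    for a, b in relations:
        neighbours[a].add(b)
        neighbours[b].add(a)
    updated = {}
    for e, w in weights.items():
        if neighbours[e]:
            mean = sum(weights[n] for n in neighbours[e]) / len(neighbours[e])
            updated[e] = (1 - alpha) * w + alpha * mean
        else:
            updated[e] = w
    return updated


def diversify(doc_entities, relations, k, decay=0.1):
    """Greedily pick k documents, down-weighting entities already covered."""
    # Every entity node starts with the same (hypothetical) weight.
    weights = {e: 1.0 for ents in doc_entities.values() for e in ents}
    for a, b in relations:
        weights.setdefault(a, 1.0)
        weights.setdefault(b, 1.0)

    remaining = list(doc_entities)
    ranking = []
    while remaining and len(ranking) < k:
        # Recompute context-aware entity weights at each selection step.
        context = propagate(weights, relations)
        best = max(remaining, key=lambda d: sum(context[e] for e in doc_entities[d]))
        ranking.append(best)
        remaining.remove(best)
        # Node weight adjustment: entities of the selected document count less.
        for e in doc_entities[best]:
            weights[e] *= decay
    return ranking


# Hypothetical data for the ambiguous query "jaguar".
docs = {
    "d1": {"jaguar_cat", "big_cat"},
    "d2": {"jaguar_cat", "panthera"},
    "d3": {"jaguar_car", "luxury_car"},
    "d4": {"jaguar_cat"},
}
relations = [
    ("jaguar_cat", "panthera"),
    ("jaguar_cat", "big_cat"),
    ("jaguar_car", "luxury_car"),
]

ranking = diversify(docs, relations, k=3)
print(ranking)
```

In this toy run the car-related document `d3` is promoted to second place once `d1` has covered the animal sense, which is the kind of entity-level redundancy signal the abstract describes; the real model learns such signals rather than using fixed weights.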
Index Terms
- Knowledge Enhanced Search Result Diversification