Diversify Search Results Through Graph Attentive Document Interaction

Xu, Xianghong; Ouyang, Kai; Zheng, Yin; Lu, Yanxiong; Zheng, Hai-Tao; Kim, Hong-Gee

doi:10.1007/978-3-031-00123-9_51

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13245)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

The goal of search result diversification is to retrieve diverse documents that meet as many different information needs as possible. Graph neural networks provide a feasible way to capture the sophisticated relationships between candidate documents, but existing graph-based diversification methods require an extra model to construct the graph, which introduces the problem of error accumulation. In this paper, we propose a novel model to address this problem. Specifically, we maintain a document interaction graph for the candidate documents of each query to model the diverse information interactions between them. To extract latent diversity features, we adopt graph attention networks (GATs) to update the representation of each document by aggregating its neighbors with learnable weights, so that our model does not depend on knowing the graph structure in advance. Finally, we compute the ranking score of each candidate document from the extracted latent diversity features together with the traditional relevance features, and obtain the ranking by sorting the scores. Experimental results on the TREC Web Track benchmark datasets show that the proposed model outperforms existing state-of-the-art models.
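
The pipeline described in the abstract (attention-weighted neighbor aggregation, then a score computed from diversity and relevance features, then sorting) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a fully connected document graph with self-loops, a single attention head, the LeakyReLU and ELU nonlinearities of the standard GAT formulation, a linear scoring function, and randomly initialized weights standing in for learned parameters. All names (`gat_layer`, `rank_documents`, `w_div`, `w_rel`) are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def elu(x):
    return x if x > 0 else math.exp(x) - 1.0


def gat_layer(docs, W, a):
    """One single-head graph attention layer.

    Assumption: every document attends to every candidate (itself included),
    so no graph structure has to be supplied in advance.
    """
    h = [[dot(row, d) for row in W] for d in docs]  # linear projection W d
    n = len(h)
    updated = []
    for i in range(n):
        # Attention logit e_ij = LeakyReLU(a^T [h_i || h_j])
        logits = [leaky_relu(dot(a, h[i] + h[j])) for j in range(n)]
        alpha = softmax(logits)  # attention weights over neighbors
        agg = [sum(alpha[j] * h[j][k] for j in range(n)) for k in range(len(h[i]))]
        updated.append([elu(v) for v in agg])
    return updated


def rank_documents(docs, relevance, W, a, w_div, w_rel):
    """Score each document from diversity and relevance features, then sort."""
    diversity = gat_layer(docs, W, a)
    # Assumption: a linear scorer; the abstract does not specify its form.
    scores = [dot(w_div, dv) + dot(w_rel, rv) for dv, rv in zip(diversity, relevance)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


def rand_vec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]


# Toy data: random vectors stand in for document embeddings and learned weights.
random.seed(0)
n_docs, in_dim, hid_dim, rel_dim = 4, 5, 3, 2
docs = [rand_vec(in_dim) for _ in range(n_docs)]
relevance = [rand_vec(rel_dim) for _ in range(n_docs)]
W = [rand_vec(in_dim) for _ in range(hid_dim)]
a = rand_vec(2 * hid_dim)
w_div, w_rel = rand_vec(hid_dim), rand_vec(rel_dim)

order, scores = rank_documents(docs, relevance, W, a, w_div, w_rel)
print(order)  # document indices, highest score first
```

In this sketch the attention weights take the place of a precomputed adjacency matrix, which is the property the abstract highlights. A trained model would learn `W`, `a`, `w_div`, and `w_rel` from diversity-labeled data.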

X. Xu and K. Ouyang—Equal contribution.

Notes

1. https://boston.lti.cs.cmu.edu/Data/clueweb09/
2. Lemur service: http://boston.lti.cs.cmu.edu/Services/clueweb09_batch

Acknowledgement

This research is supported by the National Natural Science Foundation of China (Grant No. 6201101015), the Beijing Academy of Artificial Intelligence (BAAI), the Natural Science Foundation of Guangdong Province (Grant No. 2021A1515012640), the Basic Research Fund of Shenzhen City (Grant No. JCYJ20210324120012033 and JCYJ20190813165003837), the Overseas Cooperation Research Fund of Tsinghua Shenzhen International Graduate School (Grant No. HW2021008), and the research fund of the Tsinghua University - Tencent Joint Laboratory for Internet Innovation Technology.

Author information

Authors and Affiliations

Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
Xianghong Xu, Kai Ouyang & Hai-Tao Zheng
Department of Search and Application, Weixin Group, Tencent, Beijing, China
Yin Zheng & Yanxiong Lu
Pengcheng Laboratory, Shenzhen, 518055, China
Hai-Tao Zheng
Seoul National University, Seoul, South Korea
Hong-Gee Kim

Corresponding author

Correspondence to Hai-Tao Zheng.

Editor information

Editors and Affiliations

Indian Institute of Technology Kanpur, Kanpur, India
Arnab Bhattacharya
National University of Singapore, Singapore, Singapore
Janice Lee Mong Li
University of California, Santa Barbara, Santa Barbara, CA, USA
Divyakant Agrawal
IIIT Hyderabad, Hyderabad, India
P. Krishna Reddy
Indraprastha Institute of Information Technology Delhi, New Delhi, India
Mukesh Mohania
Ashoka University, Sonepat, Haryana, India
Anirban Mondal
Indraprastha Institute of Information Technology Delhi, New Delhi, India
Vikram Goyal
University of Aizu, Aizu, Japan
Rage Uday Kiran

About this paper

Cite this paper

Xu, X., Ouyang, K., Zheng, Y., Lu, Y., Zheng, HT., Kim, HG. (2022). Diversify Search Results Through Graph Attentive Document Interaction. In: Bhattacharya, A., et al. Database Systems for Advanced Applications. DASFAA 2022. Lecture Notes in Computer Science, vol 13245. Springer, Cham. https://doi.org/10.1007/978-3-031-00123-9_51

DOI: https://doi.org/10.1007/978-3-031-00123-9_51
Published: 08 April 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-00122-2
Online ISBN: 978-3-031-00123-9
eBook Packages: Computer Science, Computer Science (R0)
