
Unify Graph Learning with Text: Unleashing LLM Potentials for Session Search

Authors:

Rui Yan

WWW '24: Proceedings of the ACM Web Conference 2024

Pages 1509 - 1518

https://doi.org/10.1145/3589334.3645574

Published: 13 May 2024

Abstract

Session search involves a series of interactive queries and actions to fulfill a user's complex information need. Current strategies typically prioritize sequential modeling for deep semantic understanding and overlook the graph structure in interactions. Some approaches do capture structural information, but they use a generalized representation for documents and neglect word-level semantic modeling. In this paper, we propose Symbolic Graph Ranker (SGR), which aims to take advantage of both text-based and graph-based approaches by leveraging the power of recent Large Language Models (LLMs). Concretely, we first introduce a set of symbolic grammar rules to convert the session graph into text. This allows session history, interaction process, and task instruction to be integrated seamlessly as inputs for the LLM. Moreover, because LLMs are pre-trained on textual corpora while our graph-to-text grammar produces a symbolic language, there is a natural discrepancy between the two, and our objective is to enhance LLMs' ability to capture graph structures within a textual format. To achieve this, we introduce a set of self-supervised symbolic learning tasks, including link prediction, node content generation, and generative contrastive learning, which enable LLMs to capture topological information from coarse-grained to fine-grained. Experimental results and comprehensive analysis on two benchmark datasets, AOL and Tiangong-ST, confirm the superiority of our approach. Our paradigm also offers a novel and effective methodology that bridges the gap between traditional search strategies and modern LLMs.
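
To make the graph-to-text idea concrete, the following is a minimal sketch of how a session graph might be serialized into text and how a link prediction training pair might be derived from it. The abstract does not specify the actual grammar rules, so everything here is an assumption for illustration only: the `SessionGraph` class, the node types `query` and `doc`, the relation names, the `<type id>` and `(src) -[rel]-> (dst)` notation, and the `[?]` mask are all hypothetical and are not the grammar used by SGR.

```python
from dataclasses import dataclass, field


@dataclass
class SessionGraph:
    # Hypothetical structure: node id -> (node type, text content).
    nodes: dict[str, tuple[str, str]] = field(default_factory=dict)
    # Hypothetical directed, typed edges: (source id, relation, target id).
    edges: list[tuple[str, str, str]] = field(default_factory=list)


def graph_to_text(graph: SessionGraph) -> str:
    """Serialize nodes first, then edges, one symbolic statement per line."""
    lines = []
    for node_id, (node_type, content) in graph.nodes.items():
        lines.append(f"<{node_type} {node_id}> {content}")
    for src, rel, dst in graph.edges:
        lines.append(f"({src}) -[{rel}]-> ({dst})")
    return "\n".join(lines)


def link_prediction_example(graph: SessionGraph, edge_index: int) -> tuple[str, str]:
    """Hide one edge and build a (prompt, target) pair asking for its relation."""
    src, rel, dst = graph.edges[edge_index]
    remaining = [e for i, e in enumerate(graph.edges) if i != edge_index]
    masked = SessionGraph(nodes=dict(graph.nodes), edges=remaining)
    # The masked edge is appended with "?" in place of the relation name.
    prompt = graph_to_text(masked) + f"\n({src}) -[?]-> ({dst})"
    return prompt, rel


# Illustrative session: two queries and one clicked document (invented data).
example = SessionGraph(
    nodes={
        "q1": ("query", "cheap flights"),
        "q2": ("query", "cheap flights to tokyo"),
        "d1": ("doc", "Compare airfares to Tokyo"),
    },
    edges=[("q1", "reformulated_to", "q2"), ("q2", "clicked", "d1")],
)
print(graph_to_text(example))
```

In this sketch, the serialized graph could be concatenated with a task instruction and candidate document to form the LLM input, and `link_prediction_example(example, 1)` would yield a prompt with the `clicked` edge hidden and `"clicked"` as the target. The other self-supervised tasks named in the abstract (node content generation and generative contrastive learning) would need their own pair constructions, which are not attempted here.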




Index Terms

Unify Graph Learning with Text: Unleashing LLM Potentials for Session Search
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking


Information & Contributors

Information

Published In


WWW '24: Proceedings of the ACM Web Conference 2024

May 2024

4826 pages

ISBN: 9798400701719

DOI: 10.1145/3589334

General Chairs: Tat-Seng Chua (National University of Singapore), Chong-Wah Ngo (Singapore Management University)

Proceedings Chair: Roy Ka-Wei Lee (Singapore University of Technology and Design)

Program Chairs: Ravi Kumar (Google), Hady W. Lauw (Singapore Management University)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Conference

WWW '24

Sponsor:

SIGWEB

WWW '24: The ACM Web Conference 2024

May 13 - 17, 2024

Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
