DOI: 10.1145/3637528.3671517

Enhancing Asymmetric Web Search through Question-Answer Generation and Ranking

Published: 24 August 2024

Abstract

This paper addresses the semantic gap between user queries and web content, commonly referred to as asymmetric text matching, in the domain of web search. Current algorithms leverage BERT-based reading comprehension to make significant advances in query understanding, but they still fall short on the asymmetric ranking problem because of the models' limited comprehension and summarization capacity.
To tackle this issue, we propose QAGR (Question-Answer Generation and Ranking), which comprises an offline module, QAGeneration, and an online module, QARanking. The QAGeneration module uses large language models (LLMs) to generate high-quality question-answer pairs for each web page in two steps: generating candidate pairs, then verifying them to eliminate irrelevant questions, yielding high-quality questions associated with their respective documents. The QARanking module combines and ranks the generated questions together with the web page content. To ensure efficient online inference, we design the QARanking model as a homogeneous dual-tower model that incorporates query intent to drive score fusion, balancing keyword matching against asymmetric matching. Additionally, we conduct a preliminary screening of each document's questions, selecting only the top-N most relevant for further relevance calculation.
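To make the two-module design concrete, here is a minimal sketch of the pipeline as described above; it is not the paper's implementation. The encoder, the generation and verification callables, the intent-weighting function, and all names (qa_generation, qa_ranking, top_n, and so on) are illustrative assumptions.

```python
# Minimal QAGR-style sketch (illustrative only; all names are assumptions).
from dataclasses import dataclass
from typing import Callable, List
import math

@dataclass
class QAPair:
    question: str
    answer: str

def qa_generation(doc: str,
                  generate: Callable[[str], List[QAPair]],
                  verify: Callable[[str, QAPair], bool]) -> List[str]:
    """Offline QAGeneration: an LLM proposes QA pairs for the page, then a
    verification step filters out questions judged irrelevant to the page."""
    return [qa.question for qa in generate(doc) if verify(doc, qa)]

def _cosine(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1.0
    nv = math.sqrt(sum(b * b for b in v)) or 1.0
    return dot / (nu * nv)

def qa_ranking(query: str,
               doc: str,
               questions: List[str],
               encode: Callable[[str], List[float]],
               intent_weight: Callable[[str], float],
               top_n: int = 3) -> float:
    """Online QARanking: a homogeneous dual-tower score, with both towers
    sharing one encoder. The document's generated questions are pre-screened
    to the top-N most query-similar, and the keyword-style document score is
    fused with the asymmetric question score via a query-intent weight."""
    q_vec = encode(query)
    doc_score = _cosine(q_vec, encode(doc))           # keyword/content matching
    screened = sorted((_cosine(q_vec, encode(q)) for q in questions),
                      reverse=True)[:top_n]            # preliminary screening
    qa_score = sum(screened) / len(screened) if screened else 0.0
    alpha = intent_weight(query)                       # e.g. higher for question-like queries
    return alpha * qa_score + (1.0 - alpha) * doc_score
```

In the deployed system both towers would be learned models and the fusion would be trained rather than a fixed convex combination; averaging the screened similarities here merely stands in for whatever further relevance calculation the paper applies to the top-N questions.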
Empirical results demonstrate that our proposed method substantially improves web search performance: we achieve an over 8.7% relative improvement in offline relevance and an over 8.5% gain in online engagement compared to the state-of-the-art web search system. Furthermore, we deploy QAGR in online web search engines and share our deployment experience, including production considerations and ablation experiments. This research contributes to advancing the field of asymmetric web search and provides valuable insights for enhancing search engine performance.

Supplemental Material

MP4 File - Enhancing Asymmetric Web Search through Question-Answer Generation and Ranking
This paper addresses the semantic gap between user queries and web content, commonly referred to as asymmetric text matching, in the domain of web search. We propose QAGR (Question-Answer Generation and Ranking), comprising an offline module, QAGeneration, and an online module, QARanking. The QAGeneration module uses large language models (LLMs) to generate high-quality question-answer pairs for each web page. The QARanking module combines and ranks the generated questions and web page content. Empirical results demonstrate the substantial performance improvement of our proposed method in web search. Furthermore, we deploy QAGR in online web search engines and share our deployment experience, including production considerations and ablation experiments.


Published In

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2024, 6901 pages
ISBN: 9798400704901
DOI: 10.1145/3637528

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. large language models
  2. question answer
  3. search ranking
  4. web search

Qualifiers

  • Research-article

Conference

KDD '24

Acceptance Rates

Overall Acceptance Rate: 1,133 of 8,635 submissions (13%)
