Hierarchical neural query suggestion with an attention mechanism☆
Introduction
Modern search engines offer query suggestions to help users express their information needs effectively. Previous work on query suggestion, such as probabilistic models and learning-to-rank techniques, mainly relies on features indicating dependencies between queries and users, such as clicks and dwell time (Chen, Cai, Chen, & de Rijke, 2017). However, the structure of those dependencies is usually modeled manually; as a result, hidden relationships between queries and a user’s behavior may be ignored. Recurrent neural network (RNN)-based approaches have been proposed to tackle these challenges: a query log can be treated as sequential data and modeled to predict the next input query. However, existing neural-based methods only consider the so-called current session (in which a query suggestion is being generated) as the search context for query suggestion (Onal et al., 2018).
Our research goal is to develop a neural query suggestion method that captures the user’s search intent by modeling both their short-term interests, as manifested during an ongoing search session, and their long-term interests, as manifested during earlier sessions. To this end we propose AHNQS, a model that applies a user attention mechanism inside a hierarchical neural structure for query suggestion. The hierarchical structure contains two parts: a session-level RNN and a user-level RNN. The first encodes the queries in the current session and is used to model the user’s short-term search context to predict their next query. The second encodes the user’s past search sessions and is applied to model their long-term search behavior, outputting a user state vector that represents their preferences. We use the hidden state of the session-level RNN as the input to the user-level RNN; the user state of the latter is then used to initialize the first hidden state of the next session-level RNN.
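The two-level update described above can be sketched in plain Python. This is a minimal illustration only: a simple tanh RNN cell stands in for the recurrent units the model would actually use, and the dimensions, weight initialization, and toy session data are all assumptions for illustration.

```python
import math
import random

random.seed(0)
DIM = 4  # toy hidden-state size

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def rnn_step(w_in, w_h, x, h):
    # Simple tanh RNN cell standing in for the model's recurrent unit.
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_h, h))]

def run_session(queries, h0, w_in, w_h):
    # Session-level RNN: consume the query embeddings of one session,
    # starting from the user state h0, and return all hidden states.
    states, h = [], h0
    for q in queries:
        h = rnn_step(w_in, w_h, q, h)
        states.append(h)
    return states

# Separate weights for the session-level and user-level RNNs.
w_in_s, w_h_s = rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)
w_in_u, w_h_u = rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)

user_state = [0.0] * DIM
# Two toy sessions of three random query embeddings each.
sessions = [[[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(3)]
            for _ in range(2)]
for session in sessions:
    # The user state initializes the session-level RNN ...
    states = run_session(session, user_state, w_in_s, w_h_s)
    # ... and the resulting session state updates the user-level RNN.
    user_state = rnn_step(w_in_u, w_h_u, states[-1], user_state)
```

After both sessions, `user_state` summarizes the user's long-term behavior and would seed the next session-level RNN.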
In addition, we apply an attention mechanism inside the hierarchical structure of AHNQS that is meant to capture a user’s preference towards different queries in a session. This addition is based on the assumption that different queries in the same session may express different aspects of the user’s search intent (Bahdanau, Cho, & Bengio, 2015); e.g., queries with subsequent click behavior are more likely to represent the user’s information need than those without. An attention mechanism can automatically assign different weights to the hidden states of queries in the session-level RNN. Together, the attentively weighted hidden states compose the session state, which we regard as a local session state. The local session state has the advantage of adaptively focusing on the more important queries to capture the user’s main purpose in the current session. In addition, we consider the final hidden state of the session-level RNN as a global session state, which acts as a summary of the full behavior sequence. We then use a combination of the global and local session states as the input to the user-level RNN.
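The local, global, and combined session states can be sketched as follows. The dot-product scoring against a fixed vector and the toy hidden states are assumptions for illustration, not the model's exact attention parameterization.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combined_session_state(hidden_states, score_vec):
    # hidden_states: one session-level RNN hidden vector per query.
    # Attention score per query: dot product with a (hypothetical) learned vector.
    scores = [sum(h_i * s_i for h_i, s_i in zip(h, score_vec)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Local state: attention-weighted sum of all hidden states.
    local = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    # Global state: the final hidden state summarizes the full sequence.
    global_state = hidden_states[-1]
    # Combined state: concatenation of global and local parts.
    return global_state + local

states = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.4]]
combined = combined_session_state(states, score_vec=[1.0, 0.0])
```

Here the second query gets the largest attention weight (highest score), so the local part of `combined` leans towards its hidden state, while the global part is simply the last state `[0.4, 0.4]`.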
We compare the performance of AHNQS against a state-of-the-art query suggestion baseline and variants of RNN-based query suggestion methods using the AOL query log. In terms of query suggestion ranking accuracy, we establish improvements of AHNQS over the best baseline model of up to 9.66% in Recall@10 and 12.51% in MRR@10. In addition, we investigate the impact on query suggestion performance of different session states, i.e., global vs. local vs. combined. The results show the effectiveness of the AHNQS model with the combined session state. Furthermore, we test the scalability of the AHNQS model across users with different numbers of sessions in their interaction history. The experimental results show that AHNQS outperforms the best baseline model for users with varying degrees of activity.
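The two ranking metrics used above, Recall@10 and MRR@10, can be computed per test case as in the following sketch (the function names and toy suggestion list are illustrative):

```python
def recall_at_k(ranked, target, k=10):
    # 1.0 if the ground-truth next query appears among the top-k suggestions.
    return 1.0 if target in ranked[:k] else 0.0

def mrr_at_k(ranked, target, k=10):
    # Reciprocal rank of the ground-truth next query, 0.0 if outside the top k.
    for rank, query in enumerate(ranked[:k], start=1):
        if query == target:
            return 1.0 / rank
    return 0.0

# Toy example: the true next query sits at rank 3 of the suggestion list.
suggestions = ["hotels nyc", "cheap flights", "flights to nyc", "nyc weather"]
print(recall_at_k(suggestions, "flights to nyc"))  # 1.0
print(mrr_at_k(suggestions, "flights to nyc"))     # 0.3333...
```

Averaging these per-case values over all test sessions yields the reported Recall@10 and MRR@10 scores.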
Our contributions in this paper are:
- 1. We tackle the challenge of query suggestion in a novel way by proposing an Attention-based Hierarchical Neural Query Suggestion model, i.e., AHNQS, which adopts a hierarchical structure containing a user attention mechanism to better capture the user’s search intent.
- 2. We analyse the impact of session length on query suggestion performance and find that AHNQS consistently yields the best performance, especially with short search contexts.
- 3. We examine the performance of AHNQS for users with different numbers of sessions. We find that AHNQS consistently outperforms the best baseline model, especially for users with few search sessions.
We describe related work in Section 2. Details of the attention-based hierarchical query suggestion model are described in Section 3. Section 4 presents our experimental setup. In Section 5, we report and discuss our results. Finally, we conclude in Section 6, where we also suggest future research directions.
Related work
Query suggestion can support users of search engines during their search tasks. A significant amount of work has gone into methods for helping users formulate better, more understandable queries (Cai et al., 2016; Cai & de Rijke, 2016a; 2016c; Smith, Gwizdka, & Feild, 2017; Vidinli & Ozcan, 2016a; 2016b). In recent years, deep learning techniques have been applied to a range of information retrieval tasks, often leading to a better understanding of user’s search
Approach
Before introducing the AHNQS model, we introduce a neural query suggestion (NQS) model with session-level RNNs, and a hierarchical neural query suggestion (HNQS) model with hierarchical user-session RNNs.
Experiments
We conduct our experiments on the AOL dataset to examine the effectiveness of AHNQS. We first list the research questions and the models used for comparison. After that, the datasets and experimental setup are described.
Performance of query suggestion models
To answer RQ1, we examine the query suggestion performance of the baselines as well as the AHNQS-local and AHNQS-combined models. Table 4 presents the results.
As shown in Table 4, amongst the baselines, ADJ outperforms NQS, with 9.74% and 12.58% improvements in terms of Recall@10 and MRR@10, respectively. This may be due to the fact that the NQS model (which has no knowledge of individual users) fails to capture information from past search history. HNQS shows improvements over ADJ of up to
Conclusions and future work
We have proposed an attention-based hierarchical neural query suggestion model (AHNQS) that combines a hierarchical user-session RNN with an attention mechanism. The hierarchical structure, which incorporates a session-level and a user-level RNN, can model both the user’s short-term and long-term search behavior effectively, while the attention mechanism captures a user’s preference towards certain queries over others. For the session-level RNN, a combined session state is applied to capture
Acknowledgements
We would like to thank our anonymous reviewers for their helpful comments and valuable suggestions.
This research was supported by the National Natural Science Foundation of China under No. 61702526, the Defense Industrial Technology Development Program under No. JCKY2017204B064, the National Advanced Research Project under No. 6141B0801010b, Ahold Delhaize, the VSNU Vereniging van Universiteiten, and the China Scholarship Council under No. 201803170244. All content represents the opinion of the
References (43)

- Learning from homologous queries and semantically related terms for query auto completion. Information Processing and Management (2016)
- A query term re-weighting approach using document similarity. Information Processing and Management (2016)
- Mapping queries to the linking open data cloud: A case study using DBpedia. Journal of Web Semantics (2011)
- The use of query auto-completion over the course of search sessions with multifaceted information needs. Information Processing and Management (2017)
- New query suggestion framework and algorithms: A case study for an educational search engine. Information Processing and Management (2016)
- Neural machine translation by jointly learning to align and translate. ICLR ’15 (2015)
- A neural click model for web search. WWW ’16 (2016)
- A click sequence model for web search. SIGIR ’18: 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (2018)
- Selectively personalizing query auto-completion
- A survey of query auto completion in information retrieval. Foundations and Trends in Information Retrieval
- Diversifying query auto-completion. ACM Transactions on Information Systems
- Context-aware query suggestion by mining click-through and session data. KDD ’08
- Personalized query suggestion diversification. SIGIR ’17
- Attention-based hierarchical neural query suggestion. SIGIR ’18
- Learning phrase representations using RNN encoder-decoder for statistical machine translation. EMNLP ’14
- Multi-view random walk framework for search task discovery from click-through log. CIKM ’11
- Personalized neural language models for real-world query auto completion. NAACL-HLT 2018, Volume 3 (Industry Papers)
- Intent-aware query similarity. CIKM ’11
- Web query recommendation via sequential query prediction. ICDE 2009
- NAIS: Neural attentive item similarity model for recommendation. IEEE Transactions on Knowledge and Data Engineering
☆ A preliminary version of this paper appeared in the proceedings of SIGIR 2018 (Chen et al., 2018). In this extension, we (1) extend the neural query suggestion approach to model users’ preference better by combining local (attention-based) and global session states; (2) investigate the performance of AHNQS with different session states, i.e., global vs. local vs. combined; (3) investigate the performance of our model with different numbers of users’ search sessions, as we find that the majority of users in the AOL dataset only have a small number of sessions; and (4) include more related work and provide a more detailed analysis of the approach and experimental results.
1 These authors contributed equally and are both corresponding authors.