Session-based recommendation with temporal convolutional network to balance numerical gaps
Introduction
With the continuous development of information technology and the rapid growth of available information, customers need more and more time to find the information that meets their needs. This problem is called information overload [1]. A promising way to solve it is the recommendation system [2], which recommends relevant information and products to users according to their needs.
Traditional recommendation methods depend on the interaction records between users and items. Based on these records, they model users’ long-term, static preferences. This type of approach generally focuses only on long-term preferences and assumes that all interactions are equally important regardless of when they occur. In real application scenarios, however, many users browse and buy goods without registering or logging in, so the user’s identity is unknown. In addition, traditional methods ignore the sequential dependence among user interactions and the time-sensitive context. For these reasons, traditional methods yield less accurate recommendations. To address these problems, session-based recommendation systems, which use the limited information in short sessions to infer user preferences, are attracting increasing attention from academia and industry [3].
Session-based recommendation systems predict the next item that a user will click from the current click sequence. Because they perform well in practical applications, they have been studied extensively. The studies in [4], [5], [6], [7] use the classical Markov chain approach, which estimates user preferences by calculating the similarity between items in the session. In a Markov chain, the next click depends only on the last one or several clicks, so the global order information in the session is completely ignored. Therefore, Markov-chain-based methods cannot exploit the interactions among all clicks in the whole session sequence.
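To make the limitation of Markov-chain methods concrete, the following is a minimal sketch of a first-order chain built from transition counts. It is a simplified illustration and not the method of [4], [5], [6], [7]; the function names and the toy sessions are assumptions introduced here.

```python
from collections import Counter, defaultdict


def fit_first_order_markov(sessions):
    """Count item-to-item transitions between consecutive clicks."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current_item, next_item in zip(session, session[1:]):
            transitions[current_item][next_item] += 1
    return transitions


def predict_next(transitions, session, top_k=2):
    """Rank candidates using only the last click; earlier clicks are ignored."""
    last_item = session[-1]
    return [item for item, _ in transitions[last_item].most_common(top_k)]


# Toy click sessions (hypothetical data).
sessions = [["a", "b", "c"], ["a", "b", "d"], ["e", "b", "c"]]
model = fit_first_order_markov(sessions)

# Both sessions end in "b", so they receive the same ranking
# even though they started with different items.
print(predict_next(model, ["a", "b"]))
print(predict_next(model, ["e", "b"]))
```

Because the prediction is a function of the last click alone, two sessions with different histories but the same final item are indistinguishable, which is the loss of global order information described above.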
With the development of neural network methods and the high performance of the Recurrent Neural Network (RNN) on sequential data, RNN and its variants have been applied to session-based recommendation systems [8], [9]. Hidasi et al. [8] first used the Gated Recurrent Unit (GRU) to predict the user’s next click. On this basis, Li et al. [9] proposed the Neural Attentive Recommendation Machine (NARM), which used GRU to learn the global and local interests of the session and achieved good results. However, because earlier inputs to an RNN have a smaller impact on its output, it cannot effectively learn the long-term dependence of a session sequence, and its performance degrades on long sessions. Similar to NARM, Liu et al. [10] proposed the Short-Term Attention/Memory Priority model (STAMP), which learns users’ long-term and current interests with a multi-layer perceptron and an attention network. After that, Wu et al. [11] added a neighbor session encoder to STAMP so that the model can use the information of neighborhood sessions, which improves its performance.
Graph Neural Networks (GNNs) transform a session into a graph structure and provide rich local context information by encoding edge or node attribute features. As a result, they can outperform RNNs for session-based recommendation. Wu et al. [12] proposed SR-GNN, which obtains stronger expressive ability by using a GNN to learn the session representation. Xu et al. [13] proposed the Graph Contextualized Self-Attention Network (GC-SAN), which combines a GNN with the self-attention used in Transformer to capture complex item transitions. Although GNNs achieve good results, they only learn the dependencies between adjacent items in the session graph and have difficulty capturing the dependencies between non-adjacent item nodes.
In recent years, the Temporal Convolutional Network (TCN) [14] has also been applied to recommendation systems and achieved excellent results [15]. Thanks to its causal and dilated convolutions, TCN can be computed in parallel and is therefore faster, while still accepting variable-length sequences as input like an RNN. Compared with a conventional CNN, TCN obtains a sufficiently large receptive field with few layers and fewer parameters. This receptive field also lets it capture some long-term dependence information among items. However, the long-term dependence information that TCN can obtain, which is very important in prediction tasks, is constrained by the size of the receptive field. Another recently emerged sequential model, Transformer [16], has achieved remarkable results in machine translation [17], [18], event extraction [19], and recommendation systems [20]. Transformer is composed of multi-layer self-attention networks and linear layers. The self-attention mechanism considers every item in the session sequence through operations such as weighted averaging and captures the pairwise interactions between items, so it can obtain long-term dependencies among items. However, the self-attention mechanism disperses the attention distribution, making it difficult for the model to capture the local dependence between short-distance items, which limits its ability to learn the context information of an item.
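The relationship between dilation and receptive field can be sketched in a few lines. The example below assumes one causal convolution per dilation level, which is a simplification (TCN implementations that stack two convolutions per residual block cover a wider span for the same dilations); the function names and the kernel size are illustrative choices, not values taken from this paper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked causal convolutions, one layer per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


def causal_dilated_conv(sequence, weights, dilation):
    """1-D causal convolution: output[t] uses sequence[t], sequence[t - d], ... only."""
    output = []
    for t in range(len(sequence)):
        total = 0.0
        for k, w in enumerate(weights):
            index = t - k * dilation
            if index >= 0:  # positions before the start act as zero padding
                total += w * sequence[index]
        output.append(total)
    return output


# Four layers with exponentially growing dilation versus four undilated layers.
print(receptive_field(kernel_size=3, dilations=[1, 2, 4, 8]))  # 1 + 2 * 15 = 31
print(receptive_field(kernel_size=3, dilations=[1, 1, 1, 1]))  # 1 + 2 * 4 = 9

# With dilation 2 and two taps, each output mixes the current item with the one two steps back.
print(causal_dilated_conv([1, 2, 3, 4], weights=[1, 1], dilation=2))
```

With the same number of layers and parameters, exponential dilation covers far more past clicks than ordinary convolution, but any item further back than the receptive field still cannot influence the output, which is the constraint noted above.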
Although the above methods achieve excellent results in predicting the next click in session-based recommendation, they still have some limitations. Previous models use the embedding layer to generate a high-dimensional vector for each unique item across all sessions, and then obtain the high-dimensional representation of a session by looking up the embedding of each item in the session sequence. However, these embedding methods lead to gaps in the numerical distribution across the dimensions of the embedding results: some dimensions have non-zero means or very large values. These embedding gaps increase the variance of the gradient and make the model less interpretable, which hinders parameter optimization and makes the final predictions inaccurate.
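One simple way to look for such gaps is to compute per-dimension statistics over the embedding table. The sketch below is only a diagnostic illustration with a made-up three-item table; `dimension_stats` is a hypothetical helper, and the analysis actually carried out in this paper may differ.

```python
def dimension_stats(embeddings):
    """Return (mean, max_abs) for each dimension of a list of embedding vectors."""
    count = len(embeddings)
    stats = []
    for column in zip(*embeddings):  # iterate over dimensions
        mean = sum(column) / count
        max_abs = max(abs(value) for value in column)
        stats.append((mean, max_abs))
    return stats


# Hypothetical embedding table: the third dimension has a non-zero mean and large values.
table = [
    [0.1, -0.2, 4.0],
    [-0.1, 0.2, 6.0],
    [0.0, 0.0, 5.0],
]

for index, (mean, max_abs) in enumerate(dimension_stats(table)):
    print(f"dim {index}: mean={mean:.2f}, max_abs={max_abs:.2f}")
```

In this toy table the first two dimensions are centered near zero while the third is not, which is the kind of imbalance described above.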
To address the above problems, we propose a Session-based Recommendation with Temporal Convolutional Network to Balance Numerical Gaps model, called TCNBNG for short. First, we use normalization to constrain the values of each dimension of the embedding results to a mutually balanced range. In this way, the dimensions with large values and those with small values are constrained to a unit hypersphere, which reduces the impact of the gaps between embedded values. Then, we use TCN and the self-attention mechanism to learn complementary session features. TCN uses a large receptive field to learn the session representation comprehensively and captures the short-range item dependence that self-attention misses because it disperses the attention distribution. Self-attention obtains the pairwise relationships between items by calculating a weight for each item and captures the dependence between long-distance items that TCN cannot obtain because of its limited receptive field. Unlike previous models [8], [9], [10], [12], [13], which combine the current preference with a global preference obtained through an attention mechanism, our model uses the last item embedding, obtained by aggregating all session items with the self-attention mechanism, to jointly represent the current and global preferences. Finally, we replace the inner product commonly used in previous methods with cosine similarity to calculate the similarity score. Because the value of each dimension of the session representation is constrained to a unit hypersphere, cosine similarity effectively measures the similarity between the final session representation and each item representation, which gives a more reasonable recommendation score. The contributions of our work are summarized as follows:
- To reduce the negative impact of biased item embeddings on session-based recommendation, we propose Session-based Recommendation with Temporal Convolutional Network to Balance Numerical Gaps (TCNBNG). TCNBNG applies L2 normalization to the embedding results of the session data. This simple but effective method addresses the unstable gradients caused by large differences in values across the dimensions of the embedding layer output, a problem that previous methods have not considered, and eliminates its negative impact on prediction accuracy.
- To make full use of the information in the session sequence and accurately learn the session representation, we use TCN and the self-attention mechanism to learn session features. TCN learns the session representation comprehensively and captures the short-distance item dependencies that the self-attention mechanism cannot fully learn. The self-attention mechanism learns the relationship between every pair of items in the session and obtains the dependency information between long-distance items that TCN cannot obtain because of its limited receptive field.
- Finally, we compare our TCNBNG model with other state-of-the-art methods on three benchmark datasets. The experimental results show that, compared with the state-of-the-art method, our proposed TCNBNG model achieves a 2.7% improvement on the Diginetica dataset, a 5.4% improvement on the RetailRocket dataset, and a 1.3% improvement on the Yoochoose1/64 dataset. These results verify that TCNBNG is more effective than the existing methods.
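As a rough illustration of the first two contributions, the sketch below L2-normalizes item embeddings, aggregates a session into its last position with plain dot-product self-attention, and ranks items by cosine similarity. It is a toy under stated assumptions and not the TCNBNG implementation: the TCN branch and all learned projections are omitted, and the item table, helper names, and the `eps` guard are introduced here for illustration only.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector onto the unit hypersphere (eps guards against a zero norm)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def last_item_attention(items):
    """Weighted average of all session items, using the last item as the query."""
    query = items[-1]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in items])
    return [sum(w * item[d] for w, item in zip(weights, items))
            for d in range(len(query))]


def cosine_scores(session_vector, item_table):
    """Cosine similarity between the session vector and every candidate item."""
    session_unit = l2_normalize(session_vector)
    return {name: dot(session_unit, l2_normalize(embedding))
            for name, embedding in item_table.items()}


# Hypothetical item embeddings with one dominant dimension.
item_table = {
    "a": [0.2, -0.1, 5.0],
    "b": [0.9, -0.8, 4.0],
    "c": [-0.9, 0.8, 6.0],
}
session = [l2_normalize(item_table[name]) for name in ["a", "b"]]
session_vector = last_item_attention(session)
scores = cosine_scores(session_vector, item_table)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Since every vector is normalized before scoring, each score lies in [-1, 1] and no single large-valued dimension can inflate an item's score through its magnitude alone, which is the motivation for replacing the inner product with cosine similarity.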
The remainder of this paper is organized as follows. In Section 2, we introduce the related work. In Section 3, we analyze the embedding results of the datasets and discuss how the embedding gaps negatively affect the results of the model. In Section 4, we introduce the structure and mathematical model of TCNBNG. In Section 5, we describe the experimental setup, discuss the results, and verify the effectiveness of the model. Finally, in Section 6, we summarize this paper and outline directions for future work.
Related Work
Since session-based recommendation only uses anonymous user click-sequence data, the available information is very limited. Thus, session-based recommendation remains a challenging task. In this section, we briefly review the traditional methods and neural network-based methods for solving this problem.
Conventional Methods: Because session-based recommendation lacks available user information, the traditional simple matrix decomposition [21], [22] and Item-KNN [23] cannot use the
Large gaps in values of each dimension in embedded results
In this section, we analyze the datasets used in this paper to show the large gaps observed in the values of each dimension of the embedded results, and discuss their negative impact on the model results.
To alleviate the vanishing and exploding gradient problems that often appear when training deep learning models, model parameters are usually initialized from zero-centered, independent and identically distributed values. However, if the output of
Proposed Method
In this section, we introduce our Session-based Recommendation with Temporal Convolutional Network to Balance Numerical Gaps (TCNBNG) model (as shown in Fig. 3). We first briefly describe the problem of session-based recommendation, and then describe the specific structure of the model in detail.
Datasets and Evaluation Metrics
Datasets: We conducted experiments on three real-world datasets that have also been used in previous works [8], [10], [11], [12], [13] to validate the effectiveness of our proposed TCNBNG model: (1) Yoochoose, (2) Diginetica, (3) RetailRocket. The Yoochoose dataset was downloaded from the RecSys Challenge 2015, containing the click streams of users on e-commerce
Conclusions
In this work, we propose a Session-based Recommendation with Temporal Convolutional Network to Balance Numerical Gaps (TCNBNG). We first use L2 normalization to normalize the embedding results so that the values of each dimension are constrained to a mutually balanced interval. Then, the Temporal Convolutional Network and the self-attention network are used to learn the session representation in a complementary way. Finally, the recommendation score is calculated by optimizing cosine similarity instead of inner
CRediT authorship contribution statement
Weinan Li: Conceptualization, Methodology, Software, Investigation, Formal analysis, Writing - original draft. Jin Gou: Supervision, Project administration, Validation. Zongwen Fan: Visualization, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Weinan Li is currently a master’s student in the College of Computer Science and Technology, Huaqiao University, Xiamen, China. He received the B.S. degree in Food Science and Technology from the Northeast Agricultural University, Harbin, China, in 2018. His research interests include recommendation systems and data mining.
References (31)
- et al., Leveraging neighborhood session information with dual attentive neural network for session-based recommendation, Neurocomputing (2021)
- Information overload, Wiley Encyclopedia of Management (2015)
- J.A. Jacobi, E.A. Benson, G.D. Linden, Recommendation system, US...
- et al., A survey on session-based recommender systems, ACM Comput. Surv. (2021)
- et al., Web path recommendations based on page ranking and markov models
- et al., Modeling sequential preferences with dynamic user and context factors, Machine Learning and Knowledge Discovery in Databases (2016)
- et al., Factorizing personalized markov chains for next-basket recommendation
- et al., Efficient hybrid web recommendations based on markov clickstream models and implicit search
- B. Hidasi, A. Karatzoglou, L. Baltrunas, D. Tikk, Session-based recommendations with recurrent neural networks, arXiv...
- et al., Neural attentive session-based recommendation
- Session-based recommendation with graph neural networks
- Graph contextualized self-attention network for session-based recommendation
Jin Gou was born in 1978. He received the Ph.D. degree in computer science and technology from Zhejiang University, China, in 2006. He is currently a Professor with Huaqiao University, Xiamen, China. His main research interests include knowledge fusion and artificial intelligence.
Zongwen Fan received both his B.S. and M.Sc. degrees from Huaqiao University, China, in 2014 and 2017, respectively. He is currently a Ph.D. candidate with the University of Newcastle, Australia. His main research interests include machine learning, data mining, and fuzzy computing.