Elsevier

Neurocomputing

Volume 493, 7 July 2022, Pages 166-175

Session-based recommendation with temporal convolutional network to balance numerical gaps

https://doi.org/10.1016/j.neucom.2022.04.069

Abstract

A session-based recommendation system recommends the next item a user is likely to click by learning from the click sequences of anonymous sessions. Because the session data are anonymous and little background information is available, this task is very challenging. Although session-based recommendation systems built on neural networks have recently achieved encouraging results, two problems remain in existing methods: (1) The values of each dimension of the embedding-layer output follow a non-zero-mean distribution, and the numerical gaps between dimensions are very large. Such gaps increase the variance of the gradient and hinder parameter optimization, which makes the final predictions inaccurate; (2) Previous models cannot effectively learn long-term dependency information or capture the dependencies between non-adjacent items in the session sequence. To solve these problems, we propose a model called Session-based Recommendation with Temporal Convolutional Network to Balance Numerical Gaps. Specifically, we first normalize the embedding-layer output, which constrains the embeddings to the unit hypersphere and reduces their impact on gradient calculation, and then use a Temporal Convolutional Network (TCN) to complement a multi-layer self-attention network in learning the session sequence. The TCN obtains a receptive field large enough to learn the session representation fully and recovers the short-range item dependencies that the self-attention mechanism misses because its attention distribution is dispersed, while self-attention captures the one-to-one interactions between items and obtains their long-term dependencies. We conducted extensive experiments on three real-world datasets. The results show that, in most cases, the proposed method outperforms state-of-the-art methods.

Introduction

With the continuous development of information technology and the rapid growth of information, customers need more and more time to find the information that meets their needs. This problem is called information overload [1]. A very promising way to solve it is the recommendation system [2], which recommends interesting information and products to users according to their information needs.

Traditional recommendation methods depend on the interaction records between users and items. Based on these records, users’ preferences are modeled to learn their long-term static preferences. This type of approach generally focuses only on long-term preferences and assumes that all interactions are equally important regardless of when they occur. In real application scenarios, however, many users browse and buy goods without registering or logging in, so the user’s identity is unknown. In addition, traditional methods ignore the dependencies within the user’s interaction sequence and the time-sensitive context. For these reasons, the recommendations of traditional methods are less accurate. To address these problems, session-based recommendation systems, which use limited information to infer user preferences within short sessions, are attracting more and more attention from academia and industry [3].

Session-based recommendation systems predict the next item a user will click from the current click sequence. Given their high performance in practical applications, much work has been devoted to session-based recommendation. The studies in [4], [5], [6], [7] use the classical Markov chain approach, which estimates user preferences by calculating the similarity between items in the session. In a Markov chain, the next click depends only on the last one or several clicks, so the global order information in the session is completely ignored. Therefore, Markov-chain-based methods cannot exploit the interactions among all clicks in the whole session sequence.
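
The limitation of a first-order Markov chain can be illustrated with a minimal sketch. It is not the method of any of the cited studies: it assumes a plain transition-count model, and the function names and toy sessions are hypothetical. Two sessions that end in the same item receive the same recommendation, however different their earlier clicks are.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count first-order item-to-item transitions over all training sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for prev_item, next_item in zip(session, session[1:]):
            counts[prev_item][next_item] += 1
    return counts


def recommend_next(counts, session, k=2):
    """Rank candidates from the last click only; earlier clicks are ignored."""
    last_item = session[-1]
    return [item for item, _ in counts[last_item].most_common(k)]


# Hypothetical toy click sessions.
sessions = [["a", "b", "c"], ["a", "b", "d"], ["e", "b", "c"]]
counts = fit_transitions(sessions)

print(recommend_next(counts, ["a", "b"]))  # ['c', 'd']
print(recommend_next(counts, ["e", "b"]))  # ['c', 'd'], same as above
```

Both calls return the same ranking because only the last click, "b", is consulted, which is the loss of global order information described above.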

With the development of neural network methods and the high performance of the Recurrent Neural Network (RNN) on sequential data, RNN and its variants have been applied to session-based recommendation systems [8], [9]. Hidasi et al. [8] first used the Gated Recurrent Unit (GRU) to predict the user’s next click. On this basis, Li et al. [9] proposed the Neural Attentive Recommendation Machine (NARM), which uses GRU to learn the global and local interests of the session and achieved good results. However, because earlier inputs to an RNN have less influence on its output, it cannot effectively learn the long-term dependencies of the session sequence, and its performance degrades on long sessions. Similar to NARM, Liu et al. [10] proposed the Short-Term Attention/Memory Priority (STAMP) model, which learns users’ long-term and current interests with a multi-layer perceptron and an attention network. Later, Wu et al. [11] added a neighbor session encoder to STAMP so that the model can use the information of neighborhood sessions and improve its performance.

Graph Neural Networks (GNNs) transform a session into a graph structure and provide rich local context information by encoding edge or node attribute features. As a result, they can outperform RNNs in session-based recommendation. Wu et al. [12] proposed SR-GNN, which obtains stronger expressive ability by using a GNN to learn the session representation. Xu et al. [13] proposed the Graph Contextualized Self-Attention Network (GC-SAN), which combines a GNN with the self-attention used in Transformer to capture complex item transitions. Although GNNs achieve good results, they only learn the dependencies between adjacent items in the session graph and have difficulty capturing the dependencies between non-adjacent item nodes.

In recent years, the Temporal Convolutional Network (TCN) [14] has also been applied to recommendation systems and achieved excellent results [15]. Owing to its causal and dilated convolution structure, TCN can be computed quickly in parallel while, like an RNN, accepting variable-length sequences as input. Compared with a CNN, TCN obtains a sufficiently large receptive field with few layers and reduces the number of parameters. This receptive field also allows it to obtain some long-term dependency information about the items. However, the long-term dependency information that TCN can obtain, which is very important in prediction tasks, is limited by the size of its receptive field. Another recently emerged sequential model, Transformer [16], has achieved remarkable results in machine translation [17], [18], event extraction [19], and recommendation systems [20]. Transformer is composed of multi-layer self-attention networks and linear layers. The self-attention mechanism fully considers each item in the session sequence through operations such as weighted averaging, capturing the one-to-one interactions between items and thereby obtaining their long-term dependencies. However, the self-attention mechanism disperses the attention distribution, which makes it difficult for the model to capture the local dependencies between nearby items and limits its ability to learn the context information of an item.
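
To make the receptive-field argument concrete, the following sketch stacks causal dilated convolutions in plain Python. It is an illustration under simplifying assumptions, not the TCN of [14]: one convolution per layer, a single channel, all-ones kernels, and hypothetical helper names. Under these assumptions, a stack with kernel size k and dilations d_1, ..., d_L has a receptive field of 1 + (k - 1)(d_1 + ... + d_L).

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked causal convolutions, one layer per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution: out[t] uses only x[t], x[t - d], x[t - 2d], ..."""
    out = []
    for t in range(len(x)):
        total = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:  # positions before the sequence start act as zero padding
                total += w * x[idx]
        out.append(total)
    return out


# Toy input sequence; with all-ones kernels each layer simply sums its inputs.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
h = x
for d in (1, 2, 4):
    h = causal_dilated_conv(h, [1.0, 1.0], d)

print(receptive_field(2, [1, 2, 4]))  # 8
print(h[-1])                          # 36.0, the sum of all eight inputs
```

Three layers with kernel size 2 already cover all eight positions, whereas an undilated stack with the same kernel size would need seven layers. The last output cannot see anything older than eight steps, which is the receptive-field limit mentioned above.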

Although the above methods achieve excellent results in predicting the next click in session-based recommendation, they still have some limitations. Previous models use the embedding layer to generate a high-dimensional vector for each unique item across all sessions, and then obtain the high-dimensional representation of a session by looking up the embedding of each item in the session sequence. However, these embedding methods lead to gaps in the numerical distribution of each dimension of the embedding results: some dimensions have a non-zero mean or very large values. These embedding gaps increase the variance of the gradient and reduce the interpretability of the model, which hinders parameter optimization and makes the final predictions inaccurate.

To address the above problems, we propose a Session-based Recommendation based on Temporal Convolutional Network to Balance Numerical Gaps model, called TCNBNG for short. First, we use normalization to limit the values of each dimension of the embedding results to a mutually balanced range. In this way, dimensions with large values and those with small values are constrained to a unit hypersphere, which reduces the impact of the embedding value gaps. Then, we use TCN and the self-attention mechanism to learn session features in a complementary way. TCN uses a large receptive field to learn the session representation comprehensively and obtains the short-range item dependencies that are missed because the self-attention mechanism disperses the attention distribution. Self-attention obtains the one-to-one relationships between items by calculating a weight for each item, and obtains the dependencies between distant items that TCN cannot reach because of its limited receptive field. Unlike previous models [8], [9], [10], [12], [13], which combine the current preference with a global preference obtained through an attention mechanism, our model uses the embedding at the last position, obtained by aggregating all session items with the self-attention mechanism, to jointly represent the current and global preferences. Finally, we replace the inner product commonly used in previous methods with cosine similarity to calculate the similarity score. Because the values of each dimension of the session representation are constrained to a unit hypersphere, cosine similarity effectively measures the similarity between the final session representation and each item representation, which yields a more reasonable recommendation score. The contributions of our work are summarized as follows:

  • To reduce the negative impact of biased item embeddings on session recommendation, we propose Session-based Recommendation with Temporal Convolutional Network to Balance Numerical Gaps (TCNBNG). TCNBNG normalizes the embedding results of the session data with L2 normalization. This simple but effective method solves the problem of unstable gradients caused by large differences among the values of each dimension of the embedding-layer output, which previous methods have not considered, and eliminates its negative impact on prediction accuracy.

  • To make full use of the information in the session sequence and learn the session representation accurately, we use TCN and the self-attention mechanism to learn session features. TCN learns the session representation comprehensively and captures the short-distance item dependencies that the self-attention mechanism cannot learn completely. The self-attention mechanism learns the relationship between every pair of items in the session and obtains the dependencies between distant items that TCN cannot reach because of its limited receptive field.

  • Finally, we compare our TCNBNG model with other state-of-the-art methods on three benchmark datasets. The experimental results show that, compared with the state-of-the-art methods, the proposed TCNBNG model achieves improvements of 2.7% on the Diginetica dataset, 5.4% on the RetailRocket dataset, and 1.3% on the Yoochoose1/64 dataset. These results verify that TCNBNG is more effective than the existing methods.
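
The normalization and scoring steps summarized in the contributions above can be sketched as follows. This is a simplified illustration, not the TCNBNG implementation: the embedding values are hypothetical toy numbers with one deliberately large dimension, the self-attention has no learned query, key, or value projections, and the TCN branch is omitted.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm so that it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def last_position_attention(seq):
    """Scaled dot-product self-attention output at the last position only."""
    query = seq[-1]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in seq])
    return [sum(w * v[i] for w, v in zip(weights, seq)) for i in range(len(query))]


def score_items(session_repr, item_table):
    """Cosine similarity between the session representation and every item."""
    s = l2_normalize(session_repr)
    return {item: dot(s, l2_normalize(vec)) for item, vec in item_table.items()}


# Hypothetical raw embeddings: the second dimension dominates the others.
item_table = {
    "a": [0.1, 9.0, 0.2],
    "b": [0.3, 8.0, -0.1],
    "c": [-0.2, 7.5, 0.4],
    "d": [0.4, -6.0, 0.1],
}
session = ["a", "b"]
seq = [l2_normalize(item_table[i]) for i in session]
session_repr = last_position_attention(seq)
scores = score_items(session_repr, item_table)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because both sides are unit vectors, every score lies in [-1, 1] regardless of how large the raw embedding values are, whereas an unnormalized inner product would grow with the dominant dimension.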

The rest of this paper is organized as follows. In Section 2, we introduce the related work. In Section 3, we analyze the embedding results of the datasets and discuss how the embedding gaps negatively affect the results of the model. In Section 4, we introduce the structure and mathematical model of TCNBNG. In Section 5, we describe the experimental setup, discuss the results, and demonstrate the validity of the model. Finally, in Section 6, we summarize the paper and outline directions for future work.

Related Work

Since session-based recommendation uses only anonymous user click sequence data, the available information is very limited. Thus, session-based recommendation is still a challenging task. In this section, we briefly review the traditional methods and the neural-network-based methods for solving this problem.

Conventional Methods: Owing to the lack of available user information in session-based recommendation, traditional simple matrix factorization [21], [22] and Item-KNN [23] cannot use the

Large gaps in values of each dimension in embedded results

In this section, we analyze the datasets used in this paper to show the large gaps observed in the values of each dimension of the embedding results, and discuss their negative impact on the model results.

To alleviate the vanishing and exploding gradient problems that often appear when training deep learning models, model parameters are usually initialized with zero-centered, independent and identically distributed values. However, if the output of

Proposed Method

In this section, we introduce our Session-based Recommendation based on Temporal Convolutional Network to Balance Numerical Gaps (TCNBNG) model (as shown in Fig. 3). We first briefly describe the problem of session-based recommendation, and then describe the specific structure of the model in detail.

Datasets and Evaluation Metrics

Datasets: We conducted experiments on three real-world datasets that have also been used in previous works [8], [10], [11], [12], [13] to validate the effectiveness of our proposed TCNBNG model: (1) Yoochoose, (2) Diginetica, and (3) RetailRocket. The Yoochoose dataset, downloaded from the RecSys Challenge 2015, contains the click streams of users on e-commerce

Conclusions

In this work, we propose a Session-based Recommendation with Temporal Convolutional Network to Balance Numerical Gaps (TCNBNG). We first use L2 normalization to normalize the embedding results so that the values of each dimension are constrained to a mutually balanced interval. Then, a Temporal Convolutional Network and a self-attention network are used to learn the session representation in a complementary way. Finally, the recommendation score is calculated by optimizing cosine similarity instead of inner

CRediT authorship contribution statement

Weinan Li: Conceptualization, Methodology, Software, Investigation, Formal analysis, Writing - original draft. Jin Gou: Supervision, Project administration, Validation. Zongwen Fan: Visualization, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Weinan Li is currently a master's student in the College of Computer Science and Technology, Huaqiao University, Xiamen, China. He received the B.S. degree in Food Science and Technology from the Northeast Agricultural University, Harbin, China, in 2018. His research interests include recommendation systems and data mining.

References (31)

  • Y. Wu et al., Leveraging neighborhood session information with dual attentive neural network for session-based recommendation, Neurocomputing (2021)
  • C. Elie-Dit-Cosaque, Information overload, Wiley Encyclopedia of Management (2015)
  • J.A. Jacobi, E.A. Benson, G.D. Linden, Recommendation system, US...
  • S. Wang et al., A survey on session-based recommender systems, ACM Comput. Surv. (2021)
  • M. Eirinaki et al., Web path recommendations based on page ranking and Markov models
  • D.T. Le et al., Modeling sequential preferences with dynamic user and context factors, Machine Learning and Knowledge Discovery in Databases (2016)
  • S. Rendle et al., Factorizing personalized Markov chains for next-basket recommendation
  • Z. Zhang et al., Efficient hybrid web recommendations based on Markov clickstream models and implicit search
  • B. Hidasi, A. Karatzoglou, L. Baltrunas, D. Tikk, Session-based recommendations with recurrent neural networks, arXiv...
  • J. Li et al., Neural attentive session-based recommendation
  • Q. Liu, Y. Zeng, R. Mokhosi, H. Zhang, Stamp: Short-term attention/memory priority model for session-based...
  • S. Wu et al., Session-based recommendation with graph neural networks
  • C. Xu et al., Graph contextualized self-attention network for session-based recommendation
  • S. Bai, J.Z. Kolter, V. Koltun, An empirical evaluation of generic convolutional and recurrent networks for sequence...
  • J. You, Y. Wang, A. Pal, P. Eksombatchai, C. Rosenburg, J. Leskovec, Hierarchical temporal convolutional networks for...

Jin Gou was born in 1978. He received the Ph.D. degree in computer science and technology from Zhejiang University, China, in 2006. He is currently a Professor with Huaqiao University, Xiamen, China. His main research interests include knowledge fusion and artificial intelligence.

Zongwen Fan received both his B.S. and M.Sc. degrees from Huaqiao University, China, in 2014 and 2017, respectively. He is currently a Ph.D. candidate with the University of Newcastle, Australia. His main research interests include machine learning, data mining, and fuzzy computing.
