SDDRS: Stacked Discriminative Denoising Auto-Encoder based Recommender System
Introduction
Recently, with the development of online platforms, the amount of available information has been increasing explosively. To keep users from being overwhelmed by this huge amount of information, recommender systems are of significant importance. The goal of a recommender system is to filter out useless information based on users’ preferences, so as to help users find the items they are really interested in (Deng, Huang, Wang, Lai, & Yu, 2019). For instance, Zhang et al. (2016) propose a recommendation model that predicts the potential side effects of drugs so as to filter out drugs unsuitable for users, which can greatly help improve the efficiency of treatment. Hu et al. (2017, 2019) propose an item-oriented recommender system which can effectively enhance the revenue of manufacturers. The collaborative filtering (CF) based recommender system (Balabanovic & Shoham, 1997) is one of the most successful recommendation methods; it models user and item features based on the rating information. The CF model assumes that users with similar features share similar preferences for items, and therefore predicts a user’s attitude toward untouched items with the help of other similar users. Since the Netflix Prize (Koren, Bell, & Volinsky, 2009), the matrix factorization (MF) based collaborative filtering model has been widely adopted in many works (Bengio et al., 2013, Li et al., 2017, Xu et al., 2017, Zhao et al., 2017, Zhao et al., 2016, Zhao et al., 2017). Unlike traditional CF models (Balabanovic and Shoham, 1997, Sarwar et al., 2001), which directly adopt the vectorized rating information to represent the features of users and items, the MF based model assumes that those features can be represented by a small number of latent factors in a low-dimensional space.
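To make the latent-factor assumption concrete, the following is a minimal sketch of a generic regularized matrix factorization trained with stochastic gradient descent. It is not the model proposed in this paper; the toy ratings, the function names (factorize, predict, rmse), and all hyperparameters are assumptions chosen only for illustration.

```python
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Fit rating(u, i) ~ U[u] . V[i] on observed (user, item, rating) triples."""
    rng = random.Random(seed)
    # Small random initial latent factors (k values per user and per item).
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error with L2 regularization.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

def predict(U, V, u, i):
    """Predicted rating is the inner product of the two latent vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))

def rmse(ratings, U, V):
    """Root mean squared error over the given (user, item, rating) triples."""
    se = sum((r - predict(U, V, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

# Hypothetical toy data: 4 users, 4 items, ratings on a 1-5 scale.
ratings = [
    (0, 0, 5.0), (0, 1, 3.0), (0, 3, 1.0),
    (1, 0, 4.0), (1, 3, 1.0),
    (2, 0, 1.0), (2, 1, 1.0), (2, 3, 5.0),
    (3, 1, 1.0), (3, 2, 5.0), (3, 3, 4.0),
]
U0, V0 = factorize(ratings, 4, 4, epochs=0)  # untrained baseline
U, V = factorize(ratings, 4, 4)              # trained factors
print("train RMSE before:", rmse(ratings, U0, V0))
print("train RMSE after: ", rmse(ratings, U, V))
print("prediction for unseen pair (user 1, item 1):", predict(U, V, 1, 1))
```

Each user and each item is described by only k numbers, and an unobserved rating is estimated from the inner product of the two latent vectors. This is the low-dimensional representation that the paragraph above contrasts with using the raw rating vectors directly.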
Rating information plays an important role in CF models (Zhao, Wang, & Lai, 2016). There are two kinds of rating information: explicit ratings and implicit ratings (Xue, Dai, Zhang, Huang, & Chen, 2017). Explicit ratings are multi-valued and represent the preference of users or the popularity of items, while implicit ratings are binary and indicate whether users are interested in items. Both kinds of rating information can provide useful semantic and sentiment knowledge when modeling user and item features. However, most of the time, each user interacts with only a small fraction of the items, which usually causes the data sparsity problem (Burke, 2002). To alleviate this problem, side information is widely utilized to enrich the data sources of the recommender system. Side information is defined as relevant information from other sources, such as user reviews or textual descriptions of items (de Campos et al., 2010, Li et al., 2017, Wang et al., 2018). However, many works utilize the side information directly, without any preprocessing. Although side information contains rich semantic knowledge, it also involves other kinds of knowledge irrelevant to either the attitude of users or the popularity of items, which adds noise to the recommendation model. Fortunately, deep learning methods perform well in extracting denoised and effective features from noisy data. Therefore, more and more recent works focus on combining deep learning methods with recommendation models (Salakhutdinov et al., 2007, Wang et al., 2015, Wei et al., 2017).
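The distinction between the two kinds of ratings, and the sparsity they suffer from, can be illustrated with a small sketch. The toy matrix below and the convention that 0 marks "no interaction" are assumptions made for this example. Deriving the implicit matrix from the explicit one is just one common choice, not necessarily the construction used in this paper.

```python
# Hypothetical explicit rating matrix (rows: users, columns: items).
# Values 1-5 are multi-valued ratings; 0 means the user never rated the item.
explicit = [
    [5, 0, 3, 0],
    [0, 0, 4, 1],
    [2, 0, 0, 0],
]

def to_implicit(matrix):
    """Binary matrix: 1 if the user interacted with the item, else 0."""
    return [[1 if r > 0 else 0 for r in row] for row in matrix]

def sparsity(matrix):
    """Fraction of user-item pairs with no observed rating."""
    total = sum(len(row) for row in matrix)
    observed = sum(1 for row in matrix for r in row if r > 0)
    return 1 - observed / total

implicit = to_implicit(explicit)
print(implicit)            # [[1, 0, 1, 0], [0, 0, 1, 1], [1, 0, 0, 0]]
print(sparsity(explicit))  # 7 of 12 entries are unobserved
```

Even in this tiny example more than half of the entries are missing. Real rating matrices are typically far sparser, which is the motivation for bringing in side information.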
Among the many deep learning methods, one of the most popular models adopted in recommender systems is the Stacked Denoising Auto-Encoder (SDAE) (Vincent, Larochelle, Lajoie, Bengio, & Manzagol, 2010), owing to its convincing results in learning features from text data. For example, Wang et al. (2015), Wei et al. (2017), and Dong et al. (2017) choose SDAE to extract the content features of items and empirically validate its effectiveness by comparing with several state-of-the-art models. Both explicit and implicit ratings can provide useful information for training a recommender system. Unfortunately, most existing works integrate side information with only one of them and leave the other unused, which results in inefficient use of the data.
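As a rough illustration of the denoising idea behind SDAE, the sketch below trains a single denoising auto-encoder layer: the input is corrupted with masking noise, and the network is trained to reconstruct the clean input. This is a generic one-layer example rather than the SDDAE proposed in this paper; an SDAE stacks several such layers. The toy bag-of-words vectors, the class name, the layer sizes, and the learning rate are all assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def corrupt(x, rate, rng):
    """Masking noise: zero each entry independently with probability `rate`."""
    return [0.0 if rng.random() < rate else v for v in x]

class DenoisingAutoEncoder:
    """One sigmoid encoder layer and one sigmoid decoder layer."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def decode(self, h):
        return [sigmoid(sum(w * v for w, v in zip(row, h)) + b)
                for row, b in zip(self.W2, self.b2)]

    def loss(self, x):
        """Squared reconstruction error of a clean input."""
        y = self.decode(self.encode(x))
        return 0.5 * sum((a - b) ** 2 for a, b in zip(y, x))

    def train_step(self, x, x_noisy, lr):
        """Encode the corrupted input, but compare the output with the clean x."""
        h = self.encode(x_noisy)
        y = self.decode(h)
        d_out = [(yj - xj) * yj * (1 - yj) for yj, xj in zip(y, x)]
        # Hidden deltas are computed before W2 is modified.
        d_hid = [sum(self.W2[j][k] * d_out[j] for j in range(len(x))) * h[k] * (1 - h[k])
                 for k in range(len(h))]
        for j in range(len(x)):
            for k in range(len(h)):
                self.W2[j][k] -= lr * d_out[j] * h[k]
            self.b2[j] -= lr * d_out[j]
        for k in range(len(h)):
            for i in range(len(x)):
                self.W1[k][i] -= lr * d_hid[k] * x_noisy[i]
            self.b1[k] -= lr * d_hid[k]

# Hypothetical binary bag-of-words vectors for four item descriptions.
docs = [
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
]
rng = random.Random(1)
dae = DenoisingAutoEncoder(n_in=6, n_hidden=2)
loss_before = sum(dae.loss(x) for x in docs)
for _ in range(2000):
    for x in docs:
        dae.train_step(x, corrupt(x, 0.3, rng), lr=0.5)
loss_after = sum(dae.loss(x) for x in docs)
print("reconstruction loss before/after:", loss_before, loss_after)
print("learned feature of first item:", dae.encode(docs[0]))
```

The hidden activations returned by encode play the role of the learned content features. Because training must undo the corruption, the encoder is pushed toward features that are robust to noise in the input text.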
In this paper, we propose an SDAE based classification model and incorporate it into the MF based recommender system to tightly couple the above three kinds of information, namely explicit ratings, implicit ratings, and side information. Owing to the discriminative property of the classification model, the proposed SDAE based classification model is named Stacked Discriminative Denoising Auto-Encoder (SDDAE), and our recommender system is named Stacked Discriminative Denoising Auto-Encoder based Recommender System (SDDRS). In the following sections, for ease of description, we first introduce two submodels: one combines the explicit rating with side information, while the other combines the implicit rating with side information. After that, both submodels are integrated to construct the SDDRS.
In summary, the contributions of this paper can be listed as follows:
- To the best of our knowledge, SDDRS is the first model which can smoothly integrate three kinds of information into the trained features. Besides, SDDRS can be considered a more general framework, which contains three components to process the explicit rating information, the implicit rating information, and the side information. Therefore, when facing different problems, it can be adapted into more suitable models by using different submodels for each component.
- By adopting both explicit rating and implicit rating information, the efficiency of rating data utilization is improved.
- Extensive experiments on three real-world datasets show that SDDRS outperforms state-of-the-art recommender systems.
The organization of this paper is as follows. In Section 2, we briefly introduce the related work. In Section 3, the SDAE model is briefly introduced first, and then we present the proposed SDDRS model. In Section 4, we present the experimental results and provide a detailed analysis. Finally, we draw conclusions and discuss future work in Section 5.
The main results in this paper were first presented in Wang, Xu, Huang, Wang, and Lai (in press).
Matrix factorization model
Recently, MF based models have been widely adopted in recommender systems. In Salakhutdinov and Mnih (2007), Salakhutdinov et al. propose the PMF model by placing Gaussian priors on both the rating matrix and the factorized features, which can be regarded as a Bayesian version of the SVD model. In Koren et al. (2009), Koren et al. propose SVD++ by adding bias factors to the original SVD method, which is much more suitable for CF based recommender systems. In Ning and Karypis
Stacked discriminative denoising auto-encoder for recommender system
In this section, the proposed SDDRS model is presented. For ease of description, we first briefly introduce the SDAE model. Then, we describe the newly proposed SDDRS method in detail. In SDDRS, the MF based recommender system is integrated to form a global model by using the hidden layer and the item latent feature vector as bridges.
To unify the notation, in this paper, we use bold uppercase letters, or bold uppercase letters with a single subscript, to denote matrix variables, like the rating matrix
Experiments
In this section, we conduct experiments on three datasets and demonstrate the performance of our model by comparing it with several state-of-the-art models.
Conclusion
There are three kinds of information commonly used in recommender systems: explicit ratings, implicit ratings, and side information. Among them, the side information helps alleviate the data sparsity problem, the implicit rating directly shows whether users are interested in items, and the explicit rating reflects the preference of users and the popularity of items. The goal of this paper is to tightly couple these three kinds of information. To this end, we propose a
Acknowledgments
This work was supported by NSFC (61876193), Guangdong Natural Science Funds for Distinguished Young Scholar (2016A030306014), and Tip-top Scientific and Technical Innovative Youth Talents of Guangdong special support program (2016TQ03X542).
Conflicts of interest
None.
References (47)
- et al. (2010). A hybrid content-based and item-based collaborative filtering approach to recommend TV programs enhanced with singular value decomposition. Information Sciences.
- et al. (2010). Combining content-based and collaborative recommendations: A hybrid approach based on Bayesian networks. International Journal of Approximate Reasoning.
- et al. (2019). Item orientated recommendation by multi-view intact space learning with overlapping. Knowledge-Based Systems.
- et al. (2017). An item orientated recommendation algorithm from the multi-view perspective. Neurocomputing.
- et al. (2017). Collaborative filtering and deep learning based recommendation system for cold start items. Expert Systems with Applications.
- et al. (2016). Predicting potential side effects of drugs by recommender methods and ensemble learning. Neurocomputing.
- et al. (2017). Recommendation in feature space sphere. Electronic Commerce Research and Applications.
- et al. (2016). Tag-aware recommender systems based on deep neural networks. Neurocomputing.
- et al. (1997). Content-based, collaborative recommendation. Communications of the ACM.
- Bengio, Y., Yao, L., Alain, G., & Vincent, P. (2013). Generalized denoising auto-encoders as generative models. In NIPS...
- Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction.
- Marginalized denoising autoencoders for domain adaptation.
- Learning deep binary descriptor with multi-quantization. IEEE Transactions on Pattern Analysis and Machine Intelligence.
- Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research.
- Reducing the dimensionality of data with neural networks. Science.
- Auto-encoding variational Bayes.
- Matrix factorization techniques for recommender systems. IEEE Computer.