Kernel meets recommender systems: A multi-kernel interpolation for matrix completion

https://doi.org/10.1016/j.eswa.2020.114436

Highlights

  • Propose a kernelized matrix completion framework via multi-kernel interpolation.

  • Learn an effective low-dimensional representation in an infinite Hilbert space.

  • Provide a feasible solution to make the raw input data linearly separable.

  • Present an auto-weighted method for multi-kernel representation and fusion.

Abstract

A primary research direction for recommender systems is matrix completion, which attempts to recover the missing values in a user–item rating matrix. Numerous approaches exist for rating tasks, and they are mainly classified into latent factor models and neighborhood-based models. Most neighborhood-based models seek similar neighbors by computing similarities in the original data space for the final predictions. In this paper, we propose a new neighborhood-based interpolation model within a kernelized matrix completion framework, in which the impact weights contributed by neighbors are computed in a new Hilbert space that contains richer features. In our model, the kernel function is combined with a similarity measurement to achieve a better approximation of the unknown ratings. Furthermore, we extend our model with a non-linear multi-kernel framework that learns the kernel weights automatically. Finally, we conduct extensive experiments on several real-world datasets. The outcomes show that the proposed methods work effectively and improve the performance of the rating prediction task compared with both traditional and state-of-the-art approaches.

Introduction

Recommender systems are widely employed in various spheres and have become a popular research topic in recent decades (Hwang et al., 2016, Qian et al., 2019, Wang, Zhou and Lu, 2019). For instance, the media-services provider Netflix held the Netflix Prize competition to explore algorithms for predicting user ratings of movies. This task can also be cast as a matrix completion problem, which recovers missing ratings in a rating matrix. Table 1 provides a simple example of an incomplete user–item rating matrix; most of its ratings are missing. In real-world datasets, rating matrices are even sparser, which leads to the cold-start problem in recommender systems. The primary target of matrix completion is to recover the missing entries of the user–item rating matrix. A classical solution is nonnegative matrix factorization (NMF) (Lee & Seung, 1999), which factorizes the incomplete matrix into two low-rank matrices. A variety of other methods have been proposed for matrix completion. Kang et al. (2016) completed the rating matrix under a low-rank assumption, adopting a nonconvex rank relaxation to achieve a better rank approximation. Xue et al. (2017) leveraged two parallel deep neural networks to factorize a user–item interaction matrix and predict the unknown ratings. Inspired by word embedding models, Liang et al. (2016) jointly factorized the user–item interaction matrix and the item–item co-occurrence matrix with shared item latent factors.
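To make the factorization idea concrete, the following is a minimal sketch of NMF-style completion restricted to the observed entries, assuming numpy. The function name, learning rate, and projected-gradient updates are illustrative choices, not the multiplicative-update formulation of Lee & Seung (1999).

```python
import numpy as np

def nmf_complete(Y, mask, rank=10, iters=200, lr=0.005, reg=0.02, seed=0):
    """Masked NMF sketch: fit nonnegative factors W (users x rank) and
    H (rank x items) to the observed entries of Y, then return W @ H
    as a dense estimate of the complete matrix."""
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(iters):
        E = mask * (Y - W @ H)           # reconstruction error on observed entries only
        W += lr * (E @ H.T - reg * W)    # gradient step for user factors
        H += lr * (W.T @ E - reg * H)    # gradient step for item factors
        W = np.maximum(W, 0)             # project back onto the nonnegative orthant
        H = np.maximum(H, 0)
    return W @ H
```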

Collaborative filtering (CF) has been widely investigated by many researchers and is a mature technique that has been utilized extensively in industry (Chen et al., 2017, Chen et al., 2019, Wang, Zhou, Chen et al., 2019). The algorithm attempts to uncover the hidden relationships between users and items in a data-driven manner and recommends similar items to users with similar interests. There are two primary types of CF: latent factor models (LFMs) and neighborhood-based models (NBMs). LFMs discover the latent features of users or items and project them into feature vectors that are generally low-dimensional. Matrix factorization (MF) is a typical LFM, which factorizes the raw rating matrix into two low-rank matrices known as the user latent matrix and the item latent matrix. The unknown ratings are predicted by the dot product of the corresponding latent vectors. A large number of MF-based models have been proposed in recent decades. For example, Koren (2008) proposed a singular value decomposition (SVD) based model named SVD++ that considered the influence of the neighborhood. Ning and Karypis (2011) presented a sparse linear model (SLIM) that explored an item–item similarity matrix by factorizing the original user–item interaction matrix. Wang et al. (2018) employed a confidence-aware MF framework to optimize both the precision of rating estimation and the prediction confidence.
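The LFM prediction rule itself is a one-liner. Below is a hedged illustration of a biased MF predictor in the style of SVD-type models; the bias terms and factor matrices are assumed to have been learned already (e.g., by stochastic gradient descent), and all names are hypothetical.

```python
import numpy as np

def predict_rating(mu, b_user, b_item, P, Q, u, i):
    """Biased MF prediction: global mean + user bias + item bias
    + dot product of the corresponding latent vectors.
    P is (num_users, rank); Q is (num_items, rank)."""
    return mu + b_user[u] + b_item[i] + P[u] @ Q[i]
```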

Different from LFMs, NBMs aim to find similar users or items by computing the similarities among them and make estimations by considering the influence or contribution of each neighbor. A classic NBM algorithm is the k-nearest neighbors (KNN) approach (Sarwar et al., 2001). Item-based KNN models calculate the similarities among items and then sort them for top-k recommendations. For example, Park et al. (2015) proposed a KNN-based CF model named reversed CF, which utilized a KNN graph to locate the nearest neighbors of the rated items. For rating tasks, models commonly compute an unknown rating as a weighted average of other existing ratings, where each neighbor contributes to the final estimate according to its similarity to the target. Fig. 1 illustrates this scheme. By measuring the similarities or distances between the point to be predicted and the existing points, we can select the most relevant points to estimate the unknown value. Accordingly, a smaller distance between data points should lead to a stronger influence, meaning that similar points contribute more to the recovery of unknown data.
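A minimal sketch of this similarity-weighted average for a user-based NBM, assuming numpy, a boolean observation mask, and a precomputed user–user similarity matrix; the fallback rules for cold cases are our own illustrative choices.

```python
import numpy as np

def knn_predict(Y, mask, sim, u, i, k=20):
    """Estimate rating (u, i) as a similarity-weighted average over the
    k most similar users who have rated item i.
    Y: ratings matrix; mask: boolean, True on observed entries;
    sim: precomputed user-user similarity matrix."""
    raters = np.where(mask[:, i] & (np.arange(Y.shape[0]) != u))[0]
    if raters.size == 0:
        return float(Y[mask].mean())              # no raters: fall back to global mean
    top = raters[np.argsort(-sim[u, raters])][:k] # k most similar raters of item i
    w = sim[u, top]
    if np.allclose(w.sum(), 0):
        return float(Y[top, i].mean())            # degenerate weights: plain average
    return float(w @ Y[top, i] / w.sum())         # similarity-weighted average
```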

Kernel learning is a technique that applies kernel functions to map raw data into a high-dimensional space without explicitly computing the corresponding projection functions. It is best known from support vector machines (SVMs), where it makes linearly inseparable raw data separable in a high-dimensional space. Among the various kernel functions, radial basis function (RBF) kernels such as the Gaussian kernel are the most widely used and are often leveraged to train RBF networks. RBF kernels have been extensively applied in recommender systems and improve the performance of MF-based approaches (Liu et al., 2016, Pal and Jenamani, 2018, Zhou et al., 2012). Because RBF kernels can calculate the similarities among samples, and the performance of NBMs is closely tied to the similarity metric, we consider them a powerful technique for improving NBMs as well. Indeed, RBF kernels have been applied in many fields, such as feature selection (Kuo et al., 2013), clustering (Cruz et al., 2016) and image processing (Romani et al., 2019), owing to their ability to measure similarity. Nevertheless, to our knowledge, few studies have been devoted to applying RBF kernels to NBMs for recommender systems.
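The Gaussian kernel $k(x_a, x_b) = \exp(-\gamma \|x_a - x_b\|^2)$ turns distances into similarities in $(0, 1]$, with closer points scoring higher. A minimal numpy sketch of the full kernel matrix follows; $\gamma$ is a bandwidth hyperparameter, and the vectorized distance computation is a standard trick rather than anything specific to this paper.

```python
import numpy as np

def gaussian_kernel(X, gamma=0.5):
    """RBF (Gaussian) kernel matrix K[a, b] = exp(-gamma * ||x_a - x_b||^2)
    for the rows of X. Entries lie in (0, 1] and grow as two points get
    closer, so the kernel doubles as a similarity measure."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # pairwise squared distances
    return np.exp(-gamma * np.maximum(d2, 0))      # clip tiny negatives from roundoff
```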

In this paper, we propose a new kernel-based matrix completion (KMC) framework for recommender systems, which addresses rating tasks with NBMs on a user–item interaction matrix. The model applies RBF kernels that are reformulated with similarity measures and estimates the rating a user would give a specific item. Derived from the interpolation condition, the proposed KMC admits a closed-form solution computed from kernel matrices, which speeds up the rating predictions for a specific user or item (a minimal sketch of this interpolation idea is given after the contribution list below). Moreover, we improve this model with a multi-kernel framework for KMC (M-KMC) that merges the different features of the latent spaces generated by diverse kernels. Different from the extensively used linear combination of kernels, M-KMC applies a non-linear auto-weighted strategy to merge kernels. In summary, our contributions are as follows:

  1. We propose a kernelized model with a closed-form solution for matrix completion, which applies the interpolation method for rating prediction.

  2. In our proposed model, the similarity metric is combined with the Gaussian kernel to compute the weights of neighbors, which yields a more precise approximation of unknown ratings.

  3. M-KMC is presented with the multi-kernel framework, which adaptively adjusts the weights of the multiple kernel functions and improves the performance of KMC.

  4. We conduct extensive experiments on KMC and M-KMC and discuss the effect of different parameters. Our models achieve performance that is competitive with or superior to traditional and state-of-the-art models.
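As promised above, here is a minimal sketch of the closed-form interpolation idea: under the interpolation condition, the neighbor weights $w$ solve $Kw = y$ on the observed ratings, and a new rating is predicted as a kernel-weighted combination of neighbors. This reflects standard RBF interpolation under our reading of the framework; the paper's exact KMC formulation, with its similarity-reformulated kernels and multi-kernel fusion, may differ, and the small ridge term is our own numerical-stability assumption.

```python
import numpy as np

def kernel_interpolate(K_nn, y_n, k_target, ridge=1e-6):
    """Kernel interpolation for one target rating.

    K_nn     : kernel matrix among the n neighbors (n x n)
    y_n      : the neighbors' observed ratings (n,)
    k_target : kernel values between the target point and each neighbor (n,)
    ridge    : small diagonal term for numerical stability (an assumption,
               not necessarily part of the paper's closed form)
    """
    n = K_nn.shape[0]
    w = np.linalg.solve(K_nn + ridge * np.eye(n), y_n)  # weights from K w = y
    return float(k_target @ w)                          # predicted rating
```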

Section snippets

Neighborhood-based models

NBMs are commonly used techniques in recommender systems. These models compute the similarities or correlations among different users or items, based on rating records or extracted latent features. A common metric is the cosine similarity, which calculates the cosine of the angle between two vectors. For a user-based similarity measure, assume that $I_{uv} = \{1, \ldots, n\}$ is the set of items that both user $u$ and user $v$ have co-rated; then the vectors $Y_u = \{y_{u1}, \ldots, y_{un}\}$ and $Y_v = \{y_{v1}, \ldots, y_{vn}\}$ are the rating vectors
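The snippet cuts off here, but the standard user-based cosine similarity on co-rated items is $\cos(u, v) = \frac{Y_u \cdot Y_v}{\|Y_u\| \, \|Y_v\|}$. A minimal sketch under that standard definition (all names are illustrative):

```python
import numpy as np

def cosine_similarity_corated(Y, mask, u, v):
    """User-based cosine similarity restricted to the items that
    both user u and user v have rated (the set I_uv)."""
    co = mask[u] & mask[v]                 # boolean indicator of co-rated items
    if not co.any():
        return 0.0                          # no overlap: no evidence of similarity
    yu, yv = Y[u, co], Y[v, co]
    denom = np.linalg.norm(yu) * np.linalg.norm(yv)
    return float(yu @ yv / denom) if denom > 0 else 0.0
```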

Our proposed models

Before describing our proposed methods, we first explain the primary mathematical notations used in this section. The set $\mathbb{R}^{m \times n}$ is the space of $m \times n$ real matrices. Assume there are $m$ users and $n$ items in the dataset; all observed user–item rating pairs are stored in the set $\Omega = \{(u, i) \mid y_{ui} \text{ is observed}\}$, and the set $\bar{\Omega}$ represents the pairs whose ratings are missing. Then
$$Y_{ui} = \begin{cases} y_{ui}, & (u, i) \in \Omega \\ \text{null}, & (u, i) \in \bar{\Omega} \end{cases}$$
denotes the user–item interaction matrix. The similarity between two data points is
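In code, this notation corresponds to a partially observed matrix plus an observation mask. A small helper is sketched below, assuming observed ratings arrive as (u, i, y) triples and using NaN for the null entries; both choices are ours, not the paper's.

```python
import numpy as np

def build_interaction_matrix(ratings, m, n):
    """Assemble the user-item interaction matrix from observed (u, i, y)
    triples: Y[u, i] = y for (u, i) in Omega, NaN (null) elsewhere."""
    Y = np.full((m, n), np.nan)            # unobserved pairs (Omega-bar) stay null
    mask = np.zeros((m, n), dtype=bool)    # True exactly on Omega
    for u, i, y in ratings:
        Y[u, i] = y
        mask[u, i] = True
    return Y, mask
```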

Experiments and analysis

In this section, we conduct several experiments with our proposed KMC and M-KMC on real-world datasets from different recommendation environments. We compare performance under different parameter settings to analyze parameter sensitivity. Finally, we compare our proposed models with both traditional and state-of-the-art methods on the same metrics to demonstrate the feasibility of our models.

Conclusion

In this paper, we proposed a kernel-based framework, KMC, for neighborhood-based recommender systems, which aimed to recover missing ratings in the user–item interaction matrix. The model projected the original data into a new Hilbert space with RBF kernels and realized a local matrix approximation. Under the interpolation condition, the weights of the different neighbors were computed from kernel matrices, so that the final estimations were obtained as the dot product of weight vectors

CRediT authorship contribution statement

Zhaoliang Chen: Conceptualization, Formal analysis, Methodology, Writing - original draft. Wei Zhao: Conceptualization, Formal analysis, Methodology, Writing - revision. Shiping Wang: Funding acquisition, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (Nos. U1705262 and 61672159), the Technology Innovation Platform Project of Fujian Province, China (Nos. 2014H2005 and 2009J1007), the Fujian Collaborative Innovation Center for Big Data Application in Governments, and the Fujian Engineering Research Center of Big Data Analysis and Processing.

References (33)

  • Fan, J., et al. Polynomial matrix completion for missing data imputation and transductive learning.

  • Kang, Z., Peng, C., & Cheng, Q. (2016). Top-N recommender system via matrix completion. In Proceedings of the 30th AAAI...

  • Koren, Y. (2008). Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of...

  • Kuchaiev, O., & Ginsburg, B. (2018). Training deep autoencoders for recommender systems. In Proceedings of the 6th...

  • Kuo, B.-C., et al. (2013). A kernel-based feature selection method for SVM with RBF kernel for hyperspectral image classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

  • Lee, J., Kim, S., Lebanon, G., & Singer, Y. (2013). Local low-rank matrix approximation. In Proceedings of the 30th...