Improving neighbor-based collaborative filtering by using a hybrid similarity measurement

https://doi.org/10.1016/j.eswa.2020.113651

Abstract

Memory-based collaborative filtering is a recommendation system method that predicts a user’s rating or preference by exploring historic ratings, without incorporating any content information about users or items. It can be either item-based or user-based. Taking item-based collaborative filtering (CF) as an example, it makes predictions in two steps. First, based on pairwise similarities, it selects a number of items most similar to the item being predicted from those that the user has already rated. Second, it aggregates the user’s opinions on those most similar items to predict a rating for the item being predicted. Thus, the similarity measurement determines which items are similar and plays an important role in how accurate the predictions are. Many studies have been conducted on memory-based CFs to improve prediction accuracy, but none of them has achieved better prediction accuracy than state-of-the-art model-based CFs. In this paper, we propose a new approach that combines structural and rating-based similarity measurements. We find that, on the MovieLens and Netflix datasets, memory-based CF using the combined similarity measurement can achieve better prediction accuracy than model-based CFs in terms of lower MAE, and can reduce memory and time by using fewer neighbors than traditional memory-based CFs.

Introduction

Recommendation systems are a subclass of information filtering systems that aim to predict a user’s opinion of, or preference for, a topic or item, thereby providing personalized recommendations to users by exploiting historic data. They are widely used by e-commerce sites such as Amazon.com (Linden et al., 2003), online movie streaming companies such as Netflix (Bennett et al., 2007), and social media networks such as Facebook (Maja Kabiljo, 2015). With a large number and diversity of products, a recommendation system can also help streaming service providers or online vendors provide users with recommendations that are specific to their preferences. This can improve the user experience in searching for items or services and potentially lead users to make more purchases, watch more movies, or subscribe to more services. For example, data gathered over three weeks in the summer of 2001 showed that between 20% and 40% of sales on Amazon were due to recommended products that did not belong to the shop’s 100,000 most sold products (Brynjolfsson et al., 2003), and 60% of movies rented on Netflix are selected based on personalized recommendations. Furthermore, a recommendation system can generate not only more direct revenue, but also additional revenue by introducing shoppers to new categories (Dias et al., 2008). Hence, a recommendation system can significantly impact a company’s revenue (Lü et al., 2012). Note that a 1% average improvement in MAE (Mean Absolute Error) and RMSE (Root Mean Squared Error) may seem small, but it could result in a significant difference in the ranking of the “top-10” most recommended movies for an individual user (Koren, 2007).

Recommendation systems typically generate a list of recommendations to users in one of three ways (Su and Khoshgoftaar, 2009) – (1) collaborative filtering, (2) content-based filtering, or (3) a hybrid of those two approaches. Collaborative filtering (CF) analyzes historical data about user behavior to predict what they might like by learning interactions between users and items. Content-based filtering predicts by learning descriptions of items or profiles of users. Collaborative filtering can be memory-based or model-based. Memory-based approaches rely on pairwise similarities between vectors of ratings, while model-based approaches rely on factorizing the entire rating matrix. Both memory-based and model-based CFs can be implemented from an item-based or user-based perspective. A comparison is given in Section 1.1.

This paper takes the approach of item-based, memory-based collaborative filtering due to its simplicity, efficiency, and ability to produce accurate recommendations (Desrosiers and Karypis, 2011). It makes predictions in two steps. First, using a pairwise similarity measurement, it selects the K items most similar to the item being predicted from the items that the particular user has already rated. Second, it combines this user’s ratings on those K items to predict a rating for the item being predicted (more details are given in Section 2). In this work, we introduce a framework to combine similarity measurements between items. We compare the proposed algorithm against state-of-the-art collaborative filtering techniques using the MovieLens (Herlocker et al., 1999) and Netflix (Bennett et al., 2007) datasets. Our results indicate that the proposed algorithm achieves better prediction accuracy than state-of-the-art collaborative filtering techniques in terms of a lower MAE, while also requiring less wall time and computer memory.

As stated above, there are three typical approaches for recommendation systems: content-based filtering, collaborative filtering or a hybrid of those two approaches (Adomavicius and Tuzhilin, 2005; Su and Khoshgoftaar, 2009; Lü et al., 2012):

  • 1.

    Content-based filtering. These approaches use keywords or phrases to describe the contents of items, build user profiles from those keywords or phrases to indicate the types of content each user prefers, and then recommend a list of items that fits each user’s preference. Several techniques have been studied, such as pLSA (Probabilistic Latent Semantic Analysis) (Hofmann, 1999; Hofmann, 2004) and LDA (Latent Dirichlet Allocation) (Blei et al., 2003). While content-based methods incorporate descriptive information about items by characterizing them with keywords, they do not necessarily incorporate interactions with other individuals. Recommendations are made based solely on the content information of objects that the target user has rated in the past (Lü et al., 2012). Content-based filtering is widely used in a variety of domains, including recommending webpages, news articles, and restaurants (Pazzani and Billsus, 2007). For example, Pandora Radio recommends songs that share similar characteristics (Casey et al., 2008), and Rotten Tomatoes recommends movies that share similar cast and storyline. However, content-based filtering is unable to make a good recommendation that matches a user’s preference if the profiles of users or descriptions of items do not contain sufficient information to tell whether the user likes or dislikes the item (Pazzani and Billsus, 2007).

  • 2.

    Collaborative filtering (CF). These approaches analyze historical data on user activities and use it to predict what users might like, based on their similarity to other users or on items that are similar to the ones the user is known to like (Aggarwal et al., 2016). A key advantage of this approach is that CFs study only the interactions between individuals, without incorporating feature or attribute information about items and users. Thus, they do not require knowledge about the actual context of the data in order to make recommendations. That is, they can make a prediction without “understanding” a movie, a friend, or a piece of music. This approach can therefore be applied broadly, regardless of the contents of the data. However, when users have not rated a sufficient number of items, these methods may not perform very well; this is known as the cold start problem (Rubens et al., 2015; Schein et al., 2002). Collaborative filtering can also suffer from scalability or sparsity issues (Sarwar et al., 2001; details are listed in Section 1.2).

    Collaborative filtering can be implemented in two ways: user-based or item-based. To predict how a user u would rate an item i, user-based CF aggregates the opinions about item i from users that are similar to user u. It assumes that if two people share similar opinions on some items, they are likely to hold similar opinions on other items as well. Item-based CF, on the other hand, aggregates the opinions about the items that are similar to item i and have been rated by user u. It assumes that users are likely to hold the same opinions on similar items. For both item-based and user-based CFs, two approaches have been studied:

    • (a)

      Memory-based. This approach calculates similarities between items or users and uses them as weights on ratings, representing how well the opinions on items a user has rated reflect that user’s opinion on the unrated item. In general, it is effective and easy to implement. A typical example of this approach is K-Nearest-Neighbor (KNN) (Goldberg et al., 1992; Sarwar et al., 2001). First, item-based memory-based CFs find similar items by calculating pairwise similarities between the item being predicted and all other items the user has rated. These pairwise similarities are used to rank how representative each other item is of the item being predicted. Therefore, the prediction accuracy of collaborative filtering algorithms is highly dependent on how accurate the similarity measurement is. Second, based on the pairwise similarity measurement, the K items most similar to the item being predicted are selected from the items that the user has already rated, and the user’s opinions of those K items are combined by weighted average or weighted sum to predict the user’s rating for the unrated item. Such approaches are widely used due to their simplicity, explainability and effectiveness (Desrosiers and Karypis, 2011), and predictions can be made in real time as new rating data is added. However, prediction accuracy decreases when few items have been rated, and scalability is limited for large datasets (Lü et al., 2012).

    • (b)

      Model-based. This approach uses data mining and machine learning algorithms to develop and train predictive models. Many algorithms exist, such as Singular-Value Decomposition (SVD) (Koren, 2008) and Principal Component Analysis (PCA) (Tipping and Bishop, 1999), which use matrix factorization techniques. Model-based approaches usually apply dimensionality reduction to improve scalability and accuracy. This addresses the sparsity and scalability problems and yields better prediction accuracy than memory-based CFs (Su and Khoshgoftaar, 2009). However, the models usually rely on iterative methods to approximate their parameters, so they take more time to build and train than memory-based approaches. Moreover, they can lose useful information as a result of dimensionality reduction (Su and Khoshgoftaar, 2009).

  • 3.

    Hybrid. This strategy combines two or more of the above techniques (Burke, 2002; Su and Khoshgoftaar, 2006; Melville et al., 2002; Su et al., 2007), or combines them with other techniques such as deep learning (Wang and Wang, 2014) or clustering (Xue et al., 2005), to overcome the limitations of the individual techniques and improve performance measures such as prediction accuracy and scalability. However, it increases computational cost in terms of the time and resources required (Su and Khoshgoftaar, 2009).
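
The two-step item-based, memory-based prediction described under (a) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: ratings are held in a dict of dicts, plain cosine similarity over co-rating users stands in for the similarity measurement, and the names `cosine_sim` and `predict` are hypothetical.

```python
from math import sqrt

def cosine_sim(ratings, a, b):
    """Cosine similarity between items a and b over users who rated both."""
    common = [u for u in ratings if a in ratings[u] and b in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][a] * ratings[u][b] for u in common)
    norm_a = sqrt(sum(ratings[u][a] ** 2 for u in common))
    norm_b = sqrt(sum(ratings[u][b] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def predict(ratings, user, target, k=2):
    """Predict the user's rating of `target` from the k most similar rated items."""
    # Step 1: rank the items this user has rated by similarity to the target.
    sims = [(cosine_sim(ratings, target, j), j)
            for j in ratings[user] if j != target]
    neighbors = sorted(sims, reverse=True)[:k]
    # Step 2: similarity-weighted average of the user's ratings on those items.
    weight = sum(abs(s) for s, _ in neighbors)
    if weight == 0:
        return None  # no usable neighbor (assumed fallback: no prediction)
    return sum(s * ratings[user][j] for s, j in neighbors) / weight

# Toy data (illustrative only): user -> {item: rating}
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "C": 2, "D": 5},
    "carol": {"A": 1, "B": 2, "C": 5, "D": 1},
    "dave": {"A": 5, "B": 5, "D": 4},
}
prediction = predict(ratings, "alice", "D", k=2)
```

In this toy data, item D is most similar to items A and B, which alice rated highly, so the predicted rating falls between her ratings for those two items. A user-based variant would swap the roles of users and items.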

In recent years, deep neural networks have achieved immense success in computer vision and natural language processing. Graph Convolutional Networks (GCNs) have also been applied successfully to recommendation systems. The basic idea of GCNs is to train the model iteratively over multiple layers, with two steps at each layer: (1) node embedding with convolutional neighborhood aggregation; (2) non-linear transformation of the node embeddings, parameterized by a neural network (Chen et al., 2020). Neural network-based Collaborative Filtering (NCF) is a general framework that learns user-item interactions via a multi-layer perceptron (He et al., 2017). Neural Graph Collaborative Filtering (NGCF) introduces an embedding propagation layer to leverage high-order connectivities in the user-item integration graph of model-based CF (Wang et al., 2019). The Multi-Component graph convolutional Collaborative Filtering (MCCF) approach distinguishes the latent purchasing motivations (Wang et al., 2019). These methods have shown improvements over state-of-the-art methods in prediction accuracy. Other works improve the scalability of GCN algorithms by reducing their complexity: PinSage combines random walks and graph convolutions to incorporate both graph structure and node feature information (Ying et al., 2018), while Simple Graph Convolution (SGC) (Wu et al., 2019) and Linear Residual Graph Convolutional Collaborative Filtering (LR-GCCF) (Chen et al., 2020) remove the non-linearities. Although most GCN-based approaches, including GCN-based recommendation models, achieve their best performance with two layers (Kipf and Welling, 2017; Hamilton et al., 2017; Ying et al., 2018), they still require training the model iteratively with all user-item interaction information, as model-based approaches do.
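
The two per-layer steps named above (neighborhood aggregation, then a parameterized non-linear transformation) can be illustrated with a small pure-Python sketch. It is a simplified illustration rather than any of the cited models: it assumes mean aggregation over neighbors plus a self-loop instead of the symmetric normalization often used in GCNs, a fixed weight matrix rather than a trained one, and hypothetical names `matmul` and `gcn_layer`.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj, features, weights):
    """One simplified graph-convolution layer."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own embedding.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Row-normalize: each node averages over itself and its neighbors.
    a_norm = [[v / sum(row) for v in row] for row in a_hat]
    # Step 1: neighborhood aggregation of node embeddings.
    aggregated = matmul(a_norm, features)
    # Step 2: linear map followed by a ReLU non-linearity.
    transformed = matmul(aggregated, weights)
    return [[max(0.0, v) for v in row] for row in transformed]

# Toy user-item graph (illustrative): nodes 0 and 1 are users, node 2 is an item.
adj = [[0, 0, 1],
       [0, 0, 1],
       [1, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity, so only aggregation is visible
layer_out = gcn_layer(adj, features, weights)
```

Stacking two such layers lets a user's embedding absorb information from other users who interacted with the same items. Variants that remove the non-linearities, such as those cited above, would skip the ReLU in step 2.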

We focus our approach on memory-based collaborative filtering since it can be generalized to any relational data without knowing the content of the data. However, collaborative filtering faces some fundamental challenges in predicting accurate ratings in real time.

In practice, recommendation systems are used with very large datasets such as those from Amazon (Linden et al., 2003), Facebook (Maja Kabiljo, 2015) or Netflix (Bennett et al., 2007). There are usually at least tens of thousands of items and millions of users, but most users review only a few items relative to the total number of items. One typical challenge introduced by data sparsity is the cold start problem (Rubens et al., 2015; Schein et al., 2002). Since collaborative filtering methods make recommendations based on users’ past preferences, new users need to rate a sufficient number of items before the recommendation system can learn their preferences and make reliable recommendations. Collaborative filtering methods are generally unable to make accurate recommendations if users have rated only very few items.

As the number of users and items can grow extremely large, traditional collaborative filtering methods suffer from scalability problems. Reacting to new user ratings in real time to make an updated recommendation is challenging for model-based approaches, since they use the entire dataset to train. In practice, however, most users have reviewed only a few items relative to the total number of items (Linden et al., 2003), and memory-based methods can react to new ratings and make a prediction in real time even for extremely large datasets (Sarwar et al., 2001; Linden et al., 2003). While model-based approaches can mitigate scalability problems by using dimensionality reduction techniques such as SVD (Billsus and Pazzani, 1998), they suffer from computationally expensive matrix factorization and may lose useful information in the process. Thus, there are tradeoffs between scalability and performance for model-based approaches (Su and Khoshgoftaar, 2009).

Collaborative filtering needs to calculate similarities between items or users in order to identify similar items or users. Those pairwise similarities are calculated in high dimensions since there are many users and items. With a fixed number of training samples, predictive power decreases as the dimensionality increases, which is known as the Hughes Phenomenon (Hughes, 1968). Two outcomes may result (Nanopoulos et al., 2009):

  • 1.

    Concentration. Similarities between all users or items become nearly the same, so memory-based CF is unable to find the most similar items or users and thus cannot make a reliable prediction.

  • 2.

    Hubness. Some items occur more frequently than others in other items’ nearest neighbor lists. Those items are usually highly rated, popular items; since they may be liked by many users, they contribute no personal preference information to recommendations. They can behave like noise, preventing memory-based CF from making accurate predictions.

Memory-based collaborative filtering methods that use cosine-like similarity measurements to calculate pairwise similarities suffer from hubness and concentration problems caused by the high dimensionality of the data. Model-based methods with dimension reduction methods such as SVD cannot solve those problems either (Nanopoulos et al., 2009). Hubness starts to decrease only when the intrinsic dimensionality is reached, at which point further reduction may incur loss of information (Nanopoulos et al., 2009). Concentration and hubness are inherent properties of high dimensionality, not of properties such as sparsity or skewness of the distribution of ratings (Nanopoulos et al., 2009). Both phenomena can negatively affect the accuracy of predictions since they impact the representativeness of nearest neighbor lists (Nanopoulos et al., 2009). While reducing hubness by using mutual proximity as a similarity measurement can improve prediction performance, the resulting accuracy cannot rival that of state-of-the-art model-based approaches (Schnitzer et al., 2012; Knees et al., 2014).
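
One common way to inspect hubness is to count how often each item appears in the k-nearest-neighbor lists of the other items (often called the k-occurrence count). The sketch below illustrates that diagnostic only; it is not taken from the paper, and the function name `k_occurrence`, the dict-of-dicts similarity layout, and the toy similarity values are all assumptions.

```python
from collections import Counter

def k_occurrence(sim, k):
    """Count how often each item appears in other items' k-NN lists.

    `sim[i][j]` is the similarity between items i and j (assumed layout).
    """
    counts = Counter({i: 0 for i in sim})
    for i, row in sim.items():
        # Rank all other items by decreasing similarity to item i.
        ranked = sorted((j for j in row if j != i),
                        key=lambda j: row[j], reverse=True)
        counts.update(ranked[:k])
    return counts

# Toy similarities (illustrative): item "H" is fairly similar to everything.
sim = {
    "H": {"A": 0.9, "B": 0.8, "C": 0.7},
    "A": {"H": 0.9, "B": 0.2, "C": 0.1},
    "B": {"H": 0.8, "A": 0.2, "C": 0.3},
    "C": {"H": 0.7, "A": 0.1, "B": 0.3},
}
occurrences = k_occurrence(sim, k=1)
```

Here item "H" is the nearest neighbor of every other item, while "B" and "C" appear in no list at all. A strongly skewed distribution of these counts is the signature of hubness described above.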

In this paper, we review traditional memory-based collaborative filtering methods and propose a new approach. We discuss the problems that traditional similarity measurements try to solve, and the problems each has not overcome (Section 3). We study how hubness appears in nearest neighbor lists when rating-based or structural similarity measurement is used alone. The main contributions of this paper are:

  • 1.

    We propose a similarity measurement framework that combines rating-based similarity measurements and structural similarity measurements to overcome the limitations of using either of those two measurements alone, and the problems of hubness (Section 3).

  • 2.

    We compare two benchmarks for prediction accuracy evaluation, MAE (Mean Absolute Error) and RMSE (Root Mean Squared Error) (Section 4.2), and provide experimental results on the three most widely referenced datasets: MovieLens100K, MovieLens1M, and Netflix Challenge (Section 4).

  • 3.

    We show that: (1) our method outperforms state-of-the-art collaborative filtering methods in terms of lower MAE, using 1/3 to 1/2 the number of neighbors of traditional memory-based CFs on the MovieLens 100K, MovieLens 1M and Netflix datasets (Section 5.1); (2) memory-based CF with the proposed similarity measurement uses 1/2 to 1/39 of the wall time of state-of-the-art model-based CFs on the MovieLens 1M dataset (Section 5.2); and (3) our method can achieve 3% lower MAE and RMSE than traditional memory-based CFs for non-cold-start users on the MovieLens 100K dataset (Section 5.3).
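
For reference, the two accuracy measures named in the contributions follow their standard definitions: MAE averages the absolute errors, while RMSE takes the square root of the mean squared error and therefore penalizes large errors more heavily. A minimal sketch with made-up ratings follows; the function names `mae` and `rmse` are illustrative.

```python
from math import sqrt

def mae(predicted, actual):
    """Mean Absolute Error: average of |prediction - true rating|."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root Mean Squared Error: square root of the mean squared error."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                / len(actual))

# Made-up ratings for illustration only.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]
mae_value = mae(predicted, actual)    # errors 1, 0, 1, 2 -> mean 1.0
rmse_value = rmse(predicted, actual)  # sqrt((1 + 0 + 1 + 4) / 4) = sqrt(1.5)
```

The single error of 2 raises RMSE above MAE, which is why the two measures can rank methods differently.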

Section snippets

Background

As briefly mentioned in Section 1.1, memory-based collaborative filtering can be item-based or user-based. Both approaches involve two steps: similarity calculation and preference prediction. Without loss of generality, we take the item-based approach; the detailed steps are given in Section 2.2 (Item-based similarity computation) and Section 2.3 (Preference prediction). Before that, we define some terminology and notation in Section 2.1.

Proposed approach

We propose a new similarity measurement to quantify the similarity between items. Unlike traditional CFs that use rating-based similarity measurement, such as adjusted cosine, Pearson correlation coefficient, or structural similarity measurement alone, we propose to combine both measurements. Rating-based similarity measurements.

$S_{rating}(i_i, i_c)$ can tell us whether two items are positively or negatively correlated based on opinions of users $C_{ic}$ who have rated both items, while structural
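
To make the idea of combining a rating-based and a structural measurement concrete, here is a hedged sketch. The combination rule used by the paper is not shown in this excerpt, so everything below is an assumption for illustration: adjusted cosine stands in for the rating-based part, Jaccard overlap of the rating user sets stands in for the structural part, and the two are simply multiplied. The function names are hypothetical.

```python
from math import sqrt

def adjusted_cosine(ratings, a, b):
    """Rating-based: cosine of user-mean-centered ratings over co-rating users."""
    common = [u for u in ratings if a in ratings[u] and b in ratings[u]]
    means = {u: sum(ratings[u].values()) / len(ratings[u]) for u in common}
    num = sum((ratings[u][a] - means[u]) * (ratings[u][b] - means[u])
              for u in common)
    den_a = sqrt(sum((ratings[u][a] - means[u]) ** 2 for u in common))
    den_b = sqrt(sum((ratings[u][b] - means[u]) ** 2 for u in common))
    return num / (den_a * den_b) if den_a and den_b else 0.0

def jaccard(ratings, a, b):
    """Structural (assumed form): overlap of the user sets that rated each item."""
    users_a = {u for u in ratings if a in ratings[u]}
    users_b = {u for u in ratings if b in ratings[u]}
    union = users_a | users_b
    return len(users_a & users_b) / len(union) if union else 0.0

def combined(ratings, a, b):
    # ASSUMPTION: a plain product, chosen only for illustration; it is not
    # the combination rule defined in the paper.
    return adjusted_cosine(ratings, a, b) * jaccard(ratings, a, b)

# Toy data (illustrative only): user -> {item: rating}
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "C": 2, "D": 5},
    "carol": {"A": 1, "B": 2, "C": 5, "D": 1},
    "dave": {"A": 5, "B": 5, "D": 4},
}
score = combined(ratings, "A", "D")
```

Under this assumed product, a pair of items with few co-rating users relative to their total audiences is down-weighted even when the ratings of those few users correlate strongly.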

Data sets

For comparison purposes, the MovieLens and Netflix datasets are used as they are the most widely referenced in literature (see Table 3).

The distribution of ratings is shown in Fig. 2. All three datasets have a similarly shaped distribution: each is skewed toward higher ratings, with a peak at a rating of 4.

Combined similarity measurement is expected to work better with the degree distribution of items of those datasets shown in Fig. 3. Although MovieLens datasets are not as skewed as Netflix dataset,

Experimental results

Experiments on the MovieLens datasets (ml-100K and ml-1M) were conducted on an iMac with a 4 GHz Intel i7 CPU and 16 GB of 1600 MHz DDR3 memory, running OSX 10.13. Experiments on the Netflix dataset were conducted on computing clusters with two 12-core Intel Xeon Gold CPUs, 96 GB memory, 793.2 TeraFLOPS. Times were measured as wall time. Memory-based CFs are implemented in C++. Model-based CFs, implemented by the MyMediaLite library (Gantner et al., 2011), are written in C#. Hyperparameters of model-based CFs suggested by

Prediction accuracy

The proposed similarity measurement can improve the prediction accuracy of the traditional KNN method for the MovieLens and Netflix datasets in terms of lower MAE, as shown in Section 5.1. Since the number of co-rated users between each pair of items $|C_{ic}|$ and the degree distribution among items vary as shown in Fig. 3, the structural similarity $S_{struct}(i_i, i_c)$ varies among the K nearest neighbors as well. Thus, the MAE of CF using the proposed similarity measurement $S_{combined}(i_i, i_c)$ would be lower

Conclusion

In this paper, we proposed a general framework for combining similarity measurements in KNN memory-based collaborative filtering methods that improves prediction accuracy over state-of-the-art CF methods. This is accomplished without losing the advantage that memory-based CF methods require less memory and time than model-based CF methods. Furthermore, with the proposed similarity measurement, KNN memory-based CFs use fewer neighbors to predict compared to traditional KNN

CRediT authorship contribution statement

Mario Ventresca: Conceptualization, Methodology, Formal analysis, Writing - review & editing, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

Thanks to David Zage for valuable initial discussion.

References (60)

  • Adomavicius, G., et al. (2005). Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering.
  • Aggarwal, C.C. (2016). Recommender systems.
  • Bennett, J., Lanning, S., et al. (2007). The Netflix prize. In Proceedings of KDD cup and workshop (Vol. 2007, p. 35)....
  • Billsus, D., & Pazzani, M.J. (1998). Learning collaborative information filters. In ICML (Vol. 98, pp....
  • Blei, D.M., et al. (2003). Latent Dirichlet allocation. The Journal of Machine Learning Research.
  • Breese, J.S., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative...
  • Brynjolfsson, E., et al. (2003). Consumer surplus in the digital economy: Estimating the value of increased product variety at online booksellers. Management Science.
  • Burke, R. (2002). Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction.
  • Casey, M.A., et al. (2008). Content-based music information retrieval: Current directions and future challenges. Proceedings of the IEEE.
  • Chen, L., Wu, L., Hong, R., Zhang, K., & Wang, M. (2020). Revisiting graph based collaborative filtering: A linear...
  • Desrosiers, C., et al. A comprehensive survey of neighborhood-based recommendation methods.
  • Dias, M.B., Locher, D., Li, M., El-Deredy, W., & Lisboa, P.J. (2008). The value of personalised recommender systems to...
  • Feng, J., et al. (2018). An improved collaborative filtering method based on similarity. PloS One.
  • Gantner, Z., Rendle, S., Freudenthaler, C., & Schmidt-Thieme, L. (2011). MyMediaLite: A free recommender system library....
  • Goldberg, D., et al. (1992). Using collaborative filtering to weave an information tapestry. Communications of the ACM.
  • Hamilton, W., Ying, Z., & Leskovec, J. (2017). Inductive representation learning on large graphs. In Advances in neural...
  • Harper, F.M., et al. (2016). The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TIIS).
  • He, X., et al. Neural collaborative filtering.
  • Herlocker, J.L., Konstan, J.A., Borchers, A., & Riedl, J. (1999). An algorithmic framework for performing...
  • Hofmann, T. (1999). Probabilistic latent semantic indexing. In Proceedings of the 22nd annual international ACM SIGIR...