ROUND: Walking on an object–user heterogeneous network for personalized recommendations
Introduction
Information overload accompanying the rapid growth of the World Wide Web has challenged both business applications and academic studies to extract useful information effectively from a vast amount of online resources (Huang, Chung, & Chen, 2004). To alleviate this problem, search engines have been designed to help people screen for items of interest through keyword-based queries. However, this approach is passive: it typically overlooks the historical data that characterize the preferences of the user, and thus results in information loss and unsatisfactory performance (Al-Masri and Mahmoud, 2008, Brin and Page, 1998). To overcome these limitations and improve user experience, recommender systems have been proposed to provide personalized screening of useful resources in an active way (Jeong, Lee, & Cho, 2010), leading to a variety of successful applications such as the online recommendation of movies (Nie, Xia, & Li, 2009), bookmarks (Bogers & van den Bosch, 2011), news (Prawesh & Padmanabhan, 2012), tourism (Gavalas, Konstantopoulos, Mastakas, & Pantziou, 2014), taxis (Hwang, Hsueh, & Chen, 2015) and many other resources (Adomavicius & Tuzhilin, 2005).
A recommender system is usually designed according to the collaborative filtering principle (Barragans-Martinez et al., 2010, Biau, Cadre and Rouvière, 2010, Deshpande and Karypis, 2004, Moreno et al., 2014, Sarwar, Karypis, Konstan and Riedl, 2001, Shi, Larson and Hanjalic, 2014). In particular, a user-based scheme assumes that users who shared common preferences in the past will also have similar tastes in the future. Such an approach therefore uses historical data to characterize similarities between users, and then calculates discriminant scores for candidate objects accordingly (Biau, Cadre, & Rouvière, 2010). As its counterpart, an item-based design is formally equivalent to a user-based one, obtained by simply exchanging the roles of users and objects (Deshpande and Karypis, 2004, Sarwar, Karypis, Konstan and Riedl, 2001). When object properties are available, a content-based design can characterize similarities between objects according to their properties and then make recommendations accordingly (Barragans-Martinez et al., 2010). To combine the respective advantages of historical data and object properties, hybrid approaches have also been proposed (Moreno et al., 2014). Recently, model-based approaches have achieved great success by exploring latent relations behind users and objects, with approaches such as matrix factorization (NMF) (Kabbur and Karypis, 2014, Koren, Bell and Volinsky, 2009) and singular value decomposition (SVD) (Paterek, 2007).
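As an illustration of the user-based scheme described above, the following is a minimal sketch rather than any cited author's implementation. It assumes binary histories (a user either collected an object or did not) and cosine similarity between users; the function names `cosine` and `user_based_scores` and the toy data are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sets of collected objects (binary history)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def user_based_scores(history, query):
    """Rank objects the query user has not collected.

    The score of a candidate object is the sum of the similarities between
    the query user and every other user who collected that object.
    """
    seen = history[query]
    scores = {}
    for other, objs in history.items():
        if other == query:
            continue
        sim = cosine(seen, objs)
        if sim == 0.0:
            continue
        for obj in objs - seen:
            scores[obj] = scores.get(obj, 0.0) + sim
    # Highest score first; ties broken by object name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy history (assumed data): user -> set of collected objects.
history = {
    "alice": {"m1", "m2"},
    "bob": {"m1", "m2", "m3"},
    "carol": {"m2", "m4"},
}
ranking = user_based_scores(history, "alice")
```

Here "m3" is ranked above "m4" for "alice" because "bob", who collected "m3", shares more of her history than "carol" does. Swapping the roles of users and objects in `history` would give the item-based counterpart.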
In recent years, recommendation approaches relying on the concept of complex networks have been widely adopted (Adomavicius and Tuzhilin, 2005, Chiang, Liou and Wang, 2013, Fouss, Luh, Pirotte and Saerens, 2006, Li et al., 2014, Medo, 2013). These methods often exhibit excellent performance when compared with traditional collaborative filtering approaches. For example, it has been shown that the simulation of a resource-allocation process on an object–user bipartite network (ProbS) improves recommendation accuracy (Zhou, Ren, Medo, & Zhang, 2007), while the simulation of a heat-spreading process (HeatS) enhances recommendation diversity (Zhou et al., 2010). More recently, social relationships between users have also been incorporated into object–user networks, enabling the adaptation of sophisticated random walk models for recommendations of academic resources and urban points of interest (Chen et al., 2015, Tian and Jing, 2013, Ying, Kuo, Tseng and Lu, 2014). Nevertheless, these methods share the following weaknesses. First, none of them formulates the recommendation problem within a unified framework from the viewpoint of network inference. Consequently, a variety of ad hoc strategies have been adopted for constructing object–user networks, making the design of a network-based recommendation approach more of an art than a science. Second, as we have pointed out previously, the existence of popular objects may introduce unreliable connections in an object–user network (Gan and Jiang, 2013a, Zhang, Zeng and Shang, 2013). However, none of the existing network-based methods takes this into consideration.
Third, although efforts have been made to incorporate relationships among users along with bipartite relationships between objects and users, there is still no approach that adopts a single model to simultaneously integrate (i) relationships among objects, (ii) relationships among users, and (iii) relationships between objects and users.
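To make the resource-allocation process behind ProbS more concrete, the sketch below spreads resource in two steps on a toy object–user bipartite network. It is a simplified illustration under our own assumptions (unweighted links, one unit of initial resource per object collected by the query user), not a reproduction of the cited implementation; `probs_resource` and the toy data are hypothetical names.

```python
def probs_resource(history, query):
    """Two-step resource allocation on an object-user bipartite network.

    history: dict mapping each user to the set of objects they collected.
    Returns the final resource held by every object that received any.
    """
    # Invert the history: object -> set of users who collected it.
    collectors = {}
    for user, objs in history.items():
        for obj in objs:
            collectors.setdefault(obj, set()).add(user)

    # Step 1: every object collected by the query user holds one unit of
    # resource and splits it equally among the users who collected it.
    user_resource = {}
    for obj in history[query]:
        share = 1.0 / len(collectors[obj])
        for user in collectors[obj]:
            user_resource[user] = user_resource.get(user, 0.0) + share

    # Step 2: every user splits the received resource equally among the
    # objects they collected.
    object_resource = {}
    for user, amount in user_resource.items():
        share = amount / len(history[user])
        for obj in history[user]:
            object_resource[obj] = object_resource.get(obj, 0.0) + share
    return object_resource


# Toy history (assumed data): user -> set of collected objects.
history = {
    "alice": {"m1", "m2"},
    "bob": {"m1", "m2", "m3"},
    "carol": {"m2", "m4"},
}
resource = probs_resource(history, "alice")
# Recommend only objects the query user has not collected yet.
ranking = sorted(
    (obj for obj in resource if obj not in history["alice"]),
    key=lambda obj: (-resource[obj], obj),
)
```

The total amount of resource is conserved across both steps, so the final values can be read as a redistribution of the query user's initial two units over all reachable objects.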
With these understandings, we first introduce a general framework for top-N recommendation from the viewpoint of network inference and interpret several existing methods under this framework. Then, we propose a novel method named ROUND (Random walk with restart on an Object–User Network towards personalized recommenDations) to overcome the weaknesses of existing network-based approaches. Our method constructs an object–user heterogeneous network from historical data and then resorts to a random walk with restart model to characterize the strength of associations between users and objects, thereby providing a means of personalized recommendation. We validate our approach via large-scale 10-fold cross-validation experiments on two real datasets (MovieLens and Netflix), and we show the superiority of our method over state-of-the-art methods such as NMF, SVD and ProbS. The main contributions of our paper are summarized as follows.
1. We propose a general framework for top-N recommendation from the viewpoint of network inference. This framework not only improves the understanding of existing recommendation methods, but also opens the door to borrowing abundant theories and techniques from network analysis for the development of new recommender systems.
2. We propose to integrate (i) relationships between objects, (ii) relationships between users, and (iii) relationships between objects and users in a single object–user network. This work is the first to explore a recommendation method with this heterogeneous network formulation.
3. We propose a k-nearest neighbor strategy for filtering out unreliable connections when constructing the object–user network, which effectively removes weak similarities between objects and those between users. Hence, it significantly improves the quality of the resulting heterogeneous network.
4. We propose a random walk model for characterizing the strength of associations between objects and users in the constructed heterogeneous network. This model provides more accurate estimates of unknown object–user relationships than widely used measures such as the shortest path.
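The last three contributions can be illustrated together with a small sketch: a k-nearest neighbor filter on each similarity layer, a heterogeneous network that joins the two layers with object–user links, and a random walk with restart seeded at the query user. This is a hedged sketch, not the construction used in the paper, whose details are not given in this section. Cosine similarity, symmetrised neighbor links, unit weights on object–user links, the restart probability of 0.5, and all function names are assumptions made for illustration only.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sets (assumed similarity measure)."""
    return len(a & b) / sqrt(len(a) * len(b)) if a and b else 0.0


def knn_edges(profiles, k):
    """Keep, for each node, only links to its k most similar peers."""
    edges = {}
    for node, prof in profiles.items():
        sims = [(cosine(prof, other_prof), other)
                for other, other_prof in profiles.items() if other != node]
        top = sorted((s for s in sims if s[0] > 0.0),
                     key=lambda s: (-s[0], s[1]))[:k]
        for sim, other in top:
            edges[(node, other)] = sim  # symmetrised (an assumption)
            edges[(other, node)] = sim
    return edges


def build_network(history, k):
    """Weighted edges of a two-layer object-user network.

    Assumes user and object identifiers do not collide.
    """
    collectors = {}
    for user, objs in history.items():
        for obj in objs:
            collectors.setdefault(obj, set()).add(user)
    weights = {}
    weights.update(knn_edges(history, k))      # user layer
    weights.update(knn_edges(collectors, k))   # object layer
    for user, objs in history.items():         # interconnections
        for obj in objs:
            weights[(user, obj)] = 1.0
            weights[(obj, user)] = 1.0
    return weights


def random_walk_with_restart(weights, seed, restart=0.5, tol=1e-10,
                             max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W^T p until convergence."""
    out_total = {}
    for (src, _dst), w in weights.items():
        out_total[src] = out_total.get(src, 0.0) + w
    prob = {node: 0.0 for node in out_total}
    prob[seed] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in out_total}
        nxt[seed] = restart
        for (src, dst), w in weights.items():
            nxt[dst] += (1.0 - restart) * prob[src] * w / out_total[src]
        change = sum(abs(nxt[n] - prob[n]) for n in out_total)
        prob = nxt
        if change < tol:
            break
    return prob


# Toy history (assumed data): user -> set of collected objects.
history = {
    "alice": {"m1", "m2"},
    "bob": {"m1", "m2", "m3"},
    "carol": {"m2", "m4"},
}
weights = build_network(history, k=1)
prob = random_walk_with_restart(weights, "alice")
objects = set().union(*history.values())
ranking = sorted(objects - history["alice"], key=lambda o: (-prob[o], o))
```

With k=1, the weak similarity between "bob" and "carol" is dropped from the user layer, which is the kind of unreliable connection the filter is meant to remove. The steady-state probabilities of the uncollected objects then serve as ranking scores for the query user.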
Section snippets
A general framework of network-based top-N recommendation
In general, there are two branches in studies of recommender systems: ranking and rating prediction. In our work, we address the ranking problem, known as top-N recommendation. Specifically, a method targeting top-N recommendation takes a query user and a set of candidate objects as input and produces a ranked list of the candidate objects as output. To accomplish this, a method based on historical data resorts to known associations between objects and users to calculate prediction
Overview of ROUND
The proposed method ROUND was designed based on the understanding that relationships between objects and those between users would contribute together to the recommendation of candidate objects to a query user. Briefly, this approach is composed of three sequential steps, as illustrated in Fig. 1. First, in the network construction step, we rely on historical data to construct an object–user network, which is composed of an object layer, a user layer, and interconnections between the two
Data sources
We used two large-scale datasets to validate the proposed approach. First, we downloaded from the GroupLens web site (http://www.grouplens.org) a dataset named MovieLens, which contains more than 10 million ratings given by 69,878 users for 10,677 movies. Each rating took one of 10 values, ranging from 0.5 (worst) to 5.0 (best) in steps of 0.5. We down-sampled at random 5000 users from the original data and retained 5903 movies rated by at least 5 of these users. Then, we followed the literature (Zhou et
Conclusions and discussion
We have proposed a random walk with restart approach, termed ROUND, on an object–user network towards personalized recommendation and analyzed its performance via 10-fold cross-validation experiments. Results show the superior performance of this method over existing state-of-the-art methods across two independent datasets of different sizes, not only in recommendation accuracy and diversity but also in retrieval performance.
The outstanding performance of our method can be attributed to its
Acknowledgment
This work was partly supported by the National Natural Science Foundation of China under Grants No. 71101010 and No. 71471016.
References (31)
- et al. A hybrid content-based and item-based collaborative filtering approach to recommend TV programs enhanced with singular value decomposition. Information Sciences (2010).
- et al. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems (1998).
- et al. Constructing a user similarity network to remove adverse influence of popular objects for personalized recommendation. Expert Systems with Applications (2013).
- et al. Mobile recommender systems in tourism. Journal of Network and Computer Applications (2014).
- et al. An effective taxi recommender system based on a spatio-temporal factor analysis model. Information Sciences (2015).
- et al. Improving memory-based collaborative filtering via similarity updating and prediction modulation. Information Sciences (2010).
- et al. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering (2005).
- et al. Investigating web services on the world wide web.
- et al. Statistical analysis of k-nearest neighbor collaborative recommendation. The Annals of Statistics (2010).
- et al. Fusing recommendations for social bookmarking web sites. International Journal of Electronic Commerce (2011).