Elsevier

Expert Systems with Applications

Volume 42, Issue 22, 1 December 2015, Pages 8791-8804

ROUND: Walking on an object–user heterogeneous network for personalized recommendations

https://doi.org/10.1016/j.eswa.2015.07.032

Highlights

  • Walking on the heterogeneous network outperforms walking on a single network.

  • The constructed network removes weak relationships that have an adverse influence.

  • The proposed method achieves significant improvements over existing methods.

  • The method is ready to be used on historical data or other accessible information.

Abstract

The rapid growth of the world-wide-web has challenged the information sciences to screen useful information effectively from a vast amount of online resources. Although recent studies have suggested that recommendation approaches relying on the concept of complex networks usually exhibit excellent performance, a unified framework to guide the design of a recommender system from the viewpoint of network inference is still lacking. Besides, two critical questions for a network-based approach, the quality of the object–user network and the measure of the strength of association between an object node and a user node in such a network, have not yet been systematically explored in existing studies. Aiming to answer these questions, we introduce a general framework for network-based top-N recommendation and propose a novel method named ROUND that integrates (i) relationships among objects, (ii) relationships among users, and (iii) relationships between objects and users, in a single network model. We adopt a k-nearest neighbor strategy to filter out unreliable connections in the network, and we use a random walk with restart model to characterize the strength of associations between object nodes and user nodes, thereby making significant progress in addressing the critical questions in network-based recommendation. We demonstrate the effectiveness of our method via large-scale cross-validation experiments across two real datasets (MovieLens and Netflix) and show the superiority of our method over state-of-the-art approaches such as non-negative matrix factorization and singular value decomposition in terms of not only recommendation accuracy and diversity but also retrieval performance.

Introduction

Information overload accompanying the rapid growth of the world-wide-web has challenged both business applications and academic studies to extract useful information effectively from a vast amount of online resources (Huang, Chung, & Chen, 2004). To alleviate this problem, search engines have been designed to help people screen items that they are interested in according to keyword-based queries. However, this approach, acting in a passive manner, typically overlooks historical data that characterize the preferences of the user, and thus results in information loss and unsatisfactory performance (Al-Masri and Mahmoud, 2008, Brin and Page, 1998). To overcome these limitations and improve user experience, recommender systems have been proposed to provide personalized screening of useful resources in an active way (Jeong, Lee, & Cho, 2010), leading to a variety of successful applications such as the online recommendation of movies (Nie, Xia, & Li, 2009), bookmarks (Bogers & van den Bosch, 2011), news (Prawesh & Padmanabhan, 2012), tourism (Gavalas, Konstantopoulos, Mastakas, & Pantziou, 2014), taxis (Hwang, Hsueh, & Chen, 2015) and many other resources (Adomavicius & Tuzhilin, 2005).

A recommender system is usually designed according to the collaborative filtering principle (Barragans-Martinez et al., 2010, Biau, Cadre and Rouvière, 2010, Deshpande and Karypis, 2004, Moreno et al., 2014, Sarwar, Karypis, Konstan and Reidl, 2001, Shi, Larson and Hanjalic, 2014). Specifically, a user-based scheme assumes that users who shared common preferences in the past will also have similar tastes in the future. Therefore, such an approach uses historical data to characterize similarities between users, then calculates discriminant scores for candidate objects accordingly (Biau, Cadre, & Rouvière, 2010). As its counterpart, an item-based design is formally equivalent to a user-based one, obtained by simply exchanging the roles of users and objects (Deshpande and Karypis, 2004, Sarwar, Karypis, Konstan and Reidl, 2001). When object properties are available, a content-based design can characterize similarities between objects according to their properties and then make recommendations accordingly (Barragans-Martinez et al., 2010). To combine the respective advantages of historical data and object properties, hybrid approaches have also been proposed (Moreno et al., 2014). Recently, model-based approaches have achieved great success by exploring latent relations between users and objects, using such techniques as non-negative matrix factorization (NMF) (Kabbur and Karypis, 2014, Koren, Bell and Volinsky, 2009) and singular value decomposition (SVD) (Paterek, 2007).

In recent years, recommendation approaches relying on the concept of complex networks have been widely adopted (Adomavicius and Tuzhilin, 2005, Chiang, Liou and Wang, 2013, Fouss, Luh, Pirotte and Saerens, 2006, Li et al., 2014, Medo, 2013). These methods often exhibit excellent performance when compared with traditional collaborative filtering approaches. For example, it has been shown that the simulation of a resource-allocation process on an object–user bipartite network (ProbS) improves the recommendation accuracy (Zhou, Ren, Medo, & Zhang, 2007), while the simulation of a heat-spreading process (HeatS) enhances the recommendation diversity (Zhou et al., 2010). More recently, social relationships between users have also been incorporated into object–user networks, enabling the adaptation of sophisticated random walk models for recommendations of academic resources and urban points of interest (Chen et al., 2015, Tian and Jing, 2013, Ying, Kuo, Tseng and Lu, 2014). Nevertheless, these methods share the following weaknesses. First, none of them formulates the recommendation problem using a unified framework from the viewpoint of network inference. Consequently, a variety of ad hoc strategies have been adopted for constructing object–user networks, making the design of a network-based recommendation approach more of an art than a science. Second, as we have pointed out previously, the existence of popular objects may introduce unreliable connections in an object–user network (Gan and Jiang, 2013a, Zhang, Zeng and Shang, 2013). However, none of the existing network-based methods takes this into consideration. Third, although efforts have been made to incorporate relationships among users along with bipartite relationships between objects and users, an approach that adopts a single model to simultaneously integrate (i) relationships among objects, (ii) relationships among users, and (iii) relationships between objects and users is still lacking.

With these observations in mind, we first introduce a general framework for top-N recommendation from the viewpoint of network inference and interpret several existing methods under this framework. Then, we propose a novel method named ROUND (Random walk with restart on an Object–User Network towards personalized recommenDations) to overcome the weaknesses of existing network-based approaches. Our method constructs an object–user heterogeneous network from historical data and then uses a random walk with restart model to characterize the strength of associations between users and objects, thereby providing a means of personalized recommendation. We validate our approach via large-scale 10-fold cross-validation experiments across two real datasets (MovieLens and Netflix), and we show the superiority of our method over state-of-the-art methods such as NMF, SVD and ProbS. The main contributions of our paper are summarized as follows.

  1. We propose a general framework for top-N recommendation from the viewpoint of network inference. This framework not only improves the understanding of existing recommendation methods, but also opens the door to borrowing abundant theories and techniques from network analysis for the development of new recommender systems.

  2. We propose to integrate (i) relationships among objects, (ii) relationships among users, and (iii) relationships between objects and users in a single object–user network. This work is the first to explore a recommendation method with this heterogeneous network formulation.

  3. We propose a k-nearest neighbor strategy that filters out unreliable connections when constructing the object–user network by removing weak similarities between objects and those between users. Hence, it significantly improves the quality of the resulting heterogeneous network.

  4. We propose a random walk model for characterizing the strength of associations between objects and users in the constructed heterogeneous network. This model provides more accurate estimates for unknown object–user relationships than widely used measures such as the shortest-path method.
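To make the random walk with restart idea concrete, the following is a minimal sketch in plain Python. It uses the textbook iteration p ← (1 − r)·W·p + r·p₀ on a small weighted network. The restart probability, the edge weights, the handling of nodes without neighbours, and the toy network itself are all illustrative assumptions; this excerpt does not give the exact transition matrix or parameters used by ROUND.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until p stops changing.

    adj  : dict mapping node -> {neighbour: non-negative weight}
    seed : node the walker restarts from (here, the query user)
    """
    nodes = sorted(adj)
    total_weight = {u: sum(adj[u].values()) for u in nodes}
    p0 = {u: 1.0 if u == seed else 0.0 for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            if total_weight[u] == 0:
                # Assumption: a node without neighbours returns its mass to the seed.
                nxt[seed] += (1 - restart) * p[u]
                continue
            # The walker at u moves to v with probability w(u, v) / total weight at u.
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / total_weight[u]
        diff = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if diff < tol:
            break
    return p


def add_edge(adj, a, b, w):
    """Insert an undirected weighted edge into the adjacency dict."""
    adj.setdefault(a, {})[b] = w
    adj.setdefault(b, {})[a] = w


# Hypothetical toy network: users u1, u2 and objects o1, o2, o3.
network = {}
add_edge(network, "u1", "o1", 1.0)  # object-user links (historical data)
add_edge(network, "u2", "o1", 1.0)
add_edge(network, "u2", "o2", 1.0)
add_edge(network, "u1", "u2", 0.5)  # user-user similarity
add_edge(network, "o1", "o2", 0.5)  # object-object similarities
add_edge(network, "o2", "o3", 0.5)

rwr_scores = random_walk_with_restart(network, seed="u1")
# Rank the objects that u1 has not yet selected by their steady-state score.
ranking = sorted(
    (n for n in rwr_scores if n.startswith("o") and n not in network["u1"]),
    key=lambda n: -rwr_scores[n],
)
```

In this toy network, o2 is reachable from u1 through both a similar user and a similar object, whereas o3 is reachable only through o2, so o2 ranks above o3 for the query user u1.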

Section snippets

A general framework of network-based top-N recommendation

In general, there are two branches in studies of recommender systems: ranking and rating prediction. In our work, we address the ranking problem, known as top-N recommendation. Specifically, a method targeting top-N recommendation takes a query user and a set of candidate objects as input and produces a ranked list of the candidate objects as output. To accomplish this, a method based on historical data resorts to known associations between objects and users to calculate prediction
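The input/output contract of top-N recommendation described in this snippet can be sketched in a few lines of Python. The score values, the object names, and the tie-breaking rule below are hypothetical; how the scores are actually computed depends on the method in question.

```python
def top_n(scores, candidates, n):
    """Rank candidate objects by descending score and keep the first n.

    Ties are broken alphabetically so that the output is deterministic
    (an assumption made for this sketch only).
    """
    ranked = sorted(candidates, key=lambda obj: (-scores.get(obj, 0.0), obj))
    return ranked[:n]


# Hypothetical prediction scores for one query user.
predicted = {"o1": 0.12, "o2": 0.40, "o3": 0.05, "o4": 0.40}
already_selected = {"o1"}  # objects the user is already associated with
candidate_objects = [o for o in predicted if o not in already_selected]
recommended = top_n(predicted, candidate_objects, n=2)
```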

Overview of ROUND

The proposed method ROUND was designed based on the understanding that relationships between objects and those between users would contribute together to the recommendation of candidate objects to a query user. Briefly, this approach is composed of three sequential steps, as illustrated in Fig. 1. First, in the network construction step, we rely on historical data to construct an object–user network, which is composed of an object layer, a user layer, and interconnections between the two
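The k-nearest neighbor filtering mentioned in the abstract and contributions can be illustrated with a short Python sketch for one layer of the network (here, the user layer). Cosine similarity over binary selection vectors is an assumption made for illustration; this excerpt does not state which similarity measure ROUND uses, nor the value of k.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_edges(vectors, k):
    """For every node, keep only links to its k most similar neighbours."""
    edges = {}
    for u in vectors:
        sims = [(cosine(vectors[u], vectors[v]), v) for v in vectors if v != u]
        sims = [(s, v) for s, v in sims if s > 0]   # drop unrelated nodes
        sims.sort(key=lambda sv: (-sv[0], sv[1]))    # strongest first
        edges[u] = {v: s for s, v in sims[:k]}       # weaker links are removed
    return edges


# Hypothetical history: which objects each user has selected.
user_vectors = {
    "u1": {"o1": 1, "o2": 1, "o3": 1},
    "u2": {"o1": 1, "o2": 1},
    "u3": {"o3": 1},
    "u4": {"o4": 1},
}
user_edges = knn_edges(user_vectors, k=1)
```

With k = 1, u1 keeps only its link to u2 and drops the weaker link to u3, while u3 still points to u1. The resulting neighbour graph is therefore directed; whether to symmetrize it is a design choice that this excerpt does not specify. The same function could be applied to object vectors to build the object layer.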

Data sources

We used two large-scale datasets to validate the proposed approach. First, we downloaded from the GroupLens web site (http://www.grouplens.org) a dataset named MovieLens, which contains more than 10 million ratings given by 69,878 users for 10,677 movies. Each rating took one of 10 values, ranging from 0.5 (worst) to 5.0 (best) in steps of 0.5. We randomly down-sampled 5000 users from the original data and retained the 5903 movies rated by at least 5 of these users. Then, we followed the literature (Zhou et
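The down-sampling step described here (draw a random subset of users, then keep the movies rated by a minimum number of them) could be sketched as follows. The function name, the triple-based data layout, and the random seed are assumptions for illustration; the toy data stand in for the MovieLens ratings, which are not reproduced here.

```python
import random
from collections import defaultdict


def downsample(ratings, n_users, min_raters, seed=0):
    """Keep ratings from n_users random users and sufficiently rated movies.

    ratings : list of (user, movie, rating) triples
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    kept_users = set(rng.sample(users, n_users))
    raters = defaultdict(set)
    for u, m, _ in ratings:
        if u in kept_users:
            raters[m].add(u)
    # Retain movies rated by at least min_raters of the sampled users.
    kept_movies = {m for m, us in raters.items() if len(us) >= min_raters}
    return [(u, m, r) for u, m, r in ratings
            if u in kept_users and m in kept_movies]


# Hypothetical toy data: four users rate m1 and m2; only one user rates m3.
toy_ratings = [(u, m, 4.0) for u in "abcd" for m in ("m1", "m2")]
toy_ratings.append(("a", "m3", 2.5))
subset = downsample(toy_ratings, n_users=3, min_raters=2)
```

For the setting in the text, the corresponding call would use n_users=5000 and min_raters=5.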

Conclusions and discussion

We have proposed a random walk with restart approach, termed ROUND, on an object–user network towards personalized recommendation and analyzed its performance via 10-fold cross-validation experiments. Results show the superior performance of this method over existing state-of-the-art methods across two independent datasets of different sizes, in terms of not only recommendation accuracy and diversity but also retrieval performance.

The outstanding performance of our method can be attributed to its

Acknowledgment

This work was partly supported by the National Natural Science Foundation of China under Grants No. 71101010 and No. 71471016.

References (31)

  • Chen Z. et al.

    AVER: Random walk based academic venue recommendation

  • Chiang M. et al.

    Exploring heterogeneous information networks and random walk with restart for academic search

    Knowledge and Information Systems

    (2013)

  • Deshpande M. et al.

    Item-based top-N recommendation algorithms

    ACM Transactions on Information Systems

    (2004)

  • Fouss F. et al.

    An experimental investigation of graph kernels on a collaborative recommendation task

  • Huang Z. et al.

    A graph model for e-commerce recommender systems

    Journal of the American Society for Information Science and Technology

    (2004)