Elsevier

Knowledge-Based Systems

Volume 111, 1 November 2016, Pages 144-158

Estimating user behavior toward detecting anomalous ratings in rating systems

https://doi.org/10.1016/j.knosys.2016.08.011

Abstract

Online rating systems play a crucial role in collaborative filtering recommender systems (CFRSs). However, CFRSs are highly vulnerable to “shilling” attacks in practice. Quickly and effectively spotting and removing anomalous ratings before recommendation remains a major challenge. In this paper, we propose an unsupervised method to detect such attacks, which consists of three stages. First, an undirected user-user graph is constructed from the original user profiles. Based on this graph, a graph mining method is employed to estimate the similarity between vertices and create a reduced graph. Then, similarity analysis is used to distinguish between the vertices in order to rule out a portion of the genuine users. Finally, the remaining genuine users are filtered out by analyzing target items, so that the attackers can be detected. Extensive experiments on the MovieLens datasets demonstrate the effectiveness of the proposed method compared with benchmark methods.

Introduction

Personalized collaborative filtering recommender systems (CFRSs) have become increasingly popular in well-known e-commerce services such as Amazon and eBay [6], [8], [24], [29], [51], [60], [62]. Customers generate abundant rating records on products or services. However, CFRSs are highly vulnerable to “profile injection” attacks (a.k.a. “shilling” attacks) [3], [9], [12], [19], [21], [25], [35], [43], [52], [55], [62]. Attackers commonly contaminate recommender systems with malicious ratings [14], [15]. They either demote a target item with the lowest rating (called a nuke attack) or promote a target item with the highest rating (called a push attack) in order to achieve their attack intentions or degrade the quality of recommendation [6], [22], [31], [32], [33], [58]. Thus, developing an effective method to detect and remove attackers before recommendation is crucial.

Detection methods for these attacks have received much attention. Since the similarities between attackers are higher than those between genuine users, some methods are based on calculating the similarity between users [29], [41], [56]. Traditional similarity metrics, including the Pearson Correlation Coefficient (PCC) and Cosine Similarity, can capture the attackers to some extent. However, the detection performance of these methods relies heavily on the similarity calculation. Reducing the time spent calculating similarity is also a hard problem, especially on large-scale datasets. Furthermore, some attackers mimic the rating details of genuine users to appear more credible, so similarity alone cannot fully discriminate attackers from genuine users. To address these challenges, a more effective detection method should satisfy the following requirements:

  • (a) Both the computation time and the detection performance should be acceptable;

  • (b) It should be effective in defending against different kinds of “shilling” attacks.
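The two traditional similarity metrics named above, PCC and Cosine Similarity, can be sketched as follows. This is a minimal illustration and not the paper's implementation: the profile representation (a dict mapping item ids to ratings), the function names, and the choice to compute both metrics over co-rated items only are assumptions made for the example.

```python
import math


def co_rated(u, v):
    """Items rated by both users; a profile is a dict of item -> rating (assumed layout)."""
    return sorted(set(u) & set(v))


def cosine_similarity(u, v):
    """Cosine similarity over co-rated items; 0.0 when it is undefined."""
    items = co_rated(u, v)
    if not items:
        return 0.0
    dot = sum(u[i] * v[i] for i in items)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in items))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in items))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson_similarity(u, v):
    """Pearson Correlation Coefficient over co-rated items; 0.0 when it is undefined."""
    items = co_rated(u, v)
    if len(items) < 2:
        return 0.0
    mean_u = sum(u[i] for i in items) / len(items)
    mean_v = sum(v[i] for i in items) / len(items)
    cov = sum((u[i] - mean_u) * (v[i] - mean_v) for i in items)
    var_u = sum((u[i] - mean_u) ** 2 for i in items)
    var_v = sum((v[i] - mean_v) ** 2 for i in items)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)
```

Computing either metric for every pair of users requires a number of comparisons that grows quadratically with the number of users, which is one way to see why the time spent on similarity calculation becomes a concern on large-scale datasets.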

In this paper, we propose an unsupervised detection method to spot such attacks, which consists of three stages. The goal of the proposed method is to filter out progressively more genuine users while keeping all attackers at each step. To explore the similarity pattern of users from a new perspective, a novel graph mining algorithm [57] is employed to distinguish attackers from genuine users. First, an undirected user-user graph is constructed from the original user profiles: an edge between two vertices (users) is created when the number of items co-rated by the two users is greater than an empirical threshold t. During graph construction, a portion of the genuine users can be filtered out while retaining as many attackers as possible. Second, based on the constructed graph, a fast and effective graph mining approach is used to calculate the distance between vertices (users). Because attackers and genuine users have different similarity patterns, a few more genuine users can be filtered out by applying a threshold to the calculated distances. Since the co-rated items used in the first two stages serve only to generate the graph and calculate the distances between users, they are not enough to fully represent the difference between attack and genuine profiles. In reality, the effect of an attack is determined by both the items and the rating styles. Accordingly, in the third stage, target items with special ratings (i.e., the maximum and the minimum ratings) are analyzed to capture the attackers and to filter out the remaining genuine users from the result of the second stage. Finally, extensive experiments on the MovieLens datasets demonstrate the effectiveness of the proposed method compared with benchmark methods. A series of experiments on 14 different attacks also verifies the detection performance of the proposed method.
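The edge rule of the first stage (connect two users when their number of co-rated items is greater than the threshold t) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the profile layout (user id mapped to a dict of item id to rating), the function name `build_user_graph`, and the adjacency-set representation are all hypothetical choices made for the example. The later stages (distance calculation and target item analysis) are not shown.

```python
from itertools import combinations


def build_user_graph(profiles, t):
    """Build an undirected user-user graph as adjacency sets.

    profiles: dict of user id -> dict of item id -> rating (assumed layout).
    t: co-rated threshold; an edge u-v exists iff the users share more than t items.
    """
    graph = {user: set() for user in profiles}
    for u, v in combinations(profiles, 2):
        shared = profiles[u].keys() & profiles[v].keys()
        if len(shared) > t:
            graph[u].add(v)
            graph[v].add(u)
    return graph


# Toy profiles, invented purely for illustration.
sample_profiles = {
    "u1": {"i1": 5, "i2": 3, "i3": 4},
    "u2": {"i1": 5, "i2": 1, "i3": 2, "i4": 5},
    "u3": {"i4": 2, "i5": 3},
}
sample_graph = build_user_graph(sample_profiles, t=2)
```

With t = 2, only u1 and u2 are connected, because they share three items; u3 is left without edges. The sketch compares every pair of users, so it is meant to show the rule rather than to be efficient on large datasets.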

The rest of the paper is organized as follows. Section 2 introduces related work. Section 3 presents the background of “shilling” attacks based on collaborative filtering recommendation. In Section 4, we detail the proposed method. In Section 5, experimental results are reported and analyzed. Finally, we conclude the paper with a brief summary and discuss future work.


Related work

Discovering “shilling” attackers hidden in recommender systems is crucial for enhancing the quality and robustness of recommendation. A number of detection methods have been proposed so far, and they exhibit complementary advantages and disadvantages against various types of attacks. In this section, we discuss only the methods related to the present work, in two groups: supervised detection methods and unsupervised detection methods.

For the unsupervised detection methods, the difference between

Background

In this section, the background of collaborative filtering recommendation is first introduced. Then, the structure of attack profiles is detailed.

The proposed approach

In this section, the details of the proposed method are introduced in three stages: constructing the graph, calculating the similarity between vertices in the graph, and analyzing target items, as shown in Fig. 1. In the first stage, an undirected user-user graph is constructed from the original user profiles. In the second stage, the similarity (or distance) between vertices is calculated by utilizing a graph mining method, which is used to distinguish

Experiments and analysis

In this section, the experimental settings are detailed first. Extensive experiments on the MovieLens datasets are then conducted to examine the effectiveness of the proposed method, including a comparison of detection performance with benchmark methods, detection results on diverse attacks, and results on two different datasets. In addition, the computation time and time complexity of the presented methods and the parameter sensitivity analysis are briefly discussed. Additional discussions are

Conclusion and future work

“Shilling” attacks are the main threats in collaborative filtering recommender systems. These attack profiles are likely to have rating details similar to those of a large number of genuine profiles, which makes them hard to detect. In this paper, we proposed an unsupervised detection method for spotting the attacks (or anomalous ratings), which consists of three stages. First, a user-user graph is constructed, which exploits the co-rated items rated by two users to create an

Acknowledgment

The research is supported by NSFC (61175039 and 61221063), 863 High Tech Development Plan (2012AA011003), Research Fund for Doctoral Program of Higher Education of China (20090201120032), International Research Collaboration Project of Shaanxi Province (2013KW11) and Fundamental Research Funds for Central Universities (2012jdhz08). Three anonymous reviewers carefully read this paper and provided us with numerous constructive suggestions. As a result, the overall quality of the paper has

References (66)

  • M. Salehi et al.

    Hybrid recommendation approach for learning material based on sequential pattern of the accessed material and the learner’s preference tree

    Knowl.-Based Syst.

    (2013)
  • Y. Wang et al.

    A comparative study of shilling attack detectors for recommender systems

    The 12th International Conference on Service Systems and Service Management (ICSSSM)

    (2015)
  • X. Wen et al.

    A rapid learning algorithm for vehicle classification

    Inf. Sci.

    (2015)
  • H. Xia et al.

    A novel item anomaly detection approach against shilling attacks in collaborative recommendation systems using the dynamic time interval segmentation technique

    Inf. Sci.

    (2015)
  • H. Xie et al.

    Community-aware user profile enrichment in folksonomy

    Neural Netw.

    (2014)
  • F. Zhang et al.

    HHT-SVM: An online method for detecting profile injection attacks in collaborative recommender systems

    Knowl.-Based Syst.

    (2014)
  • Z. Zhang et al.

    Graph-based detection of shilling attacks in recommender systems

    IEEE International Workshop on Machine Learning for Signal Processing

    (2013)
  • Z. Zhang et al.

    Detection of shilling attacks in recommender systems via spectral clustering

    International Conference on Information Fusion

    (2014)
  • Z. Zhang et al.

    A hybrid fuzzy-based personalized recommender system for telecom products/services

    Inf. Sci.

    (2013)
  • G. Adomavicius et al.

    Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions

    IEEE Trans. Knowl. Data Eng.

    (2005)
  • K. Bryan et al.

    Unsupervised retrieval of attack profiles in collaborative recommender systems

    ACM Conference on Recommender Systems

    (2008)
  • A. Buja et al.

    Data visualization with multidimensional scaling

    J. Comput. Graphical Stat.

    (2008)
  • R. Burke et al.

    Classification features for attack detection in collaborative recommender systems

    International Conference on Knowledge Discovery and Data Mining

    (2006)
  • J. Cao et al.

    Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system

    World Wide Web

    (2013)
  • P. Chirita et al.

    Preventing shilling attacks in online recommender systems

    In Proceedings of the 7th Annual ACM International Workshop on Web Information and Data Management (WIDM’05)

    (2005)
  • Q. Du et al.

    Folksonomy-based personalized search by hybrid user profiles in multiple levels

    Neurocomputing

    (2016)
  • N. Günnemann et al.

    Robust multivariate autoregression for anomaly detection in dynamic product ratings

    Proceedings of the 23rd International Conference on World Wide Web

    (2014)
  • N. Günnemann et al.

    Detecting anomalies in dynamic rating data: A robust probabilistic model for rating evolution

    KDD’2014

    (2014)
  • B. Gu et al.

    Incremental support vector learning for ordinal regression

    IEEE Trans. Neural Netw. Learn. Syst.

    (2015)
  • I. Gunes et al.

    Shilling attacks against recommender systems: a comprehensive survey

    Artif. Intell. Rev.

    (2013)
  • F. He et al.

    Attack detection by rough set theory in recommendation system

    IEEE International Conference on Granular Computing

    (2010)
  • A. Hernando et al.

    A non negative matrix factorization for collaborative filtering recommender systems based on a bayesian probabilistic model

    Knowl.-Based Syst.

    (2016)
  • S. Huang et al.

    A hybrid decision approach to detect profile injection attacks in collaborative recommender systems

    Found. Intell. Syst.

    (2012)