Abstract
Online product reviews are becoming increasingly important because they guide people’s purchase decisions. Because they are highly subjective, online reviews are vulnerable to opinion spamming, i.e., fraudsters write fake reviews or give unfair ratings to promote or demote target products. Although much effort has been devoted to this field, the problem remains open because ground-truth data are difficult to gather. As more and more people use the Internet in everyday life, group review spamming, in which a group of fraudsters writes hype reviews (to promote) or defaming reviews (to demote) for one or more target products, has become the main form of review spamming. In this paper, we propose an LDA-based computing framework, namely GSLDA, for group spamming detection in product review data. As a completely unsupervised approach, GSLDA works in two phases. It first adapts LDA (Latent Dirichlet Allocation) to the product review context in order to confine closely related group spammers to a small reviewer cluster, and then it extracts highly suspicious reviewer groups from each LDA cluster. Experiments on three real-world datasets show that GSLDA can detect high-quality spammer groups, outperforming many state-of-the-art baselines in terms of accuracy.
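The two-phase idea can be illustrated with a minimal sketch. It is not the GSLDA algorithm itself: the abstract does not specify how LDA is adapted or how groups are extracted, so the sketch assumes that each reviewer is treated as a "document" whose "words" are the products reviewed, and that a group is a connected set of reviewers in the same cluster who share at least min_shared products. The function names (lda_gibbs, detect_groups), the parameters, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def lda_gibbs(docs, n_topics, n_words, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Plain collapsed Gibbs sampling for LDA; returns per-document topic mixtures."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional topic weights given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


def detect_groups(reviews, n_topics=2, min_shared=2, seed=0):
    """Phase 1: cluster reviewers with LDA. Phase 2: extract co-reviewing groups."""
    reviewers = sorted({r for r, _ in reviews})
    products = sorted({p for _, p in reviews})
    pid = {p: i for i, p in enumerate(products)}
    reviewed = defaultdict(set)
    for r, p in reviews:
        reviewed[r].add(p)
    # Assumption: reviewer = document, reviewed product = word.
    docs = [[pid[p] for p in sorted(reviewed[r])] for r in reviewers]
    theta = lda_gibbs(docs, n_topics, len(products), seed=seed)
    clusters = defaultdict(list)
    for r, dist in zip(reviewers, theta):
        clusters[max(range(n_topics), key=dist.__getitem__)].append(r)
    groups = []
    for members in clusters.values():
        # Link reviewers in the same cluster who share enough products.
        adj = {r: set() for r in members}
        for a, b in combinations(members, 2):
            if len(reviewed[a] & reviewed[b]) >= min_shared:
                adj[a].add(b)
                adj[b].add(a)
        seen = set()
        for r in members:
            if r in seen or not adj[r]:
                continue
            stack, comp = [r], set()
            while stack:  # depth-first search for one connected component
                u = stack.pop()
                if u not in comp:
                    comp.add(u)
                    stack.extend(adj[u] - comp)
            seen |= comp
            groups.append(sorted(comp))
    return dict(clusters), groups


# Toy (reviewer, product) pairs, invented for illustration only.
reviews = [
    ("u1", "p1"), ("u1", "p2"), ("u1", "p3"),
    ("u2", "p1"), ("u2", "p2"), ("u2", "p3"),
    ("u3", "p1"), ("u3", "p2"),
    ("u4", "p7"), ("u5", "p8"), ("u6", "p9"), ("u6", "p7"),
]
clusters, groups = detect_groups(reviews, n_topics=2, min_shared=2, seed=0)
print(clusters)
print(groups)
```

Restricting the pairwise comparison to reviewers inside one LDA cluster keeps the second phase cheap, which is the role the abstract assigns to the clustering step. A real detector would also score candidate groups using rating and time information before reporting them.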
Cite this article
Wang, Z., Gu, S. & Xu, X. GSLDA: LDA-based group spamming detection in product reviews. Appl Intell 48, 3094–3107 (2018). https://doi.org/10.1007/s10489-018-1142-1
DOI: https://doi.org/10.1007/s10489-018-1142-1