Abstract
Untruthful opinions that ballot-stuff or bad-mouth online commodities are hard to identify because of two prominent obstacles: the lack of ground-truth annotations and the difficulty of uncovering deceptive sentiment within review contexts. To address these challenges, and inspired by a recent algorithm called Learning with Label Noise, we first recruit volunteers to write annotated reviews and then label additional unannotated public reviews using a neighborhood graph. Furthermore, based on statistical analysis, we introduce a Sentiment-Distribution-Oriented Clustering (SDOC) method to ferret out review spam, in which product usage aspects and their sentiment polarities are highlighted. Evaluations and comparisons with several state-of-the-art approaches indicate that SDOC is effective and outperforms them with statistical significance. We also arrive at an interesting conclusion: genuine reviewers’ feelings tend to fluctuate across different product aspects, whereas spammers consistently express uniform sentiments across aspects.
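The closing observation of the abstract (fluctuating sentiment for genuine reviewers, uniform sentiment for spammers) can be illustrated with a minimal sketch. This is not the SDOC method itself, which clusters reviews by their sentiment distributions. It only measures how much a review's sentiment varies across aspects. The function names, the score range of [-1, 1], the aspect names, and the threshold of 0.2 are all assumptions made for illustration.

```python
from statistics import pstdev


def sentiment_spread(aspect_scores):
    """Population standard deviation of per-aspect sentiment scores.

    `aspect_scores` maps an aspect name to a score, assumed here to lie
    in [-1, 1]. A review covering fewer than two aspects has no spread.
    """
    values = list(aspect_scores.values())
    if len(values) < 2:
        return 0.0
    return pstdev(values)


def flag_uniform_reviews(reviews, threshold=0.2):
    """Return ids of reviews whose sentiment spread is below `threshold`.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    return [
        review_id
        for review_id, scores in reviews.items()
        if sentiment_spread(scores) < threshold
    ]


# Hypothetical toy data: r1 mixes praise and criticism, r2 is uniformly positive.
reviews = {
    "r1": {"battery": 0.8, "screen": -0.4, "price": 0.1},
    "r2": {"battery": 0.9, "screen": 0.9, "price": 1.0},
}

suspicious = flag_uniform_reviews(reviews)
```

With this toy data only `r2` falls below the threshold. A low spread on its own is weak evidence, since a short genuine review may also be uniformly positive, which is one reason a clustering approach over whole sentiment distributions is a more plausible design than a single cutoff.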
Notes
⁎ Review B is spam.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under grants 61801285 and 61802247. We thank Kun Huang, Jingwen Lin, Yingsheng Wang, and other anonymous volunteers for writing product reviews. We also thank the anonymous reviewers for their constructive comments.
About this article
Cite this article
Li, J., Yang, L. & Zhang, P. Shooting review spam with a weakly supervised approach and a sentiment-distribution-oriented method. Appl Intell 53, 10789–10799 (2023). https://doi.org/10.1007/s10489-022-04063-5
DOI: https://doi.org/10.1007/s10489-022-04063-5