Shooting review spam with a weakly supervised approach and a sentiment-distribution-oriented method

Applied Intelligence

Abstract

Untruthful opinions that ballot-stuff or bad-mouth online commodities are hard to identify because of two prominent obstacles: the lack of ground-truth annotations, and the difficulty of detecting deceptive sentiment across review contexts. To address these challenges, and inspired by a recent algorithm called Learning with Label Noise, we first recruit volunteers to write annotated reviews and then label additional unannotated public reviews with a neighborhood graph. Furthermore, based on statistical analysis, we introduce a Sentiment-Distribution-Oriented Clustering (SDOC) method to ferret out review spam, in which product usage aspects and their sentiment polarities are highlighted. Evaluations and comparisons with several state-of-the-art approaches indicate that SDOC is effective and outperforms them with statistical significance. We also arrive at an interesting conclusion: genuine reviewers' feelings tend to fluctuate across different product aspects, whereas spammers always have uniform sentiments across aspects.
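The closing observation of the abstract can be illustrated with a small sketch. This is not the SDOC method itself, whose details are not given on this page; it is a toy check that measures how much a review's sentiment varies across product aspects. The function names, the score range of -1 to 1, the example aspect scores, and the 0.2 threshold are all assumptions made for illustration.

```python
from statistics import pstdev


def sentiment_spread(aspect_scores):
    """Return the population standard deviation of per-aspect sentiment.

    aspect_scores maps an aspect name to a sentiment score, assumed here
    to lie in [-1, 1]. A review with fewer than two aspects has no spread.
    """
    scores = list(aspect_scores.values())
    if len(scores) < 2:
        return 0.0
    return pstdev(scores)


def looks_uniform(aspect_scores, threshold=0.2):
    """Flag a review whose sentiment barely varies across aspects.

    The threshold is a hypothetical value, not one reported by the authors.
    """
    return sentiment_spread(aspect_scores) < threshold


# Hypothetical reviews: one with mixed feelings, one uniformly positive.
mixed_review = {"battery": 0.8, "screen": -0.4, "price": 0.1}
uniform_review = {"battery": 0.9, "screen": 0.9, "price": 1.0}

print(looks_uniform(mixed_review))    # sentiment fluctuates across aspects
print(looks_uniform(uniform_review))  # sentiment is nearly constant
```

A low spread alone would not establish that a review is spam; in this sketch it is only one signal that a clustering step could consume alongside other features.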


Notes

  1. ⁎ Review B is spam.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under grants 61801285 and 61802247. We thank Kun Huang, Jingwen Lin, Yingsheng Wang, and other anonymous volunteers for writing product reviews. We also thank the anonymous reviewers for their constructive comments.

Author information

Corresponding author

Correspondence to Jiandun Li.

About this article

Cite this article

Li, J., Yang, L. & Zhang, P. Shooting review spam with a weakly supervised approach and a sentiment-distribution-oriented method. Appl Intell 53, 10789–10799 (2023). https://doi.org/10.1007/s10489-022-04063-5
