Identifying ground truth in opinion spam: an empirical survey based on review psychology

Abstract

Because of its harmful effects, opinion spam, especially untruthful reviews, has attracted much attention over the last decade. However, the lack of annotations, i.e., the ground truth problem, remains the key challenge: spammers deliberately forge their reviews, which even field experts cannot reliably distinguish from authentic ones. Given spammers' obvious intention, i.e., to promote or demote an item's reputation, there is an opportunity to label them by drawing on crowd psychology. To date, several studies have applied, verified, and presented helpful evidence, including prior, empirical, heuristic, and simulative pseudo truths. In this paper, after investigating the diverse motives of both authentic and deceptive reviewers, we survey state-of-the-art ground truths by considering two classical roles, i.e., crowdsourcing spammers and expert (professional) spammers. For each role, we highlight several topics related to spam attacks, with or without disguise, and possible outliers. Comparative analysis leads to several conclusions: 1) data on professional spammers are more challenging to collect and less reliable than data on crowdsourcing spammers; 2) most linguistic evidence is less reliable than behavioral footprints; 3) abnormal activities are as trustworthy as spamming objectives, yet they require little extra support, such as user profiles; and 4) the most reliable signals requiring acceptable effort are deviation, burstiness, grouped spamming, deviation over the threshold, review distribution, opinion proportion, and spam cost. Moreover, we introduce several promising directions for future research. Overall, this survey may shed light on new angles for understanding review spam and for improving the performance of anti-spam platforms.
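To make two of the behavioral footprints named above concrete, the sketch below computes per-reviewer rating deviation and burstiness over a toy review log. It is a minimal illustration under our own assumptions: the data layout, the function names, and the three-day burst window are hypothetical and are not taken from any of the surveyed methods.

```python
# Illustrative sketch (not from the surveyed papers): two behavioral
# footprints named in the abstract, rating deviation and burstiness,
# computed per reviewer over a toy review log.
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical data layout: (reviewer_id, item_id, rating 1-5, timestamp).
reviews = [
    ("u1", "p1", 5, datetime(2020, 1, 1)),
    ("u1", "p2", 5, datetime(2020, 1, 2)),
    ("u2", "p1", 2, datetime(2020, 3, 10)),
    ("u3", "p1", 3, datetime(2020, 4, 2)),
]

def rating_deviation(reviews):
    """Mean absolute gap between a reviewer's ratings and each item's average."""
    per_item = defaultdict(list)
    for _, item, rating, _ in reviews:
        per_item[item].append(rating)
    item_avg = {item: mean(rs) for item, rs in per_item.items()}

    per_reviewer = defaultdict(list)
    for user, item, rating, _ in reviews:
        per_reviewer[user].append(abs(rating - item_avg[item]))
    return {user: mean(gaps) for user, gaps in per_reviewer.items()}

def burstiness(reviews, window=timedelta(days=3)):
    """Fraction of a reviewer's reviews posted within `window` of another of theirs."""
    per_reviewer = defaultdict(list)
    for user, _, _, ts in reviews:
        per_reviewer[user].append(ts)
    scores = {}
    for user, stamps in per_reviewer.items():
        in_burst = sum(
            1
            for i, t in enumerate(stamps)
            if any(abs(t - s) <= window for j, s in enumerate(stamps) if j != i)
        )
        scores[user] = in_burst / len(stamps)
    return scores

print(rating_deviation(reviews))  # per-reviewer deviation from item averages
print(burstiness(reviews))        # u1 looks bursty; u2 and u3 do not
```

In practice, such per-reviewer scores would be thresholded or combined with the other signals listed above (grouped spamming, opinion proportion, spam cost) rather than used in isolation.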

Acknowledgment

This work was supported in part by the following funds: the National Natural Science Foundation of China, under grants 61702320, 61801285, and 61802247; the Shanghai Municipal Commission of Economy and Informatization, under grant 201701014; and the Shanghai Pudong Science, Technology and Economy Commission, under grant PKJ2019-Y03.

Corresponding author

Correspondence to Jiandun Li.

Cite this article

Li, J., Wang, X., Yang, L. et al. Identifying ground truth in opinion spam: an empirical survey based on review psychology. Appl Intell 50, 3554–3569 (2020). https://doi.org/10.1007/s10489-020-01764-7
