Abstract
Opinion spam, especially in the form of untruthful reviews, is highly harmful and has attracted much attention over the last decade. However, the lack of annotations, i.e., the ground truth problem, remains the key challenge. The problem is difficult because spammers deliberately forge their reviews, which cannot be distinguished even by field experts. Given the clear intention of spammers, i.e., to promote or demote an item's reputation, there is an opportunity to label such reviews by considering crowd psychology. To date, several studies have applied, verified, and presented helpful evidence, including prior, empirical, heuristic, and simulative pseudo truths. In this paper, after investigating the diverse motives of both authentic and deceptive reviewers, we survey the state of the art in ground truth with respect to two classical roles, i.e., crowdsourcing and expert spammers. For each role, we highlight several topics related to spam attacks, with or without disguise, and to possible outliers. Comparative analyses lead to several conclusions: 1) data on professional spammers are more challenging to collect and less reliable than data on crowdsourcing spammers; 2) most linguistic evidence is less reliable than behavioral footprints; 3) abnormal activities are as trustworthy as spamming objectives, yet they hardly need any extra support, such as the user profile; and 4) the most reliable facts that require acceptable effort are deviation, burstiness, grouped spamming, deviation over the threshold, review distribution, opinion proportion and spam cost. Moreover, we introduce several promising directions for future research. Overall, this survey may shed light on new angles for understanding review spam and for improving the performance of anti-spam platforms.
Acknowledgment
This work was supported in part by the following funds: National Natural Science Foundation of China, under grants 61702320, 61801285 and 61802247; Shanghai Municipal Commission of Economy and Informatization, under grant 201701014; Shanghai Pudong Science, Technology and Economy Commission, under grant PKJ2019-Y03.
Cite this article
Li, J., Wang, X., Yang, L. et al. Identifying ground truth in opinion spam: an empirical survey based on review psychology. Appl Intell 50, 3554–3569 (2020). https://doi.org/10.1007/s10489-020-01764-7
DOI: https://doi.org/10.1007/s10489-020-01764-7