State-of-art approaches for review spammer detection: a survey

Journal of Intelligent Information Systems

Abstract

E-commerce websites have become a popular way to shop from home without having to travel to a market. Their success depends on reviews written by consumers who have used a product and shared their experience with it. Because reviews also affect customers' buying decisions, the posting of fake reviews is increasing. A product's competitors, or the company itself, may post fraudulent reviews to increase profit. Such fraudulent reviews are spam reviews, and they distort the decisions of prospective buyers; many customers are misled by them. A person who writes fake reviews is called a spammer. Identifying spammers indirectly helps to identify whether reviews are spam. The detection of review spammers is therefore a serious concern for e-commerce businesses. To help researchers in this active area, we present state-of-the-art approaches for review spammer detection. This paper presents a comprehensive survey of existing spammer detection approaches, describing the features used for individual and group spammer detection and summarizing the datasets with details of reviews, products, and reviewers. The main aim of this paper is to provide a basic, comprehensive, and comparative study of current research on detecting review spammers using machine learning techniques and to give future directions. It also provides a concise summary of published research to help prospective researchers in this area develop new techniques.

Correspondence to Rupesh Kumar Dewang.

Dewang, R.K., Singh, A.K. State-of-art approaches for review spammer detection: a survey. J Intell Inf Syst 50, 231–264 (2018). https://doi.org/10.1007/s10844-017-0454-7

  • DOI: https://doi.org/10.1007/s10844-017-0454-7
