Fake Review Classification Using Supervised Machine Learning

  • Conference paper
Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Abstract

The rise of social media has led the online community to use online reviews not only to post feedback about products, services, and other issues, but also to help individuals analyze users' feedback when making purchase decisions and to help companies improve the quality of manufactured goods. However, the propagation of fake reviews has become an alarming issue, as it deceives online users while purchasing and promotes or demotes the reputation of competing brands. In this work, we propose a supervised learning-based technique for detecting fake reviews in online textual content. The study employs machine learning classifiers to separate fake reviews from genuine ones. Experimental results are assessed with several evaluation measures, and the performance of the proposed system is compared with baseline works.
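The abstract does not name the classifiers, features, or evaluation measures used. As a rough illustration of the general idea only, the sketch below trains a multinomial Naive Bayes classifier on bag-of-words counts and scores it with accuracy, precision, recall, and F1. The choice of Naive Bayes, the toy reviews and labels, and the names `NaiveBayes`, `tokenize`, and `evaluate` are all assumptions made for this example, not the authors' data or method.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.total_docs = len(labels)
        # Per-class word frequencies.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is the union of all words seen in training.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(self.doc_counts[c] / self.total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[c][tok]
                    score += math.log((count + 1) / (total + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


def evaluate(y_true, y_pred, positive="fake"):
    """Accuracy, precision, recall, and F1, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data, invented purely for illustration.
train_texts = [
    "best product ever buy now amazing deal",
    "amazing amazing best buy now",
    "incredible deal best ever must buy",
    "battery lasts two days screen is sharp",
    "delivery was late but the screen works fine",
    "the battery drains quickly after a month",
]
train_labels = ["fake", "fake", "fake", "genuine", "genuine", "genuine"]

test_texts = ["best deal ever buy now", "screen is fine but battery drains"]
test_labels = ["fake", "genuine"]

model = NaiveBayes().fit(train_texts, train_labels)
predictions = [model.predict(t) for t in test_texts]
metrics = evaluate(test_labels, predictions)
print(predictions, metrics)
```

On a real corpus, the same fit, predict, and evaluate steps would apply to a held-out split. With only a handful of invented reviews, the resulting scores say nothing about real-world performance.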

Supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

Notes

  1. https://www.kaggle.com/uciml/sms-spam-collection-dataset.

  2. https://www.kaggle.com/uciml/sms-spam-collection-dataset.

Author information

Corresponding author

Correspondence to Gautam Srivastava.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Khan, H., Asghar, M.U., Asghar, M.Z., Srivastava, G., Maddikunta, P.K.R., Gadekallu, T.R. (2021). Fake Review Classification Using Supervised Machine Learning. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12664. Springer, Cham. https://doi.org/10.1007/978-3-030-68799-1_19

  • DOI: https://doi.org/10.1007/978-3-030-68799-1_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68798-4

  • Online ISBN: 978-3-030-68799-1

  • eBook Packages: Computer Science, Computer Science (R0)
