Abstract
The rise of social media has led the online community to rely on online reviews, not only to post feedback about products, services, and other issues, but also to help individuals analyze other users' feedback when making purchase decisions and to help companies improve the quality of their goods. However, the spread of fake reviews has become an alarming issue, as it deceives online shoppers and artificially promotes or demotes the reputation of competing brands. In this work, we propose a supervised learning-based technique for detecting fake reviews in online textual content. The study employs machine learning classifiers to separate fake reviews from genuine ones. Experimental results are assessed using several evaluation measures, and the performance of the proposed system is compared with baseline works.
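As a rough illustration of the kind of pipeline the abstract describes (train a supervised classifier on labeled review text, then score it with standard evaluation measures), the following is a minimal sketch only. The abstract does not name the classifiers, features, or dataset used, so the multinomial Naive Bayes model, the whitespace tokenizer, the function names, and the four toy training reviews below are all assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's
    # preprocessing is not described in the abstract.
    return text.lower().split()


def train_nb(docs, labels):
    """Collect per-class document and word counts for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(doc):
            if word in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall_f1(y_true, y_pred, positive="fake"):
    """Compute precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy, invented examples (not from the paper's dataset).
train_docs = [
    "best product ever buy now amazing deal",
    "amazing amazing best buy now",
    "battery lasted two days screen is fine",
    "delivery was late but screen works fine",
]
train_labels = ["fake", "fake", "genuine", "genuine"]

test_docs = ["amazing deal buy now", "screen is fine"]
test_labels = ["fake", "genuine"]

model = train_nb(train_docs, train_labels)
predictions = [predict_nb(model, doc) for doc in test_docs]
precision, recall, f1 = precision_recall_f1(test_labels, predictions)
print(predictions, precision, recall, f1)
```

In a realistic setting the toy lists would be replaced by a labeled review corpus with a held-out test split, and several classifiers would be compared on the same measures.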
Supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Khan, H., Asghar, M.U., Asghar, M.Z., Srivastava, G., Maddikunta, P.K.R., Gadekallu, T.R. (2021). Fake Review Classification Using Supervised Machine Learning. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12664. Springer, Cham. https://doi.org/10.1007/978-3-030-68799-1_19
DOI: https://doi.org/10.1007/978-3-030-68799-1_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68798-4
Online ISBN: 978-3-030-68799-1
eBook Packages: Computer Science, Computer Science (R0)