Abstract
The importance of consumer reviews has grown significantly with the increasing shift towards e-commerce. Potential consumers actively seek the opinions of other consumers who have already used the products they are considering buying. Businesses likewise find it useful to gauge public opinion about the quality of their products and services. However, the volume of consumer reviews has grown to such an extent that reading all of them and judging their genuineness has become highly challenging. Managing reviews is therefore crucial, since spammers can manipulate them to wrongly promote or demote a product. This paper proposes an algorithm for detecting fake reviews. Because the proposed work concentrates only on review text, n-gram (unigram + bigram) features are used. A supervised learning technique is used to filter reviews, and the proposed algorithm combines multiple learning algorithms for better predictive performance. The results indicate that, even with simple features such as n-grams, an ensemble can significantly boost the efficiency of the algorithm.
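As a rough illustration of the two ingredients named in the abstract, the sketch below extracts unigram and bigram counts from a review and combines the labels of several classifiers. It is a minimal sketch and not the authors' implementation: the abstract does not say which base classifiers are used or how their outputs are combined, so the whitespace tokenizer, the majority-vote rule, and the function names `ngram_features` and `majority_vote` are all assumptions made for illustration.

```python
from collections import Counter


def ngram_features(text):
    """Count unigrams and bigrams in a review (assumed: lowercase, whitespace tokens)."""
    tokens = text.lower().split()
    # Bigrams are adjacent token pairs joined by a space.
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


def majority_vote(predictions):
    """Combine labels from several classifiers by taking the most common label.

    Majority voting is an assumed combination rule, chosen only for illustration.
    """
    return Counter(predictions).most_common(1)[0][0]


features = ngram_features("Great hotel great staff")

# Hypothetical labels produced by three base classifiers for one review.
votes = ["spam", "genuine", "spam"]
label = majority_vote(votes)
```

In practice the feature counts would be fed to trained classifiers; here the votes are hard-coded so that the combination step can be read on its own.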
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Mani, S., Kumari, S., Jain, A., Kumar, P. (2018). Spam Review Detection Using Ensemble Machine Learning. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2018. Lecture Notes in Computer Science, vol 10935. Springer, Cham. https://doi.org/10.1007/978-3-319-96133-0_15
DOI: https://doi.org/10.1007/978-3-319-96133-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96132-3
Online ISBN: 978-3-319-96133-0
eBook Packages: Computer Science (R0)