Abstract
Consumers consult online reviews when making purchase decisions, but the large number of spam reviews has damaged the reputation of e-commerce platforms. Previous research has addressed review spam detection with classification models built on textual, behavioral, and relational features. However, fine-grained aspect features, which capture the product attributes discussed in online reviews, have not yet been thoroughly studied. This study therefore proposes a review spam detection model based on a set of novel aspect features. The basic idea is that spam reviews are usually written by users without real experience of the product, so the product aspects they describe differ from those in genuine reviews. First, we use a Bi-LSTM model to automatically extract a large number of aspect words, which are then clustered into aspect categories by the K-means algorithm. We then propose nine novel aspect features and use them to train a machine learning model for review spam detection. Experimental results on two labeled Yelp datasets show that the proposed aspect features can significantly improve the accuracy of review spam detection, by about 16.11% to 38.86%, compared with textual and behavior features.
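To make the clustering step concrete, the following is a minimal sketch of grouping aspect words into aspect categories with K-means. It is not the authors' implementation: the two-dimensional word vectors, the example words, the function names, and the choice of seeding centroids with the first k points are all assumptions made for illustration. In practice the vectors would come from a trained embedding model and the aspect words from the Bi-LSTM extractor.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm.

    Assumption for determinism: the first k points seed the centroids.
    Returns (assignments, centroids).
    """
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignments, centroids


# Hypothetical 2-D embeddings of extracted aspect words (illustrative only).
ASPECT_VECTORS = {
    "pizza": (0.90, 0.10),
    "waiter": (0.10, 0.90),
    "pasta": (0.80, 0.20),
    "staff": (0.20, 0.80),
    "burger": (0.85, 0.15),
    "service": (0.15, 0.95),
}


def cluster_aspect_words(vectors, k):
    """Group aspect words into k aspect categories by their vectors."""
    words = list(vectors)
    assignments, _ = kmeans([vectors[w] for w in words], k)
    clusters = {}
    for word, label in zip(words, assignments):
        clusters.setdefault(label, []).append(word)
    return clusters


clusters = cluster_aspect_words(ASPECT_VECTORS, 2)
for label, words in sorted(clusters.items()):
    print(label, sorted(words))
```

On this toy input the food-related words and the service-related words fall into separate categories. Once each aspect word has a category, per-review aspect features can be derived from the categories a review mentions.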
Data availability
The Yelp datasets used in the current study are available at http://liu.cs.uic.edu/download/yelp_filter/, and the Amazon datasets used in this study are available at https://www.kaggle.com/datasets/naveedhn/amazon-product-review-spam-and-non-spam?select=Home_and_Kitchen.
Ethics declarations
Conflict of interest
This work was supported by the National Natural Science Foundation of China (72201272, 72025405, 72088101), the National Social Science Foundation of China (22ZDA102), the Hunan Science and Technology Plan Project (2020TP1013, 2020JJ4673, 2023JJ40685), the Shenzhen Basic Research Project for Development of Science and Technology (JCYJ20200109141218676, 202008291726500001), the Innovation Team Project of Colleges in Guangdong Province (2020KCXTD040), and the Social Science Foundation of Hunan Province (20YBA012). The authors declare that they have no conflict of interest.
About this article
Cite this article
Cai, M., Du, Y., Tan, Y. et al. Aspect-based classification method for review spam detection. Multimed Tools Appl 83, 20931–20952 (2024). https://doi.org/10.1007/s11042-023-16293-x
DOI: https://doi.org/10.1007/s11042-023-16293-x