Aspect-based classification method for review spam detection

Multimedia Tools and Applications

Abstract

Consumers now routinely consult online reviews when making purchase decisions, but a large number of spam reviews have damaged e-commerce reputations. Previous research has addressed review spam detection with classification models using textual features, behavior features, and relational features. However, fine-grained aspect features related to the product attributes described in online reviews have not yet been thoroughly studied. This study therefore proposes a review spam detection model based on a set of novel aspect features. The basic idea is that, because spam reviews are usually written by users without real experience of the product, the product aspects depicted in spam reviews will differ from those in genuine reviews. First, we use a Bi-LSTM model to automatically extract a large number of aspect words, which are then clustered into aspect categories by the K-means algorithm. We then propose nine novel aspect features to train a machine learning model for review spam detection. Experimental results on two labeled Yelp datasets show that the proposed aspect features significantly improve the accuracy of review spam detection, by about 16.11% to 38.86% compared with textual and behavior features.
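The pipeline described in the abstract (aspect words, K-means clustering into aspect categories, per-review aspect features) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy 2-D vectors stand in for real word embeddings, the aspect words are given rather than extracted by a Bi-LSTM, the farthest-first initialization is chosen only to keep the run deterministic, and the three features are hypothetical examples, not the nine features proposed in the paper.

```python
from math import dist

# Toy 2-D "embeddings" for aspect words (hypothetical values; the paper
# extracts aspect words with a Bi-LSTM, which is not reproduced here).
ASPECT_VECTORS = {
    "pizza": (0.9, 0.1), "pasta": (1.0, 0.2), "burger": (0.8, 0.0),
    "waiter": (0.1, 0.9), "staff": (0.0, 1.0), "service": (0.2, 0.8),
}


def farthest_first(points, k):
    """Deterministic initialization (an assumption, not from the paper)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iterations=10):
    """Plain K-means (Lloyd's algorithm); returns one label per point."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels


def cluster_aspects(vectors, k=2):
    """Map each aspect word to an aspect-category id."""
    words = sorted(vectors)
    labels = kmeans([vectors[w] for w in words], k)
    return dict(zip(words, labels))


def aspect_features(review, word_to_category, k=2):
    """Hypothetical aspect features for one review (illustrative only)."""
    tokens = [t.strip(".,!?;:").lower() for t in review.split()]
    tokens = [t for t in tokens if t]
    hits = [word_to_category[t] for t in tokens if t in word_to_category]
    return {
        "aspect_word_count": len(hits),
        "aspect_word_ratio": len(hits) / len(tokens) if tokens else 0.0,
        "category_coverage": len(set(hits)) / k,
    }


categories = cluster_aspects(ASPECT_VECTORS, k=2)
detailed = "The pizza and pasta were great but the waiter was slow."
generic = "Best place ever, highly recommend to everyone!"
print(aspect_features(detailed, categories))
print(aspect_features(generic, categories))
```

In this toy run the detailed review mentions words from both aspect categories, while the generic one mentions none, which is the kind of contrast the aspect-based idea relies on. In a full system the resulting feature vectors would be passed to a classifier trained on labeled reviews.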

Data availability

The Yelp datasets used in the current study are available at http://liu.cs.uic.edu/download/yelp_filter/, and the Amazon datasets used in this study are available at https://www.kaggle.com/datasets/naveedhn/amazon-product-review-spam-and-non-spam?select=Home_and_Kitchen.

Author information

Corresponding author

Correspondence to Xin Lu.

Ethics declarations

Conflict of interest

This work was supported by the National Natural Science Foundation of China (72201272, 72025405, 72088101), the National Social Science Foundation of China (22ZDA102), the Hunan Science and Technology Plan Project (2020TP1013, 2020JJ4673, 2023JJ40685), the Shenzhen Basic Research Project for Development of Science and Technology (JCYJ20200109141218676, 202008291726500001), the Innovation Team Project of Colleges in Guangdong Province (2020KCXTD040), and the Social Science Foundation of Hunan Province (20YBA012). The authors declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cai, M., Du, Y., Tan, Y. et al. Aspect-based classification method for review spam detection. Multimed Tools Appl 83, 20931–20952 (2024). https://doi.org/10.1007/s11042-023-16293-x

  • DOI: https://doi.org/10.1007/s11042-023-16293-x
