Aspect-based classification method for review spam detection

Cai, Mengsi; Du, Yonghao; Tan, Yuejin; Lu, Xin

doi:10.1007/s11042-023-16293-x

Published: 05 August 2023

Volume 83, pages 20931–20952, (2024)

Multimedia Tools and Applications

Mengsi Cai¹, Yonghao Du¹, Yuejin Tan¹ & Xin Lu¹ (ORCID: orcid.org/0000-0002-3547-6493)

Abstract

Consumers increasingly rely on online reviews when making purchase decisions, but a large number of spam reviews have damaged e-commerce reputations. Previous research has addressed review spam detection with classification models built on textual, behavior, and relational features. However, the fine-grained aspect features related to product attributes in online reviews have been largely overlooked. Therefore, this study proposes a review spam detection model based on a set of novel aspect features. The basic idea is that, since spam reviews are usually written by users without real experience, the product aspects described in spam reviews will differ from those in genuine reviews. First, we use a Bi-LSTM model to automatically extract a large number of aspect words, which are then clustered into aspect categories by the K-means algorithm. We then propose nine novel aspect features to train a machine learning model for review spam detection. Experimental results on two labeled Yelp datasets show that the proposed aspect features can significantly improve the accuracy of review spam detection by about 16.11% to 38.86% compared with textual and behavior features.
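
The clustering and feature steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The Bi-LSTM extraction step is replaced by a hand-made dictionary of hypothetical 2-D word vectors (`EMBEDDINGS`), K-means is written out in plain Python with a deterministic initialisation, and the two features computed (`coverage` and `density`) are illustrative assumptions, not the nine aspect features proposed in the paper.

```python
import math

# Hypothetical 2-D vectors standing in for learned aspect-word embeddings.
# In the paper the aspect words come from a Bi-LSTM extractor; here they are made up.
EMBEDDINGS = {
    "pizza": (0.90, 0.10), "pasta": (0.80, 0.20), "burger": (0.85, 0.05),
    "waiter": (0.10, 0.90), "staff": (0.20, 0.80), "service": (0.15, 0.95),
}

def kmeans(points, k, iters=10):
    """Lloyd's K-means with farthest-point initialisation (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Add the point that is farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels

# Cluster the aspect words into aspect categories.
words = list(EMBEDDINGS)
labels = kmeans([EMBEDDINGS[w] for w in words], k=2)
word_to_category = dict(zip(words, labels))

def aspect_features(tokens, word_to_category, n_categories):
    """Return (coverage, density) for one tokenised review.

    coverage: share of aspect categories mentioned at least once.
    density:  share of tokens that are known aspect words.
    Both are illustrative features, not those defined in the paper.
    """
    hits = [word_to_category[t] for t in tokens if t in word_to_category]
    coverage = len(set(hits)) / n_categories
    density = len(hits) / len(tokens) if tokens else 0.0
    return coverage, density

for text in (
    "the pizza was great and the waiter was friendly",
    "best place ever amazing service",
):
    print(text, "->", aspect_features(text.split(), word_to_category, 2))
```

Feature vectors of this kind could then be passed, alongside textual and behavior features, to any standard classifier; the choice of classifier is left open here because the abstract does not name one.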

Data availability

The Yelp datasets used in the current study are available at http://liu.cs.uic.edu/download/yelp_filter/, and the Amazon datasets used in this study are available at https://www.kaggle.com/datasets/naveedhn/amazon-product-review-spam-and-non-spam?select=Home_and_Kitchen.

Author information

Authors and Affiliations

College of Systems Engineering, National University of Defense Technology, Changsha, 410073, China
Mengsi Cai, Yonghao Du, Yuejin Tan & Xin Lu

Corresponding author

Correspondence to Xin Lu.

Ethics declarations

Conflict of interest

This work was supported by the National Natural Science Foundation of China (72201272, 72025405, 72088101), the National Social Science Foundation of China (22ZDA102), the Hunan Science and Technology Plan Project (2020TP1013, 2020JJ4673, 2023JJ40685), the Shenzhen Basic Research Project for Development of Science and Technology (JCYJ20200109141218676, 202008291726500001), the Innovation Team Project of Colleges in Guangdong Province (2020KCXTD040), and the Social Science Foundation of Hunan Province (20YBA012). The authors declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cai, M., Du, Y., Tan, Y. et al. Aspect-based classification method for review spam detection. Multimed Tools Appl 83, 20931–20952 (2024). https://doi.org/10.1007/s11042-023-16293-x

Received: 28 September 2022
Revised: 29 June 2023
Accepted: 10 July 2023
Published: 05 August 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11042-023-16293-x
