Abstract
The cold-start problem in spam review detection is the challenge of judging the authenticity of the first review posted by a new user. To generate features that are more sensitive for identifying new reviews, existing methods mainly use the text similarity of reviews to find related features that approximate the incomplete behavior features of a new review. However, they over-rely on the text of new reviews and ignore the mutual behavioral information in the review system, which makes the resulting features less sensitive. To address this issue, we propose a deep feature fusion method that balances the importance of text information and behavior information to enhance feature sensitivity. Specifically, we construct a heterogeneous graph in which products and users serve as vertices connected by edges representing reviews. In the first feature fusion stage, we perform graph convolution on this graph, using the mutual behavioral information in the review system to compensate for the incomplete behavior features of new reviews. Furthermore, in the global feature fusion stage, we design a co-attention network that assigns different weights to features, yielding features that are highly sensitive for identifying new reviews. Extensive experiments on the Yelp-hotel and Yelp-restaurant datasets demonstrate that the proposed approach achieves better classification performance than existing methods.
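The two fusion stages described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes an unweighted mean over neighbors in place of a learned graph convolution, and a single softmax attention over two vectors in place of the co-attention network. All names (`build_graph`, `propagate`, `attention_fuse`, the toy users and features) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
import math


def build_graph(reviews):
    """Build an undirected user-product graph; each review is one edge."""
    adj = defaultdict(set)
    for user, product in reviews:
        u, p = f"u:{user}", f"p:{product}"
        adj[u].add(p)
        adj[p].add(u)
    return adj


def propagate(adj, feats):
    """One simplified graph-convolution step (assumption: no learned weights).

    Each vertex takes the mean of its own feature vector and those of
    its neighbors, so a vertex with an empty behavior vector borrows
    information from the vertices it is connected to.
    """
    out = {}
    for v, h in feats.items():
        group = [h] + [feats[n] for n in sorted(adj[v])]
        out[v] = [sum(col) / len(group) for col in zip(*group)]
    return out


def attention_fuse(text_vec, behavior_vec, query):
    """Weight two feature vectors by softmax(dot(query, vec)) and sum them.

    A stand-in for attention-based fusion; the query vector would be
    learned in a real model and is fixed here for illustration.
    """
    scores = [sum(q * x for q, x in zip(query, vec))
              for vec in (text_vec, behavior_vec)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [weights[0] * t + weights[1] * b
             for t, b in zip(text_vec, behavior_vec)]
    return fused, weights


# Toy data: "new" has posted one review and has no behavior history yet.
reviews = [("alice", "hotel1"), ("bob", "hotel1"), ("new", "hotel1")]
graph = build_graph(reviews)
behavior = {
    "u:alice": [1.0, 0.0],
    "u:bob": [0.0, 1.0],
    "u:new": [0.0, 0.0],
    "p:hotel1": [0.5, 0.5],
}
smoothed = propagate(graph, behavior)
fused, weights = attention_fuse([1.0, 0.0], [0.0, 1.0], query=[1.0, 1.0])
```

In this toy setup the new user's empty behavior vector is filled in from the product it reviewed, which is the intuition behind using the review graph to compensate for incomplete behavior features. The attention step then decides how much of the text vector and how much of the behavior vector enters the fused representation.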
Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to privacy and confidentiality agreements as well as other restrictions, but are available from the corresponding author on reasonable request.
Acknowledgements
This project is supported by the National Natural Science Foundation of China under Grants 61972057 and 62172059, the Hunan Provincial Natural Science Foundation of China under Grant 2022JJ30623, the Scientific Research Fund of Hunan Provincial Education Department of China under Grant 21A0211, and the Hunan Provincial Innovation Foundation for Postgraduate under Grant CX20210812.
Ethics declarations
Conflict of interest
We have no conflicts of interest.
About this article
Cite this article
Xiang, L., You, H., Guo, G. et al. Deep feature fusion for cold-start spam review detection. J Supercomput 79, 419–434 (2023). https://doi.org/10.1007/s11227-022-04685-z
DOI: https://doi.org/10.1007/s11227-022-04685-z