Deep feature fusion for cold-start spam review detection

The Journal of Supercomputing

Abstract

The cold-start problem in spam review detection refers to the challenge of judging the authenticity of the first review posted by a new user. To generate features that are more sensitive for identifying new reviews, existing methods mainly use the text similarity between reviews to find relevant features that approximate the incomplete behavior features of new reviews. However, they over-rely on the text information of new reviews and ignore the mutual behavioral information in the review system, which reduces the sensitivity of the features. To address this issue, we propose a deep feature fusion method that balances the importance of text information and behavior information to enhance feature sensitivity. Specifically, we construct a heterogeneous graph in which products and users serve as vertices connected by edges representing reviews. In the first feature fusion stage, we perform graph convolution on this graph, using the mutual behavioral information in the review system to compensate for the incomplete behavior features of new reviews. Furthermore, in the global feature fusion stage, we design a co-attention network that assigns different weights to features, yielding features that are highly sensitive for identifying new reviews. Extensive experiments on the Yelp-hotel and Yelp-restaurant datasets demonstrate that the proposed approach achieves better classification performance than existing methods.
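
The two fusion stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the symmetric-normalized propagation rule of Kipf and Welling's graph convolutional network for the first stage, and it replaces the co-attention network with a simple softmax weighting of a text vector and a behavior vector. All function names, the `query` vector, and the toy data are hypothetical.

```python
import math

def build_adjacency(reviews, n_users, n_products):
    """Adjacency matrix of the user-product graph; each review is one edge."""
    n = n_users + n_products
    adj = [[0.0] * n for _ in range(n)]
    for user, product in reviews:
        u, p = user, n_users + product  # product vertices follow user vertices
        adj[u][p] = adj[p][u] = 1.0
    return adj

def normalize(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 (assumed, GCN-style)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(a_norm, features, weight):
    """One propagation step: ReLU(A_norm @ X @ W)."""
    h = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in h]

def attention_fuse(text_vec, behavior_vec, query):
    """Softmax-weighted sum of two vectors (simplified stand-in for co-attention)."""
    scores = [sum(q * v for q, v in zip(query, vec))
              for vec in (text_vec, behavior_vec)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [weights[0] * t + weights[1] * b
             for t, b in zip(text_vec, behavior_vec)]
    return fused, weights

# Toy data (hypothetical): 2 users, 2 products, 3 reviews as (user, product).
reviews = [(0, 0), (1, 0), (1, 1)]
a_norm = normalize(build_adjacency(reviews, n_users=2, n_products=2))
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = gcn_layer(a_norm, features, identity)
fused, weights = attention_fuse(hidden[0], hidden[2], query=[1.0, 1.0])
```

In this sketch a new user with a single review still receives behavior information from the product vertex it connects to, which is the intuition behind using the graph to compensate for incomplete behavior features. A trained model would learn `weight` and `query` rather than fix them by hand.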

Data availability

The datasets generated and/or analyzed during the current study are not publicly available due to privacy and confidentiality agreements as well as other restrictions, but are available from the corresponding author on reasonable request.

Acknowledgements

This project is supported by the National Natural Science Foundation of China under Grants 61972057 and 62172059, the Hunan Provincial Natural Science Foundation of China under Grant 2022JJ30623, the Scientific Research Fund of Hunan Provincial Education Department of China under Grant 21A0211, and the Hunan Provincial Innovation Foundation for Postgraduate under Grant CX20210812.

Author information

Corresponding author

Correspondence to Huiqing You.

Ethics declarations

Conflict of interest

We have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiang, L., You, H., Guo, G. et al. Deep feature fusion for cold-start spam review detection. J Supercomput 79, 419–434 (2023). https://doi.org/10.1007/s11227-022-04685-z

  • DOI: https://doi.org/10.1007/s11227-022-04685-z
