Spam review detection using LSTM autoencoder: an unsupervised approach

Saumya, Sunil; Singh, Jyoti Prakash

doi:10.1007/s10660-020-09413-4

Published: 13 May 2020

Volume 22, pages 113–133 (2022)

Electronic Commerce Research

Abstract

Reviews of online products and services are becoming a major factor in users' purchasing decisions. The popularity and influence of online reviews attract spammers, who try to promote their own products or services by writing positive reviews for them and to damage competitors' business by writing negative ones. Traditionally, spam review identification is treated as a two-class classification problem. This approach requires a labelled dataset to train a model for the environment in which it operates, and the unavailability of such datasets is a major limitation. To overcome this limitation, we propose an unsupervised learning model that combines long short-term memory (LSTM) networks with an autoencoder (LSTM-autoencoder) to distinguish spam reviews from real reviews. The model is trained to learn the patterns of real reviews from the reviews' textual details without any labels. The experimental results show that our model is able to separate real and spam reviews with good accuracy.
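The abstract does not say how the trained model's output is turned into a real-versus-spam decision. One common way to use an autoencoder for this kind of task is reconstruction-error thresholding: a model that has learned the patterns of real reviews should reconstruct them well and reconstruct unusual reviews poorly. The sketch below illustrates only that decision step, under the assumption that per-review reconstruction errors are already available from some trained model. The function names, the mean-plus-k-standard-deviations rule, and the value of k are illustrative choices, not details taken from the paper.

```python
from statistics import mean, pstdev


def reconstruction_error(original, reconstructed):
    """Mean squared error between an input vector and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def fit_threshold(train_errors, k=3.0):
    """Assumed rule: threshold = mean + k * std of the training errors."""
    return mean(train_errors) + k * pstdev(train_errors)


def flag_spam(errors, threshold):
    """Flag a review as suspected spam when its error exceeds the threshold."""
    return [e > threshold for e in errors]


# Hypothetical reconstruction errors on the reviews used for training.
train_errors = [0.10, 0.12, 0.11, 0.09, 0.13]
threshold = fit_threshold(train_errors)

# Hypothetical errors for two unseen reviews: one typical, one unusual.
flags = flag_spam([0.10, 0.50], threshold)
```

With these made-up numbers the second review is flagged and the first is not. In practice the threshold would need to be tuned on held-out data, since the rule above is only a starting point.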

Acknowledgements

The author would like to acknowledge the Ministry of Electronics and Information Technology (MeitY), Government of India, for providing financial assistance during the research work through the “Visvesvaraya Ph.D. Scheme for Electronics and IT”.

Author information

Authors and Affiliations

Indian Institute of Information Technology Dharwad, Dharwad, India
Sunil Saumya
National Institute of Technology Patna, Patna, India
Jyoti Prakash Singh

Corresponding author

Correspondence to Sunil Saumya.

Ethics declarations

Conflict of interest

The authors have no conflict of interest to disclose.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Sunil Saumya was formerly at the National Institute of Technology Patna.

About this article

Cite this article

Saumya, S., Singh, J.P. Spam review detection using LSTM autoencoder: an unsupervised approach. Electron Commer Res 22, 113–133 (2022). https://doi.org/10.1007/s10660-020-09413-4

Published: 13 May 2020
Issue Date: March 2022
DOI: https://doi.org/10.1007/s10660-020-09413-4
