DOI: 10.1145/2905055.2905081
Research article

Finding of Review Spam through "Corleone, Review Genre, Writing Style and Review Text Detail Features"

Published: 04 March 2016

Abstract

In the era of e-commerce, customers rely on online reviews, which help them make the right decision when buying a product or hiring a service. Spammers write fake reviews to promote or demote target products. Deciding whether a review is spam or non-spam is one of the biggest problems for e-commerce businesses. Moreover, spam detection is a complex task: spammers keep inventing new ways of writing spam reviews that cannot be recognized easily. In this paper, we extract new writing-style features, such as attractive text ratio and function word ratio, and Corleone-based features, such as lexical validity and text-like fraction, and compare them with existing features. We apply support vector machine (SVM), logistic regression, random forest, JRip, functional tree, Naive Bayes, J48, and PART algorithms to classify reviews as spam or non-spam. SVM, random forest, and logistic regression give 68% accuracy.
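The abstract names its features without defining them, so the following is only a sketch of how three of them might be computed, not the authors' implementation. It assumes that function word ratio is the share of tokens found in a function-word list, that text-like fraction is the share of purely alphabetic tokens, and that lexical validity is the share of tokens found in a reference vocabulary. The word list, the tokenizer, and all function names are hypothetical.

```python
import re

# Hypothetical, deliberately tiny function-word list (assumption, not from the paper).
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on",
    "at", "for", "with", "is", "are", "was", "it", "this", "that",
})

# A "word-like" token here means letters only, with an optional apostrophe part.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def tokenize(text):
    """Split on whitespace and strip surrounding punctuation from each token."""
    stripped = (tok.strip('.,;:!?"()[]') for tok in text.split())
    return [tok for tok in stripped if tok]


def function_word_ratio(text):
    """Assumed definition: function-word tokens divided by all tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(tok.lower() in FUNCTION_WORDS for tok in tokens) / len(tokens)


def text_like_fraction(text):
    """Assumed definition: word-like tokens divided by all tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(WORD_RE.fullmatch(tok) is not None for tok in tokens) / len(tokens)


def lexical_validity(text, vocabulary):
    """Assumed definition: tokens found in `vocabulary` divided by all tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(tok.lower() in vocabulary for tok in tokens) / len(tokens)


if __name__ == "__main__":
    review = "The room is great and the staff 24/7 !!!"
    vocab = {"the", "room", "is", "great", "and", "staff"}  # toy vocabulary
    print(function_word_ratio(review))
    print(text_like_fraction(review))
    print(lexical_validity(review, vocab))
```

Each function returns a value between 0 and 1 per review, so the outputs could be stacked into a feature vector and passed to any of the classifiers listed in the abstract.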


      Published In

      ICTCS '16: Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies
      March 2016
      843 pages
      ISBN:9781450339629
      DOI:10.1145/2905055

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Lexical Features
      2. Review Spam
      3. Supervised Learning Algorithms

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      ICTCS '16

      Acceptance Rates

      Overall Acceptance Rate 97 of 270 submissions, 36%

      Cited By

      • (2022) A comprehensive survey of various methods in opinion spam detection. Multimedia Tools and Applications 82:9 (13199-13239). DOI: 10.1007/s11042-022-13702-5. Online publication date: 5-Sep-2022
      • (2019) Exploring Writing Pattern with Pop Culture Ingredients for Social User Modeling. 2019 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN.2019.8852187. Online publication date: Jul-2019
      • (2019) isAnon: Flow-Based Anonymity Network Traffic Identification Using Extreme Gradient Boosting. 2019 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN.2019.8851964. Online publication date: Jul-2019
      • (2018) A Systematic Review of Time Series Based Spam Identification Techniques. Proceedings of the Future Technologies Conference (FTC) 2018 (435-443). DOI: 10.1007/978-3-030-02686-8_33. Online publication date: 18-Oct-2018
      • (2018) Temporal Spam Identification: A Multifaceted Approach to Identifying Review Spam. Intelligent Systems and Applications (773-787). DOI: 10.1007/978-3-030-01057-7_58. Online publication date: 8-Nov-2018
