DOI: 10.1145/2905055.2905081
research-article

Finding of Review Spam through "Corleone, Review Genre, Writing Style and Review Text Detail Features"

Published: 4 March 2016

ABSTRACT

In the era of e-commerce, customers rely on online reviews, which help them make sound decisions when buying a product or hiring a service. Spammers write fake reviews to promote or demote target products. Determining whether a review is spam or non-spam is therefore a major problem for e-commerce businesses. Spam detection is a complex task, because spammers keep inventing new ways of writing spam reviews that are not easily recognized. In this paper, we extract new writing-style features, such as attractive text ratio and function word ratio, and Corleone-based features, such as lexical validity and text-like fraction, and compare them with existing features. We apply support vector machine (SVM), logistic regression, random forest, JRip, functional tree, Naive Bayes, J48, and PART algorithms to classify reviews as spam or non-spam. SVM, random forest, and logistic regression achieve 68% accuracy.
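The abstract names its features but does not define them, so the following is only a minimal sketch of how two of them might be computed. It assumes that function word ratio is the share of tokens found in a function-word list and that text-like fraction is the share of purely alphabetic tokens. The word list, the tokenizer, and both function names are hypothetical and are not taken from the paper.

```python
# Hypothetical, deliberately small function-word list (an assumption for
# illustration; a real study would use a fuller lexicon).
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "to",
    "is", "was", "it", "i", "we", "this", "that", "for", "with",
})

# Punctuation stripped from both ends of each whitespace-separated token.
_EDGE_PUNCTUATION = ".,!?;:\"'()"


def tokenize(text):
    """Split on whitespace and strip common punctuation from token edges."""
    stripped = (tok.strip(_EDGE_PUNCTUATION) for tok in text.split())
    return [tok for tok in stripped if tok]


def function_word_ratio(text):
    """Assumed definition: function-word tokens divided by all tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok.lower() in FUNCTION_WORDS)
    return hits / len(tokens)


def text_like_fraction(text):
    """Assumed definition: purely alphabetic tokens divided by all tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if tok.isalpha()) / len(tokens)
```

Under these assumptions, each review yields one value in the range 0 to 1 per feature, which could then be fed to any of the classifiers listed in the abstract alongside the other features.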

      • Published in

        ACM Other conferences
        ICTCS '16: Proceedings of the Second International Conference on Information and Communication Technology for Competitive Strategies
        March 2016
        843 pages
        ISBN:9781450339629
        DOI:10.1145/2905055

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall acceptance rate: 97 of 270 submissions, 36%
