Does Sentiment Analysis Help in Bayesian Spam Filtering?

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9648)

Abstract

Unsolicited email campaigns remain one of the biggest threats, affecting millions of users per day. In recent years, several techniques for detecting unsolicited emails have been developed. Among the proposed automatic classification techniques, machine learning algorithms have been the most successful, obtaining detection rates of up to 96 % [1]. This work provides a means to validate the assumption that, because spam is a commercial communication, the semantics of its content are usually shaped with a positive meaning. We produce a polarity score for each message using sentiment classifiers, and then compare spam filtering classifiers with and without the polarity score in terms of accuracy. The results show that the top 10 results of Bayesian filtering classifiers are improved, reaching an accuracy of 99.21 %.
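The comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' setup: the toy word lists `POSITIVE` and `NEGATIVE` stand in for a real sentiment classifier, the hand-written multinomial Naive Bayes stands in for the Bayesian filters evaluated in the paper, and the four training messages are invented. The polarity score is reduced to a coarse pseudo-token appended to each message, so the same classifier can be trained with and without it.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (assumption); a real sentiment classifier would go here.
POSITIVE = {"free", "win", "great", "best", "offer"}
NEGATIVE = {"problem", "late", "sorry", "bad", "issue"}


def tokenize(text):
    return text.lower().split()


def polarity_token(text):
    """Map a message to a coarse polarity pseudo-token."""
    words = tokenize(text)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "__POL_POS__"
    if score < 0:
        return "__POL_NEG__"
    return "__POL_NEU__"


def features(text, use_polarity):
    """Bag of words, optionally extended with the polarity pseudo-token."""
    tokens = tokenize(text)
    if use_polarity:
        tokens.append(polarity_token(text))
    return tokens


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.priors = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.counts[label].update(tokens)
        self.vocab = {t for c in self.classes for t in self.counts[c]}
        self.total = len(labels)
        return self

    def predict(self, tokens):
        best, best_lp = None, -math.inf
        for c in self.classes:
            lp = math.log(self.priors[c] / self.total)
            denom = sum(self.counts[c].values()) + len(self.vocab)
            for t in tokens:
                if t in self.vocab:  # tokens unseen in training are ignored
                    lp += math.log((self.counts[c][t] + 1) / denom)
            if lp > best_lp:
                best, best_lp = c, lp
        return best


def accuracy(train, test, use_polarity):
    """Train on `train`, then return the fraction of `test` classified correctly."""
    model = NaiveBayes().fit(
        [features(t, use_polarity) for t, _ in train], [y for _, y in train]
    )
    hits = sum(model.predict(features(t, use_polarity)) == y for t, y in test)
    return hits / len(test)


# Invented toy data, far too small to say anything about real filters.
TRAIN = [
    ("win a free prize now", "spam"),
    ("best offer free pills", "spam"),
    ("sorry for the late reply", "ham"),
    ("there is a problem with the report", "ham"),
]
TEST = [("great free offer", "spam"), ("sorry about the issue", "ham")]

for flag in (False, True):
    print(f"polarity={flag}: accuracy={accuracy(TRAIN, TEST, flag):.2f}")
```

On a corpus this small the two configurations cannot be meaningfully distinguished; the sketch only shows where a polarity feature would enter a with/without comparison of this kind.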

Notes

  1. http://textblob.readthedocs.org.

  2. http://www.cs.cornell.edu/People/pabo/movie-review-data/.

  3. http://csmining.org/index.php/spam-email-datasets-.html.

References

  1. Malarvizhi, R.: Content-based spam filtering and detection algorithms-an efficient analysis & comparison 1 (2013)

  2. KasperskyLab: Spam and phishing in 2015 q1 (2015). http://www.kaspersky.com/about/news/virus/2015/Spam-and-Phishing-in-Q1-New-domains-revitalize-old-spam

  3. Saadat, N.: Survey on spam filtering techniques. Commun. Netw. 3(3), 153–160 (2011)

  4. Cormack, G.V.: Email spam filtering: a systematic review. Found. Trends Inf. Retrieval 1(4), 335–455 (2007)

  5. Tretyakov, K.: Machine learning techniques in spam filtering. In: Data Mining Problem-oriented Seminar, MTAT, vol. 3, pp. 60–79 (2004)

  6. Sanz, E.P., Hidalgo, J.M.G., Cortizo, J.C.: Email spam filtering. Adv. Comput. 74, 45–114 (2008)

  7. Teli, S., Biradar, S.: Effective spam detection method for email. In: International Conference on Advances in Engineering & Technology (2014)

  8. Eberhardt, J.J.: Bayesian spam detection. University of Minnesota, Morris Undergraduate Journal, Scholarly Horizons (2015)

  9. Liddy, E.: Natural language processing (2001)

  10. Giyanani, R., Desai, M.: Spam detection using natural language processing. Int. J. Comput. Sci. Res. Technol. 1, 55–58 (2013)

  11. Echeverria Briones, P.F., Altamirano Valarezo, Z.V., Pinto Astudillo, A.B., Sanchez Guerrero, J.D.C.: Text mining aplicado a la clasificación y distribución automática de correo electrónico y detección de correo spam (2009)

  12. Lau, R.Y.K., Liao, S.Y., Kwok, R.C.W., Xu, K., Xia, Y., Li, Y.: Text mining and probabilistic language modeling for online review spam detection. ACM Trans. Manage. Inf. Syst. 2(4), 25:1–25:30 (2012)

  13. Liu, B., Zhang, L.: A survey of opinion mining and sentiment analysis. In: Aggarwal, C.C., Zhai, C. (eds.) Mining Text Data, pp. 415–463. Springer, New York (2012)

  14. Pang, B., Lee, L.: Opinion mining and sentiment analysis. Found. Trends Inf. Retrieval 2(1–2), 1–135 (2008)

  15. Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up?: sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, EMNLP 2002, Stroudsburg, PA, USA, vol. 10, pp. 79–86. Association for Computational Linguistics (2002)

  16. Turney, P.D.: Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL 2002, pp. 417–424. Association for Computational Linguistics, USA (2002)

  17. Esuli, A., Sebastiani, F.: Sentiwordnet: a publicly available lexical resource for opinion mining. In: Proceedings of LREC, vol. 6, pp. 417–422. Citeseer (2006)

  18. Baccianella, S., Esuli, A., Sebastiani, F.: Sentiwordnet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: LREC, vol. 10, pp. 2200–2204 (2010)

  19. Ohana, B., Tierney, B.: Sentiment classification of reviews using sentiwordnet. In: 9th. IT & T Conference, p. 13 (2009)

  20. Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the ACL (2004)

Acknowledgments

This work has been partially funded by the Basque Department of Education, Language Policy and Culture under the project SocialSPAM (PI_2014_1_102).

Author information

Corresponding author

Correspondence to Enaitz Ezpeleta.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Ezpeleta, E., Zurutuza, U., Gómez Hidalgo, J.M. (2016). Does Sentiment Analysis Help in Bayesian Spam Filtering?. In: Martínez-Álvarez, F., Troncoso, A., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2016. Lecture Notes in Computer Science, vol 9648. Springer, Cham. https://doi.org/10.1007/978-3-319-32034-2_7

  • DOI: https://doi.org/10.1007/978-3-319-32034-2_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-32033-5

  • Online ISBN: 978-3-319-32034-2

  • eBook Packages: Computer Science, Computer Science (R0)
