Part of the book series: Advances in Soft Computing ((AINSC,volume 53))

Abstract

Spam detection has become a necessity for successful, secure and convenient email communication. This paper describes a learning process in which the text of incoming emails is analysed and filtered based on the salient features identified. The method gives promising results and, at the same time, performs significantly better than other statistical and probabilistic methods. The salient features of emails are selected automatically using functions that combine word frequency with other discriminating metrics, and each email is then encoded as a representative vector. Several classifiers are then used to identify spam; among them, self-organising maps appear to give significantly better results.
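
The pipeline summarised above (frequency-based feature selection, vector encoding, classification with a self-organising map) can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring function (absolute difference in relative word frequency between classes), the binary encoding, the 2x2 grid, the learning-rate schedule and all function names are assumptions chosen only to keep the example small.

```python
import math
import random
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())

def select_features(spam_docs, ham_docs, k):
    """Assumed scoring: keep the k words whose relative frequency
    differs most between the spam and non-spam training texts."""
    spam_counts = Counter(w for d in spam_docs for w in tokenize(d))
    ham_counts = Counter(w for d in ham_docs for w in tokenize(d))
    spam_total = sum(spam_counts.values()) or 1
    ham_total = sum(ham_counts.values()) or 1

    def score(word):
        return abs(spam_counts[word] / spam_total - ham_counts[word] / ham_total)

    vocab = set(spam_counts) | set(ham_counts)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda w: (-score(w), w))[:k]

def encode(text, features):
    """Binary vector: 1.0 where the feature word occurs in the email."""
    words = set(tokenize(text))
    return [1.0 if f in words else 0.0 for f in features]

def best_matching_unit(nodes, vector):
    """Grid position of the node whose weights are closest to the vector."""
    return min(nodes, key=lambda pos: sum((a - b) ** 2
                                          for a, b in zip(nodes[pos], vector)))

def train_som(vectors, rows=2, cols=2, epochs=50, seed=0):
    """Train a small self-organising map with a Gaussian neighbourhood."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    nodes = {(r, c): [rng.random() for _ in range(dim)]
             for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        decay = 1 - epoch / epochs            # shrinks from 1 towards 0
        rate = 0.5 * decay                    # assumed learning-rate schedule
        radius = max(0.5, max(rows, cols) / 2 * decay)
        for v in vectors:
            bmu = best_matching_unit(nodes, v)
            for pos, weights in nodes.items():
                dist2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                influence = math.exp(-dist2 / (2 * radius ** 2))
                for i in range(dim):
                    weights[i] += rate * influence * (v[i] - weights[i])
    return nodes

def label_nodes(nodes, vectors, labels):
    """Give each node the majority label of the training emails it wins."""
    votes = {pos: Counter() for pos in nodes}
    for v, y in zip(vectors, labels):
        votes[best_matching_unit(nodes, v)][y] += 1
    return {pos: c.most_common(1)[0][0] for pos, c in votes.items() if c}

def classify(nodes, node_labels, vector):
    """Label of the closest node that won at least one training email."""
    labelled = {pos: nodes[pos] for pos in node_labels}
    return node_labels[best_matching_unit(labelled, vector)]
```

The map itself is trained without labels; the known labels are used only afterwards, to name the nodes. A new email is encoded with `encode` and passed to `classify`, which returns the label of its nearest labelled node.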

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vrusias, B., Golledge, I. (2009). Adaptable Text Filters and Unsupervised Neural Classifiers for Spam Detection. In: Corchado, E., Zunino, R., Gastaldo, P., Herrero, Á. (eds) Proceedings of the International Workshop on Computational Intelligence in Security for Information Systems CISIS’08. Advances in Soft Computing, vol 53. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88181-0_25

  • DOI: https://doi.org/10.1007/978-3-540-88181-0_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88180-3

  • Online ISBN: 978-3-540-88181-0

  • eBook Packages: Engineering, Engineering (R0)
