Adaptable Text Filters and Unsupervised Neural Classifiers for Spam Detection

Vrusias, Bogdan; Golledge, Ian

doi:10.1007/978-3-540-88181-0_25

Part of the book series: Advances in Soft Computing (AINSC, volume 53)

Abstract

Spam detection has become a necessity for successful email communication, security and convenience. This paper describes a learning process in which the text of incoming emails is analysed and filtered based on the salient features identified. The method shows promising results and, at the same time, significantly better performance than other statistical and probabilistic methods. The salient features of emails are selected automatically using functions that combine word frequency with other discriminating matrices, and emails are then encoded into a representative vector model. Several classifiers are then used to identify spam, and self-organising maps seem to give significantly better results.
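
The pipeline outlined in the abstract (select salient words, encode each email as a vector, cluster with a self-organising map) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the scoring function (absolute difference in relative word frequency between spam and non-spam), the binary encoding, the 2x2 map, the learning schedule, the toy emails, and all function names.

```python
import math
import random
from collections import Counter

def tokenize(text):
    return text.lower().split()

def select_salient_words(spam, ham, k):
    """Rank words by |relative frequency in spam - relative frequency in ham|.
    This scoring rule is an illustrative assumption, not the paper's function."""
    spam_counts = Counter(w for doc in spam for w in tokenize(doc))
    ham_counts = Counter(w for doc in ham for w in tokenize(doc))
    spam_total = sum(spam_counts.values()) or 1
    ham_total = sum(ham_counts.values()) or 1
    vocab = set(spam_counts) | set(ham_counts)
    score = {w: abs(spam_counts[w] / spam_total - ham_counts[w] / ham_total)
             for w in vocab}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda w: (-score[w], w))[:k]

def encode(text, features):
    """Binary vector: 1.0 if the salient word occurs in the email, else 0.0."""
    tokens = set(tokenize(text))
    return [1.0 if w in tokens else 0.0 for w in features]

def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

def train_som(data, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small self-organising map with a Gaussian neighbourhood."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay                 # learning rate shrinks linearly
        sigma = sigma0 * decay + 1e-3    # neighbourhood radius shrinks too
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for i, (r, c) in enumerate(grid):
                d2 = (r - grid[bmu][0]) ** 2 + (c - grid[bmu][1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))
                weights[i] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[i], x)]
    return weights

# Toy data, invented for this example.
spam = ["win free prize now", "free prize claim now"]
ham = ["meeting agenda for monday", "project meeting notes attached"]

features = select_salient_words(spam, ham, 4)
vectors = [encode(doc, features) for doc in spam + ham]
som = train_som(vectors)
unit = best_matching_unit(som, encode("free prize inside", features))
```

After training, a new email is assigned to its best-matching unit. One possible way to turn the map into a spam detector is to label each unit with the majority class of the training emails that land on it, which requires labels only after the unsupervised training step.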

Author information

Authors and Affiliations

Department of Computing, Faculty of Electronic and Physical Sciences, University of Surrey, Guildford, UK
Bogdan Vrusias & Ian Golledge

Editor information

Editors and Affiliations

Universidad de Burgos, Burgos, Spain
Emilio Corchado
University of Genova, Genova, Italy
Rodolfo Zunino & Paolo Gastaldo
Escuela Politécnica Superior, Burgos, Spain
Álvaro Herrero

Cite this paper

Vrusias, B., Golledge, I. (2009). Adaptable Text Filters and Unsupervised Neural Classifiers for Spam Detection. In: Corchado, E., Zunino, R., Gastaldo, P., Herrero, Á. (eds) Proceedings of the International Workshop on Computational Intelligence in Security for Information Systems CISIS’08. Advances in Soft Computing, vol 53. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88181-0_25

DOI: https://doi.org/10.1007/978-3-540-88181-0_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88180-3
Online ISBN: 978-3-540-88181-0
eBook Packages: Engineering, Engineering (R0)