Abstract
Email is a fast and inexpensive way to transfer messages, and the spread of the Internet and the growing use of smart devices have made it a widely used form of communication. However, unsolicited email, known as spam, has emerged as one of the major problems of today’s Internet, causing financial damage to companies and annoying individual users. In this paper we present a new method for automatically separating legitimate email from spam, based on the combination of two improved versions of Support Vector Domain Description (SVDD): the first adjusts the volume of the minimal spheres, while the second replaces the standard SVDD decision function with an improved one. The proposed method is evaluated experimentally on a benchmark spam email dataset. The results show that it achieves a high recognition rate with good generalization ability.
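For orientation, the standard SVDD formulation of Tax and Duin, which the two improved versions start from, can be sketched as follows. This is the textbook baseline only, not the modified volume term or the new decision function proposed in this paper; the symbols $a$ (sphere center), $R$ (radius), $\xi_i$ (slack variables), $C$ (trade-off constant), $\alpha_i$ (Lagrange multipliers) and $K$ (kernel) are the usual notation and are assumed here, since the abstract does not define them.

```latex
% Standard SVDD: smallest sphere (center a, radius R) enclosing the training points x_i
\min_{R,\,a,\,\xi}\; R^2 + C \sum_{i=1}^{N} \xi_i
\quad \text{s.t.} \quad \|x_i - a\|^2 \le R^2 + \xi_i, \qquad \xi_i \ge 0, \quad i = 1, \dots, N

% Standard decision function in kernel form: a test point z is accepted when f(z) >= 0
f(z) = R^2 - \Big( K(z, z) - 2 \sum_{i} \alpha_i K(x_i, z) + \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) \Big)
```

In this baseline, $R^2$ is the term that controls the sphere volume and $f$ is the decision function, which are the two components the abstract says are modified.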
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Boujnouni, M.E., Jedra, M., Zahid, N. (2015). Email Spam Filtering Using the Combination of Two Improved Versions of Support Vector Domain Description. In: Herrero, Á., Baruque, B., Sedano, J., Quintián, H., Corchado, E. (eds) International Joint Conference. CISIS 2015. Advances in Intelligent Systems and Computing, vol 369. Springer, Cham. https://doi.org/10.1007/978-3-319-19713-5_9
DOI: https://doi.org/10.1007/978-3-319-19713-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19712-8
Online ISBN: 978-3-319-19713-5
eBook Packages: Engineering, Engineering (R0)