Abstract
Malicious spam is one of the major problems on the Internet today. It causes financial damage to companies and poses security threats to governments and organizations. Most recent spam emails contain URLs that redirect recipients to malicious Web servers. In this paper, we propose a malicious spam email detection system based on online machine learning. Each spam email is represented as a feature vector by a term-weighting scheme, and these feature vectors are used as the input to the classifier. Learning is performed periodically to update the classifier, so that the system can adapt to spam emails whose contents change over time. A real data set is labeled by the SPIKE system, which was developed by NICT. Evaluation experiments show that the detection system identifies malicious spam emails efficiently and accurately.
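The pipeline outlined in the abstract (term weighting, a classifier over the resulting feature vectors, and periodic re-learning) can be illustrated with a minimal sketch. The abstract does not name the weighting formula, the classifier, or the update rule, so everything below is an assumption made for illustration: a smoothed TF-IDF weighting, a linear classifier trained with hinge-loss updates, and the hypothetical names `tokenize`, `TfIdfWeighter`, `OnlineLinearClassifier`, and `train_batch`. It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


class TfIdfWeighter:
    """Assumed term-weighting scheme: smoothed TF-IDF with L2 normalization."""

    def __init__(self):
        self.doc_count = 0
        self.doc_freq = Counter()

    def update(self, tokens):
        # Document frequencies grow as each new batch of mail arrives.
        self.doc_count += 1
        self.doc_freq.update(set(tokens))

    def vectorize(self, tokens):
        vec = {}
        for term, tf in Counter(tokens).items():
            # Smoothing keeps the IDF finite for terms never seen before.
            idf = math.log((1 + self.doc_count) / (1 + self.doc_freq[term])) + 1.0
            vec[term] = tf * idf
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        return {term: w / norm for term, w in vec.items()}


class OnlineLinearClassifier:
    """Assumed classifier: sparse linear model updated one example at a time."""

    def __init__(self, learning_rate=0.5):
        self.weights = defaultdict(float)
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, vec):
        return self.bias + sum(self.weights.get(t, 0.0) * w for t, w in vec.items())

    def predict(self, vec):
        # +1 means malicious, -1 means benign.
        return 1 if self.score(vec) > 0 else -1

    def partial_fit(self, vec, label):
        # Hinge-loss step: adjust only when the example is inside the margin.
        if label * self.score(vec) < 1.0:
            for term, w in vec.items():
                self.weights[term] += self.learning_rate * label * w
            self.bias += self.learning_rate * label


def train_batch(weighter, clf, batch, epochs=5):
    """One periodic learning step over a newly labeled batch of (text, label)."""
    for text, _ in batch:
        weighter.update(tokenize(text))
    for _ in range(epochs):
        for text, label in batch:
            clf.partial_fit(weighter.vectorize(tokenize(text)), label)
```

Calling `train_batch` again whenever a new labeled batch arrives keeps the existing weights and only nudges them toward the new data, which is one simple way to obtain the adaptability the abstract describes. A batch-retrained model would be an equally valid reading of "periodically performed" learning.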
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Dai, Y., Tada, S., Ban, T., Nakazato, J., Shimamura, J., Ozawa, S. (2014). Detecting Malicious Spam Mails: An Online Machine Learning Approach. In: Loo, C.K., Yap, K.S., Wong, K.W., Beng Jin, A.T., Huang, K. (eds) Neural Information Processing. ICONIP 2014. Lecture Notes in Computer Science, vol 8836. Springer, Cham. https://doi.org/10.1007/978-3-319-12643-2_45
DOI: https://doi.org/10.1007/978-3-319-12643-2_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12642-5
Online ISBN: 978-3-319-12643-2
eBook Packages: Computer Science (R0)