Abstract
We propose dynamic anti-spam filtering methods for agglutinative languages in general and for Turkish in particular, based on Artificial Neural Networks (ANN) and Bayesian filters. The algorithms are adaptive and have two components: the first handles the morphology, and the second classifies the e-mails using the extracted roots. Two ANN structures, the single-layer perceptron and the multi-layer perceptron, are considered, and the inputs to the networks are determined using binary and probabilistic models. For Bayesian classification, three approaches are employed: binary, probabilistic, and advanced probabilistic models. In the experiments, a total of 750 e-mails (410 spam and 340 normal) were used, and a success rate of about 90% was achieved.
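The abstract does not spell out the classifiers, so the following is only a minimal sketch of what a binary-model Bayesian filter over word roots could look like. It is a generic Bernoulli naive Bayes with Laplace smoothing, not the authors' implementation. It assumes the morphology step has already reduced each e-mail to a list of roots. The function names, the smoothing parameter `alpha`, and the toy roots are all hypothetical.

```python
import math
from collections import Counter


def train(docs, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    docs   -- list of e-mails, each a list of root strings (assumed to come
              from a separate morphological analysis step)
    labels -- parallel list of class names, e.g. "spam" / "normal"
    alpha  -- Laplace smoothing constant (an assumption, not from the paper)
    """
    vocab = sorted({root for doc in docs for root in doc})
    classes = sorted(set(labels))
    n_docs = Counter(labels)

    # doc_freq[c][root] = number of class-c e-mails that contain the root
    doc_freq = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        doc_freq[label].update(set(doc))  # binary model: presence only

    log_prior = {c: math.log(n_docs[c] / len(docs)) for c in classes}
    # Smoothed estimate of P(root present | class)
    p_present = {
        c: {r: (doc_freq[c][r] + alpha) / (n_docs[c] + 2 * alpha) for r in vocab}
        for c in classes
    }
    return vocab, log_prior, p_present


def classify(roots, model):
    """Return the class with the highest log-posterior for one e-mail."""
    vocab, log_prior, p_present = model
    present = set(roots)
    scores = {}
    for c, prior in log_prior.items():
        score = prior
        for r in vocab:
            p = p_present[c][r]
            # Each vocabulary root contributes, whether present or absent
            score += math.log(p if r in present else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)


# Toy training set with illustrative (hypothetical) Turkish roots
docs = [
    ["kazan", "para", "bedava"],
    ["para", "kredi", "bedava"],
    ["toplantı", "rapor", "yarın"],
    ["rapor", "proje", "toplantı"],
]
labels = ["spam", "spam", "normal", "normal"]
model = train(docs, labels)

print(classify(["bedava", "para"], model))
print(classify(["rapor", "toplantı"], model))
```

In the binary model sketched here, only the presence or absence of a root matters. A probabilistic variant would presumably weight roots by how often they occur, but the abstract gives no details, so that is left out.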
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Özgür, L., Güngör, T., Gürgen, F. (2004). Spam Mail Detection Using Artificial Neural Network and Bayesian Filter. In: Yang, Z.R., Yin, H., Everson, R.M. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2004. IDEAL 2004. Lecture Notes in Computer Science, vol 3177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28651-6_74
DOI: https://doi.org/10.1007/978-3-540-28651-6_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22881-3
Online ISBN: 978-3-540-28651-6
eBook Packages: Springer Book Archive