Abstract
This paper examines a three-way decisions solution, based on Bayesian decision theory, for filtering spam email. Unlike existing filtering systems, it no longer treats spam filtering as a binary classification problem. Taking misclassification costs into account, each incoming email is accepted as legitimate, rejected as spam, or left undecided for further examination. The three-way decisions solution can reduce the rate at which legitimate email is misclassified as spam, and it gives users a more meaningful decision procedure. The solution is not restricted to a specific classifier. Experimental results on several corpora show that it achieves a better total cost ratio and a lower weighted error.
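The decision rule described in the abstract can be illustrated with a minimal sketch of a minimum-expected-cost choice among three actions. This is not the authors' implementation: the cost table `COSTS`, the action names, and the function names are hypothetical, and the cost values are illustrative only. The probability `p_spam` is assumed to come from any probabilistic classifier, which is consistent with the claim that the solution is not tied to a specific one.

```python
from typing import Dict

# Hypothetical cost table (illustrative values, not taken from the paper):
# COSTS[action][true_class] is the cost of taking `action` on an email
# whose true class is `true_class`.
COSTS: Dict[str, Dict[str, float]] = {
    "accept":       {"legit": 0.0, "spam": 1.0},  # spam reaches the inbox
    "reject":       {"legit": 9.0, "spam": 0.0},  # legitimate email is lost
    "further_exam": {"legit": 1.0, "spam": 0.5},  # user must inspect it
}


def expected_costs(p_spam: float,
                   costs: Dict[str, Dict[str, float]] = COSTS) -> Dict[str, float]:
    """Expected cost of each action given P(spam | email)."""
    return {
        action: p_spam * c["spam"] + (1.0 - p_spam) * c["legit"]
        for action, c in costs.items()
    }


def three_way_decision(p_spam: float,
                       costs: Dict[str, Dict[str, float]] = COSTS) -> str:
    """Return the action with the smallest expected cost."""
    if not 0.0 <= p_spam <= 1.0:
        raise ValueError("p_spam must be a probability in [0, 1]")
    risks = expected_costs(p_spam, costs)
    return min(risks, key=risks.get)


if __name__ == "__main__":
    for p in (0.05, 0.80, 0.99):
        print(p, three_way_decision(p))
```

With these assumed costs, rejecting a legitimate email is far more expensive than accepting spam, so an email is rejected only when the spam probability is very high, and moderately suspicious email falls into the further-examination region instead of being discarded.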
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jia, X., Zheng, K., Li, W., Liu, T., Shang, L. (2012). Three-Way Decisions Solution to Filter Spam Email: An Empirical Study. In: Yao, J., et al. Rough Sets and Current Trends in Computing. RSCTC 2012. Lecture Notes in Computer Science, vol 7413. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32115-3_34
DOI: https://doi.org/10.1007/978-3-642-32115-3_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32114-6
Online ISBN: 978-3-642-32115-3
eBook Packages: Computer Science (R0)