ABSTRACT
Spam detection is an important problem today. Many spam detection methods have been proposed, but fax spam detection has received little attention. It is not easy to apply existing content-based spam detection methods directly to fax documents because the documents are processed as images rather than text. In this paper, we propose a fax spam detection framework that is based on keyword patterns and uses an Optical Character Recognition (OCR) technique. To demonstrate the effectiveness of the proposed framework, we analyze and compare three fax spam detection algorithms (a rule-based method, an SVM-based method, and a naïve Bayesian method) on 219 normal and 212 spam documents. We recommend the naïve Bayesian method, which achieves an accuracy of 92.49%.
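As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch classifies OCR output with a naïve Bayesian model over keyword presence. It is not the paper's implementation: the keyword list, the function names, and the toy training texts are all hypothetical, and the OCR step is assumed to have already produced plain text.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword list; the paper's actual keyword patterns are not given here.
KEYWORDS = ["loan", "discount", "free", "invoice", "meeting"]


def extract_features(ocr_text):
    """Return the set of keywords that appear as tokens in the OCR output."""
    tokens = set(ocr_text.lower().split())
    return {k for k in KEYWORDS if k in tokens}


def train(docs):
    """Count documents per class and keyword occurrences per class.

    docs is a list of (ocr_text, label) pairs.
    """
    class_counts = Counter()
    keyword_counts = defaultdict(Counter)
    for text, label in docs:
        class_counts[label] += 1
        keyword_counts[label].update(extract_features(text))
    return class_counts, keyword_counts


def classify(ocr_text, class_counts, keyword_counts):
    """Pick the class with the highest log posterior (Bernoulli naive Bayes)."""
    present = extract_features(ocr_text)
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        for k in KEYWORDS:
            # Laplace-smoothed probability that keyword k occurs in this class.
            p = (keyword_counts[label][k] + 1) / (n + 2)
            score += math.log(p if k in present else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data standing in for OCR output of labeled fax documents.
TRAIN_DOCS = [
    ("free loan discount today", "spam"),
    ("big discount free offer", "spam"),
    ("invoice for meeting", "normal"),
    ("meeting agenda attached", "normal"),
]
```

In this sketch, a call such as `classify(text, *train(TRAIN_DOCS))` returns a label for a new document's OCR text. A real system would also need to cope with OCR recognition errors and with keyword selection, neither of which is addressed here.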
Index Terms
- Analysis and comparison of fax spam detection algorithms