DOI: 10.1145/3022227.3022284
short-paper

Analysis and comparison of fax spam detection algorithms

Published: 05 January 2017

ABSTRACT

Spam detection is an important problem today. Many spam detection methods have been proposed, but fax spam detection has received little attention. It is not easy to apply existing content-based spam detection methods directly to fax documents because the documents are processed as images rather than text. In this paper, we propose a fax spam detection framework based on keyword patterns extracted with an Optical Character Recognition (OCR) technique. To demonstrate the effectiveness of the proposed framework, we analyze and compare three fax spam detection algorithms (a rule-based method, an SVM-based method, and a naïve Bayesian method) on 219 normal and 212 spam documents. We recommend the naïve Bayesian method, which achieved an accuracy of 92.49%.
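
As an illustration of the kind of keyword-based naïve Bayesian classification the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes the OCR step has already turned each fax image into plain text; the class name `KeywordNaiveBayes`, the whitespace tokenizer, the add-one smoothing, and the four training strings are all hypothetical choices made for this example, since the abstract does not specify the keyword patterns, the OCR engine, or the model details.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase OCR output and split on whitespace (a deliberately simple choice)."""
    return text.lower().split()


class KeywordNaiveBayes:
    """Multinomial naive Bayes over keywords with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        # Per-class keyword frequencies.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = set()
        for counts in self.word_counts.values():
            self.vocabulary.update(counts)
        self.total_words = {
            c: sum(self.word_counts[c].values()) for c in self.classes
        }
        return self

    def log_scores(self, text):
        """Return log P(class) + sum of log P(word | class) for each class."""
        n_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for c in self.classes:
            score = math.log(self.doc_counts[c] / n_docs)
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # ignore words never seen during training
                numerator = self.word_counts[c][word] + 1
                denominator = self.total_words[c] + vocab_size
                score += math.log(numerator / denominator)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Made-up stand-ins for OCR output; these are not data from the paper.
train_docs = [
    "limited offer low interest loan call now",
    "free vacation prize call now",
    "meeting agenda for monday attached",
    "invoice for order attached please confirm",
]
train_labels = ["spam", "spam", "normal", "normal"]

model = KeywordNaiveBayes().fit(train_docs, train_labels)
print(model.predict("call now for free loan"))         # spam
print(model.predict("please confirm meeting agenda"))  # normal
```

Working in log space avoids numeric underflow when many keyword probabilities are multiplied, and the smoothing keeps a keyword that never appeared in one class from forcing that class's probability to zero.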

    • Published in

      IMCOM '17: Proceedings of the 11th International Conference on Ubiquitous Information Management and Communication
      January 2017
      746 pages
      ISBN: 9781450348881
      DOI: 10.1145/3022227

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      IMCOM '17 Paper Acceptance Rate: 113 of 366 submissions, 31%
      Overall Acceptance Rate: 213 of 621 submissions, 34%
