
A survey of image spamming and filtering techniques

Artificial Intelligence Review

Abstract

Many techniques have been proposed to combat the upsurge in image-based spam. They all share the same goal: preventing image spam from reaching our inboxes. Image spammers evade filters with a variety of tricks, and each trick must be analyzed to determine what capabilities a filter needs in order to defeat it and keep spammers from filling our inboxes. Different tricks give rise to different filtering techniques. This work surveys the image spam phenomenon from all sides, covering definitions, image spamming tricks, anti-image-spam techniques, data sets, and more. We describe each image spamming trick separately and, by examining the methods researchers have used to combat them, classify those methods into three groups: header-based, content-based, and text-based. Finally, we discuss the data sets that researchers use in their experimental evaluations to demonstrate the accuracy of their approaches.

Author information

Corresponding author

Correspondence to Reza Ebrahimi Atani.

About this article

Cite this article

Attar, A., Rad, R.M. & Atani, R.E. A survey of image spamming and filtering techniques. Artif Intell Rev 40, 71–105 (2013). https://doi.org/10.1007/s10462-011-9280-4

  • DOI: https://doi.org/10.1007/s10462-011-9280-4
