Abstract
Many techniques have been proposed to combat the upsurge in image-based spam. All of them share the same goal: keeping image spam out of our inboxes. Image spammers evade filters with a variety of tricks, and each trick needs to be analyzed to determine what capabilities a filter requires to defeat it and stop spammers from filling our inboxes. Different tricks give rise to different techniques. This work surveys the image spam phenomenon from all sides, including definitions, image spamming tricks, anti-image-spam techniques, and data sets. We describe each image spamming trick separately and, by examining the methods researchers have used to combat them, classify these methods into three groups: header-based, content-based, and text-based. Finally, we discuss the data sets that researchers use in the experimental evaluation of their work to demonstrate the accuracy of their ideas.
Attar, A., Rad, R.M. & Atani, R.E. A survey of image spamming and filtering techniques. Artif Intell Rev 40, 71–105 (2013). https://doi.org/10.1007/s10462-011-9280-4
DOI: https://doi.org/10.1007/s10462-011-9280-4