Image spam filtering using convolutional neural networks

  • Original Article
Personal and Ubiquitous Computing

Abstract

Spammers often embed text in images to evade text-based spam filters, which results in a large number of advertisement spam images. Spam image recognition has therefore become one of the hotspots in Internet spam filtering research. Its goal is to address the sharp performance decline, or even outright failure, that traditional spam filtering methods suffer when applied to spam images. This paper proposes a clustering-based method for expanding the data samples, which greatly increases the number of high-quality training samples and meets the needs of model training. We then train a convolutional neural network on the enlarged sample set to recognize spam in real time. The experimental results show that the accuracy of the model increases by more than 14% after applying the proposed data augmentation method, and that accuracy can be improved by 6% compared with other data augmentation methods. Combining convolutional neural networks with the proposed data augmentation method, our spam filtering model achieves an accuracy 7–11% higher than that of the traditional method.
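As a rough illustration of what a clustering-based sample expansion could look like, the sketch below pseudo-labels unlabeled samples from labeled seeds that fall in the same cluster. This preview does not specify the clustering algorithm, the image features, or the labeling rule, so all of these are assumptions: k-means is swapped in as the clustering step, images are stood in for by small feature tuples, and the names `kmeans` and `expand_labels` are hypothetical. It should not be read as the authors' method.

```python
from collections import Counter, defaultdict
from math import dist


def kmeans(points, k, iters=20):
    """Plain k-means (assumed clustering step) with deterministic
    farthest-point initialisation. Returns one cluster index per point."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        assign = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign


def expand_labels(labeled, unlabeled, k):
    """Hypothetical expansion rule: an unlabeled sample inherits a label only
    if every labeled seed in its cluster carries that same label.

    labeled:   list of (feature_tuple, label)
    unlabeled: list of feature_tuple
    """
    points = [feat for feat, _ in labeled] + list(unlabeled)
    assign = kmeans(points, k)

    # Count the seed labels seen in each cluster.
    votes = defaultdict(Counter)
    for (_, label), cluster in zip(labeled, assign):
        votes[cluster][label] += 1

    expanded = []
    for feat, cluster in zip(unlabeled, assign[len(labeled):]):
        counts = votes.get(cluster)
        if counts and len(counts) == 1:  # skip clusters with no seeds or mixed seeds
            expanded.append((feat, next(iter(counts))))
    return expanded


# Toy 2-D features standing in for image descriptors (illustrative values only).
labeled = [((0.0, 0.0), "ham"), ((10.0, 10.0), "spam")]
unlabeled = [(0.5, 0.2), (9.5, 10.2), (0.1, 0.4)]
expanded = expand_labels(labeled, unlabeled, k=2)
```

In this sketch the expanded samples would be merged with the labeled seeds before training a classifier. Rejecting clusters whose seeds disagree is a design choice made here to limit label noise, not something stated in the abstract.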

Acknowledgments

The authors would like to thank the reviewers for their helpful advice. Support from the National Youth Science Foundation project of China (grant no. F020101), the Henan Province Science and Technology Key Project (grant no. 1521022101936), the Natural Science Foundation of Hunan Province, China (grant no. 2018JJ2023), and the Key Projects of Science and Technology Research in Henan Education Department (grant nos. 15A520091, 17B520031) is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Fan Aiwan.

About this article

Cite this article

Aiwan, F., Zhaofeng, Y. Image spam filtering using convolutional neural networks. Pers Ubiquit Comput 22, 1029–1037 (2018). https://doi.org/10.1007/s00779-018-1168-8

  • DOI: https://doi.org/10.1007/s00779-018-1168-8
