Abstract
Deep neural networks increasingly pervade computer vision applications, and image classification in particular. Nevertheless, recent works have shown that it is surprisingly easy to craft adversarial examples, i.e., images maliciously modified to cause deep neural networks to fail. Such images contain changes imperceptible to the human eye yet sufficient to mislead the network, which poses a serious threat to machine learning methods. In this paper, we investigate the robustness of the representations learned by the fooled neural network by analyzing the activations of its hidden layers. Specifically, we test scoring approaches used for kNN classification in order to distinguish between correctly classified authentic images and adversarial examples. These scores are computed by searching only among the very images used to train the network. The results show that hidden-layer activations can be used to reveal incorrect classifications caused by adversarial attacks.
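As an illustration only, the following Python sketch shows one plausible form of such a kNN-based score over hidden-layer activations, using scikit-learn for the neighbor search. All function and variable names here are hypothetical, and the scoring functions actually evaluated in the paper may differ; it assumes activations have already been extracted for the training images and for the query image. Since the neighbor search runs only over the network's own training images, no additional data is needed.

```python
# Minimal sketch of a kNN-based detection score computed on hidden-layer
# activations (hypothetical names; the paper's exact scoring functions
# may differ). Requires numpy and scikit-learn.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_index(train_activations):
    """Index the hidden-layer activations of the training images once."""
    return NearestNeighbors(n_neighbors=10).fit(train_activations)

def knn_agreement(index, train_labels, query_activation, predicted_label, k=10):
    """Fraction of the k nearest training activations whose class agrees
    with the network's prediction. Low agreement suggests an adversarial
    input: the perturbed image lands far, in activation space, from the
    training examples of its (wrongly) predicted class."""
    _, idx = index.kneighbors(query_activation.reshape(1, -1), n_neighbors=k)
    neighbor_labels = np.asarray(train_labels)[idx[0]]
    return float(np.mean(neighbor_labels == predicted_label))

# A query is flagged as adversarial when its score falls below a threshold
# tuned on a validation set, e.g.:
#   score = knn_agreement(index, train_labels, activation, prediction)
#   is_adversarial = score < 0.5   # illustrative threshold
```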
Acknowledgements
This work was partially supported by the Smart News project (Social sensing for breaking news), co-funded by the Tuscany region under the FAR-FAS 2014 program (CUP CIPE D58C15000270008), and by the ESPRESS project (Smartphone identification based on on-board sensors for security applications), co-funded by Fondazione Cassa di Risparmio di Firenze (Italy) within the Scientific Research and Technological Innovation framework. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.
Cite this article
Carrara, F., Falchi, F., Caldelli, R. et al. Adversarial image detection in deep neural networks. Multimed Tools Appl 78, 2815–2835 (2019). https://doi.org/10.1007/s11042-018-5853-4