Abstract
Existing methods for detecting evasion attacks on machine learning systems are analyzed and compared experimentally. The uncertainty method is universal; however, it is difficult to determine uncertainty boundaries for adversarial examples that would allow evasion attacks to be identified precisely, so the method is less effective against the skip gradient method (SGM), maps of significance (MS), and boundary attack (BA) attacks than the other methods. A new hybrid method is developed that combines two-stage verification of the input data with preliminary processing. In the new method, the uncertainty boundary for adversarial objects becomes distinguishable and quickly computable. The hybrid method detects out-of-distribution (OOD) evasion attacks with a precision of at least 80%, and SGM, MS, and BA attacks with a precision of 93%.
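The abstract only outlines the pipeline, so the following is a minimal Python sketch of what such a detector could look like: preliminary input processing, an out-of-distribution check, and an uncertainty-boundary check estimated with Monte Carlo dropout. Everything here is an assumption made for illustration rather than the authors' implementation: the names preprocess, ood_score, detect_evasion, both thresholds, and the Keras-style model(x, training=True) call are hypothetical.

```python
import numpy as np

# Hypothetical thresholds: the abstract's point is that a usable uncertainty
# boundary is hard to find for the plain uncertainty method and becomes
# distinguishable and quickly computable in the hybrid one; no concrete
# values are given there.
OOD_THRESHOLD = 0.5
UNCERTAINTY_THRESHOLD = 0.5

def preprocess(x):
    """Preliminary processing (placeholder): min-max normalization."""
    x = np.asarray(x, dtype=np.float64)
    return (x - x.min()) / (x.max() - x.min() + 1e-12)

def mc_dropout_probs(model, x, n_samples=30):
    """Average class probabilities over stochastic forward passes, assuming
    a Keras-style callable that keeps dropout active with training=True."""
    return np.stack([np.asarray(model(x, training=True))
                     for _ in range(n_samples)]).mean(axis=0)

def predictive_entropy(p, eps=1e-12):
    """Entropy of the averaged prediction: high entropy = high uncertainty."""
    return float(-np.sum(p * np.log(p + eps)))

def ood_score(p):
    """Simple out-of-distribution proxy: 1 - max softmax probability."""
    return 1.0 - float(np.max(p))

def detect_evasion(model, x):
    """Two-stage verification after preliminary processing: flag the input
    if it looks out-of-distribution, otherwise if its predictive
    uncertainty exceeds the boundary."""
    p = mc_dropout_probs(model, preprocess(x))
    if ood_score(p) > OOD_THRESHOLD:
        return True  # stage 1: suspected OOD evasion attack
    return predictive_entropy(p) > UNCERTAINTY_THRESHOLD  # stage 2
```

The sketch mirrors the two-stage structure described in the abstract: a cheap distributional check first, then the finer uncertainty test on inputs that pass it.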
Funding
This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.
Ethics declarations
The authors of this work declare that they have no conflicts of interest.
Additional information
Translated by E. Glushachenkova
Publisher’s Note.
Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Kalinin, M.O., Suprun, A.F. & Ivanova, O.D. Hybrid Method for the Detection of Evasion Attacks Aimed at Machine Learning Systems. Aut. Control Comp. Sci. 57, 983–988 (2023). https://doi.org/10.3103/S0146411623080072