ABSTRACT
Mainstream image classification models rely on convolutional neural networks, but convolutional neural networks have inherent defects such as a tendency to lose information. In addition, deep learning models are vulnerable to adversarial perturbations, which degrade model performance. To address these problems, this paper presents STFGSM, an intelligent image classification model based on the Swin Transformer and the fast gradient sign method. The Swin Transformer replaces the traditional convolution operation with an attention mechanism to extract image features, which enlarges the receptive field over the image and avoids information loss. Furthermore, adversarial training on adversarial samples generated by the fast gradient sign method strengthens the model's resistance to interference. Experimental results show that STFGSM outperforms other mainstream image classification models in classification performance, runs faster, and adapts better to adversarial samples. In future work, more complex adversarial training strategies could be built on the model, or the model could be extended to tasks in other fields such as object detection and image generation.
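The fast gradient sign method mentioned above perturbs an input by a small step in the direction of the sign of the loss gradient with respect to that input. The following is a minimal sketch of that idea only, not the paper's implementation: it assumes a toy logistic-regression classifier in place of the Swin Transformer, and the names `loss_and_input_grad`, `fgsm`, the weights `w`, `b`, and the step size `epsilon` are all hypothetical choices made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy loss of a logistic model and its gradient w.r.t. the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = sigmoid(z)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    # For logistic regression with cross-entropy, dL/dx = (p - y) * w.
    grad_x = [(p - y) * wi for wi in w]
    return loss, grad_x


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, epsilon):
    """Return x + epsilon * sign(dL/dx), the fast gradient sign perturbation of x."""
    _, grad_x = loss_and_input_grad(w, b, x, y)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad_x)]


# Toy stand-in model and sample (assumed values, not from the paper).
w = [2.0, -1.0, 0.5]
b = 0.1
x = [0.5, 0.2, 0.8]
y = 1

x_adv = fgsm(w, b, x, y, epsilon=0.1)
clean_loss, _ = loss_and_input_grad(w, b, x, y)
adv_loss, _ = loss_and_input_grad(w, b, x_adv, y)
print("clean loss:", clean_loss)
print("adversarial loss:", adv_loss)
```

In adversarial training as the abstract describes it, samples such as `x_adv` would be generated from the current model and included in the training data, so that the model learns to classify them correctly. With a deep network, the input gradient would typically come from automatic differentiation rather than a closed-form expression.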
Index Terms
- STFGSM: Intelligent Image Classification Model Based on Swin Transformer and Fast Gradient Sign Method