Attention cutting and padding learning for fine-grained image recognition

Cheng, Zhuo; Li, Hongjian; Duan, Xiaolin; Zeng, Xiangyan; He, Mingxuan; Luo, Hao

doi:10.1007/s11042-021-11314-z


Published: 06 August 2021

Volume 80, pages 32791–32805, (2021)

Multimedia Tools and Applications

Zhuo Cheng¹, Hongjian Li¹ (ORCID: orcid.org/0000-0002-7003-7605), Xiaolin Duan¹, Xiangyan Zeng¹, Mingxuan He¹ & Hao Luo¹


Abstract

Fine-grained image recognition is an important task in computer vision. Because the differences between categories are very small, recognition depends heavily on local features. This paper proposes a novel “Attention Cutting And Padding Learning” method to learn these local features. First, the image is fed to Convolutional Neural Networks (CNNs) to obtain a saliency map, and the attention image is derived from that saliency map. Second, the attention image is cut into $N \times N$ sub-images, each sub-image is zero-padded with padding size $P$, and all padded sub-images are spliced into a Cutting And Padding image. Finally, the Cutting And Padding image and the attention image are both fed to the CNNs for training. In this way, more local features can be learned without damaging the high-level semantics. Experimental results show that Attention Cutting And Padding Learning achieves recognition accuracies of 87.9%, 94.6%, and 92.4% on the CUB-200-2011, Stanford Cars, and FGVC-Aircraft datasets, respectively. Moreover, the method can be readily applied to automatic biodiversity monitoring, intelligent retail, intelligent transportation, and other fields to improve recognition accuracy without changing the network structure.
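The cutting-and-padding step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `cut_and_pad`, the choice to pad all four sides of every sub-image, and the requirement that the image size be divisible by N are assumptions made here for illustration. The abstract states only that each sub-image is padded with zeros of size P and that the padded sub-images are spliced together.

```python
import numpy as np


def cut_and_pad(image: np.ndarray, n: int, p: int) -> np.ndarray:
    """Cut an (H, W, C) image into n x n sub-images, zero-pad each one by
    p pixels on every side (an assumption), and splice them back together."""
    h, w = image.shape[:2]
    if h % n or w % n:
        raise ValueError("height and width must be divisible by n")
    th, tw = h // n, w // n  # size of one sub-image

    # Pad only the two spatial axes; leave any channel axis untouched.
    pad_width = ((p, p), (p, p)) + ((0, 0),) * (image.ndim - 2)

    rows = []
    for i in range(n):
        row = []
        for j in range(n):
            tile = image[i * th:(i + 1) * th, j * tw:(j + 1) * tw]
            row.append(np.pad(tile, pad_width, mode="constant", constant_values=0))
        rows.append(np.concatenate(row, axis=1))  # splice tiles left to right
    return np.concatenate(rows, axis=0)           # splice rows top to bottom


# Hypothetical stand-in for an attention image: a 4 x 4 RGB array of ones.
attention = np.ones((4, 4, 3), dtype=np.float32)
cap_image = cut_and_pad(attention, n=2, p=1)
print(cap_image.shape)
```

Under these assumptions the spliced image grows to $(H + 2NP) \times (W + 2NP)$, and the zero borders separate neighbouring sub-images while their relative order is preserved. Whether the result is resized before being fed to the CNNs is not stated in the abstract.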



Acknowledgements

This work was supported by the Chongqing Science and Technology Commission Project (Grant Nos. cstc2017jcyjAX0142 and cstc2018jcyjAX0525) and the Key Research and Development Projects of Sichuan Science and Technology Department (Grant No. 2019YFG0107).

Author information

Authors and Affiliations

Department of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, People’s Republic of China
Zhuo Cheng, Hongjian Li, Xiaolin Duan, Xiangyan Zeng, Mingxuan He & Hao Luo


Corresponding author

Correspondence to Hongjian Li.


About this article

Cite this article

Cheng, Z., Li, H., Duan, X. et al. Attention cutting and padding learning for fine-grained image recognition. Multimed Tools Appl 80, 32791–32805 (2021). https://doi.org/10.1007/s11042-021-11314-z


Received: 19 October 2020
Revised: 16 January 2021
Accepted: 20 July 2021
Published: 06 August 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s11042-021-11314-z
