Weakly supervised fine-grained recognition based on spatial-channel aware attention filters

Yu, Nannan; Huang, Lei; Wei, Zhiqiang; Zhang, Wenfeng; Wang, Bin

doi:10.1007/s11042-020-10268-y

Published: 23 January 2021

Volume 80, pages 14409–14427, (2021)

Multimedia Tools and Applications

Nannan Yu^1,
Lei Huang^1,2,
Zhiqiang Wei^1,2,
Wenfeng Zhang^1 &
Bin Wang^1

Abstract

Fine-grained recognition is very challenging because it is difficult to mine discriminative, subtle features for objects with similar visual appearance. Because massive manual annotations (e.g., bounding boxes for discriminative regions) are time-consuming and labor-intensive, existing methods design a single form of attention model that outputs discriminative regions in a weakly supervised way. In this paper, we propose a novel method named Spatial-Channel Aware Attention Filters (SCAF) to address the weakly supervised fine-grained recognition problem. Compared with previous attention models, SCAF obtains attention-aware features from two dimensions, i.e., the spatial location in the image and the channel of the feature maps. With the proposed SCAF, the model can enhance the discriminative regions on both the spatial and channel dimensions simultaneously. In addition, a multi-channel network and a multi-level structure are designed to extract richer regional features. Moreover, focal loss is introduced to balance the sample distribution of fine-grained image datasets. Comprehensive comparative experiments are conducted on publicly available datasets, and the experimental results show that our method achieves state-of-the-art performance on fine-grained recognition tasks. For instance, we achieve 99.370% and 80.749% accuracy on two underwater datasets, Fish4Knowledge and Wild Fish, respectively.
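
To illustrate how focal loss can rebalance training toward hard or rare samples, the following is a minimal sketch of the standard multi-class focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). It is not the SCAF implementation: the function names, the values gamma = 2.0 and alpha = 1.0, and the example logits are assumptions chosen for illustration, since the abstract does not state the settings used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample.

    gamma and alpha are illustrative defaults (assumptions), not values
    taken from the article. With gamma = 0 and alpha = 1 the expression
    reduces to ordinary cross-entropy.
    """
    # Probability assigned to the true class, clamped to avoid log(0).
    p_t = max(softmax(logits)[target], 1e-12)
    # (1 - p_t)^gamma shrinks the loss of well-classified samples.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, expressed as the gamma = 0 special case."""
    return focal_loss(logits, target, gamma=0.0, alpha=1.0)


# Hypothetical scores for a 3-class problem; the true class is index 0.
easy_logits = [4.0, 0.0, 0.0]   # confidently correct prediction
hard_logits = [0.0, 1.0, 0.0]   # true class is not the top prediction

for name, logits in (("easy", easy_logits), ("hard", hard_logits)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.6f} ratio={fl / ce:.4f}")
```

In this sketch the confidently classified sample keeps only a small fraction of its cross-entropy loss, while the misclassified sample keeps most of it, which is the mechanism by which focal loss shifts the training signal toward hard examples.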

Acknowledgments

This work is supported by the National Key R&D Program of China (2019YFD0900401), the National Natural Science Foundation of China (Nos. 61872326 and 61672475), and the Shandong Provincial Natural Science Foundation (ZR2019MF044).

Author information

Authors and Affiliations

Ocean University of China, Qingdao, 266000, China
Nannan Yu, Lei Huang, Zhiqiang Wei, Wenfeng Zhang & Bin Wang
Pilot National Laboratory for Marine Science and Technology (Qingdao), Qingdao, 266000, China
Lei Huang & Zhiqiang Wei

Corresponding author

Correspondence to Lei Huang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yu, N., Huang, L., Wei, Z. et al. Weakly supervised fine-grained recognition based on spatial-channel aware attention filters. Multimed Tools Appl 80, 14409–14427 (2021). https://doi.org/10.1007/s11042-020-10268-y

Received: 19 June 2020
Revised: 05 October 2020
Accepted: 09 December 2020
Published: 23 January 2021
Issue Date: April 2021
DOI: https://doi.org/10.1007/s11042-020-10268-y
