Fine-grained image recognition via trusted multi-granularity information fusion

Yu, Ying; Tang, Hong; Qian, Jin; Zhu, Zhiliang; Cai, Zhen; Lv, Jingqin

doi:10.1007/s13042-022-01685-6

Original Article
Published: 22 October 2022

Volume 14, pages 1105–1117, (2023)

International Journal of Machine Learning and Cybernetics

Ying Yu¹ (ORCID: orcid.org/0000-0002-3480-4571), Hong Tang¹, Jin Qian¹, Zhiliang Zhu¹,², Zhen Cai¹ & Jingqin Lv¹

Abstract

Fine-grained image recognition (FGIR) is more challenging than general image recognition because object variations are inherently subtle. Existing FGIR methods mainly rely on single-granularity feature fusion: the fused features often fail to fully reflect the characteristics of the object, and recognition results based on them lack interpretability. To address these problems, we propose a novel end-to-end trusted multi-granularity information fusion (TMGIF) model for weakly supervised fine-grained image recognition. The model automatically extracts a multi-granularity information representation of a fine-grained image, evaluates the quality of each information granule, and then progressively fuses the multi-granularity information according to that quality to obtain a reliable and interpretable recognition result. We evaluate TMGIF on three standard benchmark datasets and show that it achieves competitive results.
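
The abstract does not state how granule quality is measured or how the fusion is computed. As a rough illustration only, the sketch below assumes one plausible reading: each granularity produces non-negative class evidence, quality is expressed as a subjective-logic uncertainty mass, and granularities are merged one at a time with a reduced Dempster-Shafer combination rule of the kind used in the trusted multi-view classification literature. The function names (`opinion_from_evidence`, `combine`, `fuse_progressively`) and the evidence values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# An opinion is (belief mass per class, uncertainty mass); the masses sum to 1.
Opinion = Tuple[List[float], float]


def opinion_from_evidence(evidence: Sequence[float]) -> Opinion:
    """Map non-negative class evidence to belief masses and an uncertainty mass.

    Assumption: Dirichlet parameters are evidence + 1, so little total
    evidence yields a high uncertainty (a low-quality granule).
    """
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes
    beliefs = [e / strength for e in evidence]
    return beliefs, num_classes / strength


def combine(first: Opinion, second: Opinion) -> Opinion:
    """Merge two opinions with a reduced Dempster-Shafer combination rule."""
    b1, u1 = first
    b2, u2 = second
    # Conflict: belief mass the two opinions assign to different classes.
    conflict = sum(
        b1[i] * b2[j]
        for i in range(len(b1))
        for j in range(len(b2))
        if i != j
    )
    scale = 1.0 - conflict
    fused = [(b1[i] * b2[i] + b1[i] * u2 + b2[i] * u1) / scale
             for i in range(len(b1))]
    return fused, (u1 * u2) / scale


def fuse_progressively(opinions: Sequence[Opinion]) -> Opinion:
    """Fold the opinions together in order, e.g. from coarse to fine."""
    fused = opinions[0]
    for opinion in opinions[1:]:
        fused = combine(fused, opinion)
    return fused


# Hypothetical evidence for three classes at three granularities.
granule_evidence = [
    [1.0, 0.5, 0.2],  # coarse granule: weak evidence, high uncertainty
    [4.0, 1.0, 0.5],  # medium granule
    [9.0, 0.5, 0.5],  # fine granule: strong evidence, low uncertainty
]
opinions = [opinion_from_evidence(e) for e in granule_evidence]
fused_beliefs, fused_uncertainty = fuse_progressively(opinions)
predicted_class = max(range(len(fused_beliefs)), key=fused_beliefs.__getitem__)
```

Under this assumption a granule with high uncertainty contributes little to the fused beliefs, and the fused uncertainty is never larger than that of any single granule, which gives one concrete sense in which a result can be called trusted.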


Acknowledgements

This paper was supported by the National Natural Science Foundation of China (Nos. 62163016, 62066014), the Natural Science Foundation of Jiangxi Province (20212ACB202001, 20202BABL202018), the Double Thousand Plan of Jiangxi Province of China, and the State Key Laboratory of Computer Science Open Subject Fund (CN) under Grant SYSKF2102.

Author information

Authors and Affiliations

College of Software, East China Jiaotong University, Nanchang, 330013, China
Ying Yu, Hong Tang, Jin Qian, Zhiliang Zhu, Zhen Cai & Jingqin Lv
The State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, 100190, China
Zhiliang Zhu

Corresponding author

Correspondence to Ying Yu.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yu, Y., Tang, H., Qian, J. et al. Fine-grained image recognition via trusted multi-granularity information fusion. Int. J. Mach. Learn. & Cyber. 14, 1105–1117 (2023). https://doi.org/10.1007/s13042-022-01685-6

Received: 18 June 2022
Accepted: 07 October 2022
Published: 22 October 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s13042-022-01685-6
