Fine-grained image recognition via trusted multi-granularity information fusion

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Fine-grained image recognition (FGIR) is more challenging than general image recognition because the variations between objects are inherently subtle. Existing FGIR methods are mainly based on single-granularity feature fusion; the fused features they extract often fail to fully reflect the characteristics of the object, and recognition results based on those features lack interpretability. To solve this problem, we propose a novel end-to-end trusted multi-granularity information fusion (TMGIF) model for weakly supervised fine-grained image recognition. It automatically extracts a multi-granularity information representation of a fine-grained image, evaluates the quality of the information granules, and then progressively fuses the multi-granularity information according to that quality to obtain a reliable and interpretable recognition result. We evaluate TMGIF on three standard benchmark datasets and demonstrate that the proposed method provides competitive results.
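
The abstract does not specify how granule quality is measured or how the fusion is carried out. As a rough illustration only, the sketch below assumes an evidential (subjective-logic) formulation: each granularity yields non-negative per-class evidence, quality is read off as the uncertainty mass of the resulting opinion, and opinions are combined one granularity at a time with a reduced Dempster-style rule. The function names and the example evidence values are hypothetical and are not taken from the paper.

```python
from functools import reduce


def opinion_from_evidence(evidence):
    """Map non-negative per-class evidence to (beliefs, uncertainty).

    Assumption: Dirichlet parameters are alpha_k = e_k + 1, so the
    Dirichlet strength is sum(e) + K and beliefs + uncertainty sum to 1.
    """
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength  # high when evidence is scarce
    return beliefs, uncertainty


def fuse(op1, op2):
    """Combine two opinions; the less uncertain one dominates the result."""
    b1, u1 = op1
    b2, u2 = op2
    # Conflict: belief mass the two opinions assign to different classes.
    agreement = sum(x * y for x, y in zip(b1, b2))
    conflict = sum(b1) * sum(b2) - agreement
    scale = 1.0 - conflict  # positive because u1 and u2 are positive
    beliefs = [(x * y + x * u2 + y * u1) / scale for x, y in zip(b1, b2)]
    return beliefs, (u1 * u2) / scale


def fuse_progressively(evidences):
    """Fuse opinions in order, e.g. from coarse to fine granularity."""
    opinions = [opinion_from_evidence(e) for e in evidences]
    return reduce(fuse, opinions)


if __name__ == "__main__":
    # Hypothetical evidence for 3 classes at two granularities.
    coarse = [1.0, 1.0, 1.0]  # vague: little evidence, spread evenly
    fine = [0.0, 9.0, 0.0]    # confident about class 1
    beliefs, uncertainty = fuse_progressively([coarse, fine])
    print("fused beliefs:", beliefs)
    print("fused uncertainty:", uncertainty)
```

Under these assumptions, a granularity that contributes little evidence carries a large uncertainty mass and therefore has little influence on the fused beliefs, while the fused uncertainty itself can serve as a confidence indicator for the final prediction.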

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 62163016, 62066014), the Natural Science Foundation of Jiangxi Province (Nos. 20212ACB202001, 20202BABL202018), the Double Thousand Plan of Jiangxi Province of China, and the State Key Laboratory of Computer Science Open Subject Fund (CN) under Grant SYSKF2102.

Author information

Corresponding author

Correspondence to Ying Yu.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yu, Y., Tang, H., Qian, J. et al. Fine-grained image recognition via trusted multi-granularity information fusion. Int. J. Mach. Learn. & Cyber. 14, 1105–1117 (2023). https://doi.org/10.1007/s13042-022-01685-6

  • DOI: https://doi.org/10.1007/s13042-022-01685-6
