Graph-in-graph discriminative feature enhancement network for fine-grained visual classification

Applied Intelligence

Abstract

Fine-grained visual classification (FGVC) seeks to identify sub-classes within the same meta-class. Prior efforts mainly mine the features of discriminative parts to enhance classification performance. However, we argue that most of these works ignore the spatial details inside each part and the spatial correlations between parts when extracting local features and fusing global features, which limits further improvement of feature quality, especially for irregular discriminative parts. To alleviate this issue, we rethink the feature generation route from pixels to parts to objects, and propose a novel graph-in-graph discriminative feature enhancement network (G\(^{2}\)DFE-Net). Specifically, G\(^{2}\)DFE-Net consists of two nested graph convolutional networks: an internal graph is first built with a spatial attention strategy to highlight details of irregular discriminative regions, and a KNN-based external graph is then introduced to capture the spatial context correlation among independent discriminative parts. Through the collaboration of the internal and external graphs, G\(^{2}\)DFE-Net improves the class separability and compactness of the global feature representation, thereby benefiting accurate FGVC. We conduct thorough experiments on five benchmark datasets, and both quantitative and qualitative results confirm the superior accuracy of G\(^{2}\)DFE-Net compared with previous state-of-the-art algorithms. The code is available at https://github.com/WangYuPeng1/G2DFE-Net.
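As a rough illustration of the KNN-based external graph idea, the sketch below links each part feature to its nearest neighbours and applies one standard graph convolution. It is not the authors' implementation (see the repository linked above for that). The function names `knn_adjacency` and `gcn_layer`, the use of Euclidean distance between part feature vectors, the symmetric normalisation, and all shapes are assumptions made purely for illustration.

```python
import numpy as np

def knn_adjacency(x, k):
    """Symmetric 0/1 adjacency linking each row of x to its k nearest rows.

    Assumption: neighbours are chosen by Euclidean distance between
    part feature vectors; the paper may define proximity differently.
    """
    n = x.shape[0]
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = x[:, None, :] - x[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # a node is never its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, nbrs.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # make the graph undirected

def gcn_layer(x, adj, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)

rng = np.random.default_rng(0)
parts = rng.normal(size=(6, 8))   # 6 hypothetical part features, 8-dim each
weight = rng.normal(size=(8, 4))  # hypothetical projection weights
adj = knn_adjacency(parts, k=2)
out = gcn_layer(parts, adj, weight)  # context-aware part features, shape (6, 4)
```

In this toy setting each output row mixes a part's own feature with those of its neighbours, which is the general mechanism by which a graph over parts can inject contextual information into each part representation.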

Data availability and access

Data will be made available on request.

Acknowledgements

This work was supported by the Jiangsu Province Key R&D Program (Modern Agriculture) Key Project (BE2023352), the Key Medical Research Projects of Jiangsu Provincial Health Commission (ZD2022068), and the National Natural Science Foundation of China (61941113).

Author information

Contributions

Yupeng Wang: Conceptualization, Software, Writing - Original draft preparation; Can Xu: Writing - Review & Editing; Yongli Wang: Methodology, Funding acquisition, Supervision; Weiping Ding: Visualization, Formal analysis and investigation, Supervision; Xiaoli Wang: Methodology, Data curation.

Corresponding author

Correspondence to Yongli Wang.

Ethics declarations

Competing interests

The authors declare that they have no conflict of interest.

Ethical and informed consent for data used

The study utilized publicly available datasets, ensuring adherence to ethical standards and data privacy regulations.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Y., Xu, C., Wang, Y. et al. Graph-in-graph discriminative feature enhancement network for fine-grained visual classification. Appl Intell 55, 22 (2025). https://doi.org/10.1007/s10489-024-05846-8

  • DOI: https://doi.org/10.1007/s10489-024-05846-8
