Multi-level navigation network: advancing fine-grained visual classification

Liang, Hong; Li, Xian; Shao, Mingwen; Zhang, Qian

doi:10.1007/s11227-025-06933-4

Published: 21 January 2025

Volume 81, article number 409 (2025)

The Journal of Supercomputing

Abstract

Fine-grained visual classification (FGVC) is the task of dividing basic categories into finer sub-categories. The task is both valuable and challenging: its difficulty arises primarily from small inter-class variations and large intra-class differences. The key to FGVC lies in identifying local regions with subtle yet discriminative features and representing them effectively. However, with the increasing prevalence of deep convolutional neural networks, researchers have prioritized high-level, abstract semantic features for FGVC and overlooked low-level, detailed information, which results in poor feature representation. We therefore propose the multi-level navigation network (MLNN), which enhances feature representation by incorporating both high-level semantics and low-level details. Specifically, MLNN is composed of (1) the feature refinement and attention enhancement module, which enables the network to learn detailed feature representations and further enhances features with attention mechanisms, and (2) the triplet-enhanced multi-level fusion module, which integrates features from different levels, leading to a more comprehensive feature representation. Experimental results show that our approach attains state-of-the-art performance on three widely accepted benchmark datasets.
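
As a rough illustration of the two ideas the abstract names, multi-level fusion and a triplet constraint, the following sketch concatenates feature vectors taken from several network depths and scores them with the standard triplet margin loss. It is not the authors' implementation: the abstract does not specify how MLNN fuses features or forms triplets, so the function names `fuse_levels` and `triplet_margin_loss`, the concatenation-based fusion, the squared Euclidean distance, the margin value, and the toy feature values are all assumptions made for illustration.

```python
from typing import List, Sequence


def fuse_levels(levels: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-level feature vectors by concatenation (an assumed fusion rule)."""
    fused: List[float] = []
    for level in levels:
        fused.extend(level)
    return fused


def squared_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y))


def triplet_margin_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical [low-level, high-level] features for three images.
anchor = fuse_levels([[0.0, 1.0], [1.0, 0.0]])
positive = fuse_levels([[0.0, 1.0], [1.0, 1.0]])  # same class as the anchor
negative = fuse_levels([[1.0, 1.0], [1.0, 0.0]])  # different class

loss = triplet_margin_loss(anchor, positive, negative, margin=1.0)
```

A loss of zero means the negative is already at least `margin` farther from the anchor than the positive is; a positive loss indicates that the triplet still violates the margin.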

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Code availability

Some or all of the code used during the study is available on request from the corresponding author.

Acknowledgements

The authors are very indebted to the anonymous referees for their critical comments and suggestions for the improvement of this paper.

Funding

This work was supported by the National Natural Science Foundation of China (No. 61673396) and the Natural Science Foundation of Shandong Province (No. ZR2022MF260).

Author information

Authors and Affiliations

Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, Shandong, China
Hong Liang, Xian Li, Mingwen Shao & Qian Zhang

Contributions

HL and XL were instrumental in devising the concept; HL, XL, and QZ contributed to the development of the methodology. The software development was overseen by XL and QZ. HL, XL, MS, and QZ conducted the formal analysis. XL was in charge of drafting the initial manuscript. HL and MS participated in revising the manuscript and provided editorial input, secured funding, and supervised the project. Additionally, HL, MS, and QZ provided the necessary resources for the study.

Corresponding author

Correspondence to Xian Li.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liang, H., Li, X., Shao, M. et al. Multi-level navigation network: advancing fine-grained visual classification. J Supercomput 81, 409 (2025). https://doi.org/10.1007/s11227-025-06933-4

Accepted: 13 January 2025
Published: 21 January 2025
DOI: https://doi.org/10.1007/s11227-025-06933-4
