Coordinate feature fusion networks for fine-grained image classification

Liao, Kaiyang; Huang, Gang; Zheng, Yuanlin; Lin, Guangfeng; Cao, Congjun

doi:10.1007/s11760-022-02291-3

Original Paper
Published: 20 July 2022

Volume 17, pages 807–815 (2023)
Signal, Image and Video Processing

Kaiyang Liao^1,3, Gang Huang^1, Yuanlin Zheng^1,2, Guangfeng Lin^1 & Congjun Cao^1,3

Abstract

Learning feature representations from discriminative local features plays a key role in fine-grained classification, but many methods focus only on the salient features in an image and ignore most latent features. Exploiting the rich features across channels and spatial positions helps capture these subtle differences. Based on this idea, this paper proposes a Coordinate Feature Fusion Network (CFFN), which models the channel and spatial feature interactions of images. CFFN consists of Feature Enhancement and Suppression Modules (FESM) and a Coordinate Feature Interaction Module (CFIM). FESM derives saliency factors by aggregating the most salient parts of the spatial and channel features, obtains salient features through feature mapping, and then suppresses those salient features to force the network to mine the remaining latent features. Through the saliency and latent feature modules, the network captures more discriminative features. CFIM explores feature correlations in images, so that the model learns complementary features from related channels and spatial positions, resulting in stronger fine-grained features. The model can be trained end to end and does not require bounding boxes. It achieves 89.5%, 93.4%, and 94.8% accuracy on three benchmark datasets, CUB-200-2011, FGVC-Aircraft, and Stanford Cars, respectively.
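
The enhancement-and-suppression idea described for FESM can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything here is an assumption made for illustration rather than the authors' implementation: the function name `enhance_and_suppress`, the use of mean pooling and min-max scaling to form saliency factors, and the `top_fraction` and `damping` parameters are all hypothetical.

```python
import numpy as np


def min_max(x, eps=1e-8):
    """Rescale an array to the range [0, 1]."""
    return (x - x.min()) / (x.max() - x.min() + eps)


def enhance_and_suppress(features, top_fraction=0.25, damping=0.0):
    """Illustrative enhancement/suppression on a (C, H, W) feature map.

    Assumption: saliency factors are built from mean-pooled channel and
    spatial responses; the paper's actual aggregation may differ.
    """
    # Channel saliency: one score per channel, pooled over all positions.
    channel_saliency = min_max(features.mean(axis=(1, 2)))   # shape (C,)
    # Spatial saliency: one score per position, pooled over all channels.
    spatial_saliency = min_max(features.mean(axis=0))        # shape (H, W)

    # Enhancement: amplify responses that are salient in both views.
    saliency = channel_saliency[:, None, None] * spatial_saliency[None, :, :]
    enhanced = features * (1.0 + saliency)

    # Suppression: damp the most salient positions so that a later stage
    # has to rely on the remaining (latent) responses.
    threshold = np.quantile(spatial_saliency, 1.0 - top_fraction)
    mask = np.where(spatial_saliency >= threshold, damping, 1.0)
    suppressed = features * mask[None, :, :]
    return enhanced, suppressed


# Toy feature map with one deliberately dominant position at (1, 2).
rng = np.random.default_rng(0)
x = rng.random((8, 4, 4))
x[:, 1, 2] += 5.0
enhanced, suppressed = enhance_and_suppress(x)
```

In this sketch the dominant position is zeroed in `suppressed`, while less salient positions pass through unchanged; in a multi-stage network the suppressed map would be what the next stage sees.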

Data availability

Some or all data, models, or code generated or used during the study are available from the corresponding author by request.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Project Nos. 52075435 and 61771386) and the Natural Science Foundation of Shaanxi Province (No. 2021JM-340).

Author information

Authors and Affiliations

College of Faculty of Printing, Packaging Engineering and Digital Media Technology, Xi’an University of Technology, Xi’an, 710048, China
Kaiyang Liao, Gang Huang, Yuanlin Zheng, Guangfeng Lin & Congjun Cao
Key Lab of Printing and Packaging Engineering of Shaanxi Province, Xi’an, 710048, China
Yuanlin Zheng
Printing and Packaging Engineering Technology Research Centre of Shaanxi Province, Xi’an, 710048, China
Kaiyang Liao & Congjun Cao

Corresponding author

Correspondence to Kaiyang Liao.

About this article

Cite this article

Liao, K., Huang, G., Zheng, Y. et al. Coordinate feature fusion networks for fine-grained image classification. SIViP 17, 807–815 (2023). https://doi.org/10.1007/s11760-022-02291-3

Received: 21 January 2022
Revised: 15 June 2022
Accepted: 15 June 2022
Published: 20 July 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s11760-022-02291-3
