Multilayer feature fusion with parallel convolutional block for fine-grained image classification

Applied Intelligence

Abstract

Fine-grained image classification aims to distinguish the subclasses within a given category. The task is challenging because the images share similar features, show objects in different poses, and suffer from background interference. A key issue in fine-grained image classification is extracting the discriminative regions of an image accurately. This paper proposes a multilayer feature fusion (MFF) network with a parallel convolutional block (PCB) mechanism to address this problem. We use the bilinear matrix product to mix the feature matrices of different layers and then feed the result to a fully connected layer and a softmax function. In addition, the original convolutional blocks are replaced by the proposed PCB, which combines residual connections, for more effective extraction of the region of interest (ROI), with parallel convolutions of different kernel sizes. Experimental results on three publicly available fine-grained datasets demonstrate the effectiveness of the proposed model. Quantitative and visualized experimental results show that our model achieves higher classification accuracy than state-of-the-art methods. Our classification accuracy reaches 87.1%, 91.4% and 93.4% on the CUB-200-2011, FGVC Aircraft and Stanford Cars datasets, respectively.
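
The fusion step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which layers are fused or how the result is normalized, so the pairwise outer products, the averaging over spatial positions, the signed square root and the L2 normalization below are all assumptions, as are the function names `bilinear_fuse` and `classify` and every tensor size.

```python
import numpy as np


def bilinear_fuse(feat_a, feat_b):
    """Fuse two feature maps with a bilinear (outer-product) matrix product.

    feat_a has shape (C1, H, W) and feat_b has shape (C2, H, W); both must
    share the same spatial size. Returns a vector of length C1 * C2.
    """
    c1, h, w = feat_a.shape
    c2 = feat_b.shape[0]
    a = feat_a.reshape(c1, h * w)
    b = feat_b.reshape(c2, h * w)
    # Matrix product over spatial positions, averaged: shape (C1, C2).
    fused = (a @ b.T) / (h * w)
    vec = fused.reshape(-1)
    # Assumption: signed square root followed by L2 normalization.
    vec = np.sign(vec) * np.sqrt(np.abs(vec))
    return vec / (np.linalg.norm(vec) + 1e-12)


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


def classify(feats, weights, bias):
    """Fuse every pair of layers, concatenate, then apply FC + softmax."""
    pairs = [
        bilinear_fuse(feats[i], feats[j])
        for i in range(len(feats))
        for j in range(i + 1, len(feats))
    ]
    x = np.concatenate(pairs)
    return softmax(weights @ x + bias)


# Hypothetical sizes: three layers with 8 channels on a 7x7 grid, 5 classes.
rng = np.random.default_rng(0)
feats = [rng.standard_normal((8, 7, 7)) for _ in range(3)]
num_classes = 5
dim = 3 * 8 * 8  # three layer pairs, each giving an 8 * 8 vector
weights = rng.standard_normal((num_classes, dim)) * 0.01
bias = np.zeros(num_classes)
probs = classify(feats, weights, bias)
```

In this sketch `probs` is a length-5 probability vector. In a trained network the feature maps would come from the backbone's convolutional layers and the fully connected weights would be learned rather than drawn at random.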

Author information

Corresponding author

Correspondence to Kai He.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Wang, L., He, K., Feng, X. et al. Multilayer feature fusion with parallel convolutional block for fine-grained image classification. Appl Intell 52, 2872–2883 (2022). https://doi.org/10.1007/s10489-021-02573-2

  • DOI: https://doi.org/10.1007/s10489-021-02573-2
