Abstract
Few-shot fine-grained image classification aims to recognize sub-categories of the same super-category given only a few labeled samples. Handling the low inter-class variation and high intra-class discordance of this task requires both supervised guidance from the global view and the detail information hidden in the local structure. Existing methods, however, usually apply global structure and local detail separately, so the resulting features are not discriminative enough. To address this issue, we propose a novel few-shot fine-grained image classification framework that enhances the Discriminative ability of Local structures utilizing class-aware Global structures (DLG). First, the DLG model computes the global structures from the prototype representation of each class. It then constructs class-aware attention maps for query images, using the global structures to enhance their discriminative local structures. Finally, a classification module based on the local structures makes the predictions. Case studies show that the class-aware attention maps focus on class-discriminative regions. Extensive experiments on fine-grained datasets demonstrate that DLG outperforms state-of-the-art methods. On Stanford Dogs, for example, DLG obtains average accuracy gains of at least 13.4% and 17.9% over the baselines on the 1-shot and 5-shot classification problems, respectively. Code can be found at https://gitee.com/csy213/few-shot-dlg.
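The three stages named in the abstract (prototypes as global structures, class-aware attention over a query's local structures, classification from the enhanced local structures) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the attention normalization, or the classifier, so cosine similarity, a softmax over spatial positions, and attention-weighted pooling are all assumptions made here for illustration, as are the function names `class_prototypes`, `class_aware_attention`, and `classify`. Each image is assumed to be already encoded as a list of local descriptors (one vector per spatial position).

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v, eps=1e-8):
    """Cosine similarity; eps guards against zero-norm vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def class_prototypes(support, n_way):
    """support: list of (local_descriptors, label) pairs.
    Assumption: a class prototype is the mean of the spatially pooled
    descriptors of that class's support images."""
    prototypes = []
    for k in range(n_way):
        pooled = [mean_vector(desc) for desc, label in support if label == k]
        prototypes.append(mean_vector(pooled))
    return prototypes

def class_aware_attention(query_desc, prototypes):
    """One attention map per class: a softmax over spatial positions of the
    cosine similarity between the class prototype and each local descriptor
    (assumed form; the abstract only says the maps are class-aware)."""
    maps = []
    for proto in prototypes:
        sims = [cosine(proto, d) for d in query_desc]
        top = max(sims)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in sims]
        total = sum(exps)
        maps.append([e / total for e in exps])
    return maps

def classify(query_desc, prototypes):
    """Score each class by the cosine similarity between its prototype and
    the attention-weighted sum of the query's local descriptors."""
    maps = class_aware_attention(query_desc, prototypes)
    scores = []
    for proto, attn in zip(prototypes, maps):
        enhanced = [sum(w * d[c] for w, d in zip(attn, query_desc))
                    for c in range(len(proto))]
        scores.append(cosine(enhanced, proto))
    return scores.index(max(scores)), scores

# Toy 3-way 2-shot episode: class k activates only channel k at every position.
support = [([[1.0 if c == k else 0.0 for c in range(4)] for _ in range(4)], k)
           for k in range(3) for _ in range(2)]
query = [[0.5, 1.0, 0.0, 0.0]] + [[0.0, 1.0, 0.0, 0.0]] * 3
predicted_class, class_scores = classify(query, class_prototypes(support, 3))
```

In this toy episode the attention map of each class weights the query positions by how closely they match that class's prototype, so the class score is driven by the most class-relevant local descriptors rather than by a uniform average.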
Acknowledgements
This work was supported in part by the Fundamental Research Funds for the Central Universities under Grant 2020JBZD010 and in part by the China railway R&D Program under Grant K2020G024.
About this article
Cite this article
Cao, S., Wang, W., Zhang, J. et al. A few-shot fine-grained image classification method leveraging global and local structures. Int. J. Mach. Learn. & Cyber. 13, 2273–2281 (2022). https://doi.org/10.1007/s13042-022-01522-w
DOI: https://doi.org/10.1007/s13042-022-01522-w