Abstract
The goal of few-shot learning is to train a machine learning model with a small number of labeled samples and then classify unlabeled samples. Recent work, especially metric-learning methods based on local feature representations of images, has achieved superior performance by exploiting local invariant features and their rich discriminative information. However, the local features learned by existing methods are not aligned when their similarities are computed, which leads to larger intra-class divergence and smaller inter-class divergence. In fact, the dominant object (local feature) of one image should be compared only with the semantically relevant local features of the other image. To address these issues, this paper proposes a few-shot learning approach (SANet) based on semantic alignment of local features. Specifically, we first obtain the local features of the query and support images with a feature extraction module and then compute the relation matrices of these local features. Based on these relation matrices, we design an intra-class divergence rectification (intraDR) module and an inter-class divergence rectification (interDR) module to align the local features and reduce the influence of noisy local features. Experimental results on multiple datasets show that, by aligning local features, the proposed model effectively minimizes intra-class divergence while maximizing inter-class divergence, thereby achieving better classification performance. The code for this paper is available at https://github.com/SongQCode/SANet.
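The abstract describes the pipeline only at a high level, so the snippet below is a minimal sketch of the relation-matrix step, assuming a convolutional backbone whose spatial positions act as local descriptors and a simple top-k cosine-similarity aggregation (in the spirit of local-descriptor metrics such as DN4). The function names and the top-k aggregation are illustrative assumptions, not the paper's intraDR/interDR implementation.

```python
import torch
import torch.nn.functional as F

def local_relation_matrix(query_feat, support_feat):
    """Cosine-similarity relation matrix between the local descriptors
    of a query image and a support image.

    query_feat, support_feat: tensors of shape (C, H, W) from a
    convolutional feature extractor; each of the H*W spatial positions
    is treated as one C-dimensional local descriptor.
    """
    C = query_feat.shape[0]
    q = F.normalize(query_feat.reshape(C, -1).t(), dim=1)    # (Nq, C)
    s = F.normalize(support_feat.reshape(C, -1).t(), dim=1)  # (Ns, C)
    return q @ s.t()                                         # (Nq, Ns)

def aligned_similarity(relation, k=3):
    """Score a query-support pair using only semantically aligned pairs:
    for each query descriptor, keep its k most similar support descriptors
    and average them, so unrelated (noisy) local features contribute little.
    """
    topk_vals, _ = relation.topk(k, dim=1)                   # (Nq, k)
    return topk_vals.mean()

# Example: Conv-4-style features of shape (64, 21, 21); the query is
# scored against each class and assigned to the highest-scoring one.
query = torch.randn(64, 21, 21)
support = torch.randn(64, 21, 21)
score = aligned_similarity(local_relation_matrix(query, support))
```

In an N-way episode this score would be computed against each class's support features (or prototype) and the query assigned to the class with the highest aligned similarity; the paper's intraDR and interDR modules additionally rectify the intra- and inter-class divergences on top of such relation matrices.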
Database Availability
The data used in this study come from publicly available datasets; download links are provided at https://github.com/SongQCode/SANet.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under grant numbers 62006126 and 61872190, the Natural Science Foundation of Jiangsu Province under grant number BK20200740, the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under grant number 20KJB520004, and the Natural Science Research Start-up Foundation of Recruiting Talents of Nanjing University of Posts and Telecommunications under grant number NY219150.
Ethics declarations
Conflicts of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Li, P., Song, Q., Chen, L. et al. Local feature semantic alignment network for few-shot image classification. Multimed Tools Appl 83, 69489–69509 (2024). https://doi.org/10.1007/s11042-024-18212-0