Abstract
Generating features from the most relevant image regions has shown great potential in solving the challenging few-shot image classification problem. Most existing methods aggregate image regions weighted by attention maps to obtain category-specific features. Instead of using attention maps to indicate the relevance of image regions, we directly model the interdependencies between prototype features and image regions, resulting in a novel Semantic-Aware Feature Aggregation (SAFA) framework that places greater weight on category-relevant image regions. Specifically, we first design a “reduce and expand” block to extract category-relevant prototype features for each image. Then, we introduce an additive attention mechanism to highlight category-relevant image regions while suppressing the others. Finally, the weighted image regions are aggregated and used for classification. Extensive experiments show that SAFA places greater weight on category-relevant image regions and achieves state-of-the-art performance.
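To make the pipeline in the abstract concrete, the following is a minimal PyTorch-style sketch of the three stages (prototype extraction via a "reduce and expand" block, additive attention over image regions, and weighted aggregation). The module names, dimensions, and the exact form of each block are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical sketch of the SAFA pipeline described in the abstract.
# All layer choices and dimensions are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ReduceExpandBlock(nn.Module):
    """Assumed 'reduce and expand' block: pool region features to a compact
    vector, then expand back to produce a category-relevant prototype."""

    def __init__(self, dim, reduction=4):
        super().__init__()
        self.reduce = nn.Linear(dim, dim // reduction)
        self.expand = nn.Linear(dim // reduction, dim)

    def forward(self, regions):            # regions: (B, R, D)
        pooled = regions.mean(dim=1)       # average over the R spatial regions
        return self.expand(F.relu(self.reduce(pooled)))   # (B, D) prototype


class AdditiveAttention(nn.Module):
    """Assumed additive attention between the prototype and each region;
    highlights category-relevant regions and suppresses the others."""

    def __init__(self, dim, hidden=128):
        super().__init__()
        self.w_region = nn.Linear(dim, hidden)
        self.w_proto = nn.Linear(dim, hidden)
        self.score = nn.Linear(hidden, 1)

    def forward(self, regions, proto):     # (B, R, D), (B, D)
        h = torch.tanh(self.w_region(regions) + self.w_proto(proto).unsqueeze(1))
        weights = torch.softmax(self.score(h), dim=1)    # (B, R, 1)
        return (weights * regions).sum(dim=1), weights   # aggregated feature


class SAFASketch(nn.Module):
    """End-to-end sketch: prototype -> region weights -> aggregation -> classifier."""

    def __init__(self, dim, num_classes):
        super().__init__()
        self.reduce_expand = ReduceExpandBlock(dim)
        self.attention = AdditiveAttention(dim)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, regions):            # CNN feature map flattened to (B, R, D)
        proto = self.reduce_expand(regions)
        feat, weights = self.attention(regions, proto)
        return self.classifier(feat), weights


if __name__ == "__main__":
    # Toy usage: 4 images, 5x5 = 25 spatial regions, 64-d features, 5-way task.
    x = torch.randn(4, 25, 64)
    model = SAFASketch(dim=64, num_classes=5)
    logits, attn = model(x)
    print(logits.shape, attn.shape)        # torch.Size([4, 5]) torch.Size([4, 25, 1])
```

In this reading, the attention weights are driven by the image's own prototype rather than by a generic attention map, which is what lets the aggregation emphasize category-relevant regions.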




Funding
This work was supported in part by National Natural Science Foundation of China (U21A20487, 62206268), in part by Shenzhen Technology Project (JCYJ20220818101206014), in part by CAS Key Technology Talent Program, in part by Shenzhen Engineering Laboratory for 3D Content Generating Technologies (NO. [2017]476), and in part by SIAT Innovation Program for Excellent Young Researchers (E1G032).
Author information
Authors and Affiliations
Contributions
FH, FW and FH conducted experiments. FH, FH and QZ wrote the main manuscript text. CS and JC prepared figures. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hao, F., Wu, F., He, F. et al. Semantic-Aware Feature Aggregation for Few-Shot Image Classification. Neural Process Lett 55, 6595–6609 (2023). https://doi.org/10.1007/s11063-023-11150-2