Abstract
To improve the representation ability of feature extractors in few-shot classification, we propose a momentum memory contrastive few-shot learning method based on distance metrics and transfer learning. The method uses an external memory bank and a contrastive loss function to constrain the feature representations of the training samples. The memory bank is maintained by a dynamic momentum update driven by the current samples. In addition, a feature representation augmentation technique improves how well the feature representation centroids generalize to the test samples. Furthermore, we design a spatial pyramid fusion downscaling module to improve the extraction of multi-scale features. Experimental results show that our method outperforms the compared methods and achieves state-of-the-art accuracy in 5-way 1-shot and 5-way 5-shot tasks on datasets including miniImageNet, CUB-200, and CIFAR-FS. An extensive study with discussion verifies the effectiveness of each proposed component of our method.
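The abstract does not give the update rule or the loss, so the following is only a minimal NumPy sketch of the general idea of a momentum-updated memory bank combined with a contrastive loss. It assumes one bank slot per class, an exponential moving average update, and a softmax cross-entropy over cosine similarities; the function names and the `momentum` and `temperature` values are hypothetical and are not taken from the paper.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def momentum_update(bank, feats, labels, momentum=0.9):
    """Blend each class slot with the features of the current samples.

    Assumption: one slot per class, updated as an exponential moving average.
    """
    bank = bank.copy()
    for f, y in zip(feats, labels):
        bank[y] = l2_normalize(momentum * bank[y] + (1.0 - momentum) * f)
    return bank

def contrastive_loss(feats, labels, bank, temperature=0.1):
    """Cross-entropy over cosine similarities between samples and bank slots.

    The slot of the sample's own class is the positive; all others are negatives.
    """
    logits = l2_normalize(feats) @ bank.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(labels)), labels].mean())

# Toy data standing in for extracted features (hypothetical sizes).
rng = np.random.default_rng(0)
num_classes, dim = 5, 16
bank = l2_normalize(rng.normal(size=(num_classes, dim)))
feats = rng.normal(size=(8, dim))
labels = rng.integers(0, num_classes, size=8)

loss = contrastive_loss(feats, labels, bank)
new_bank = momentum_update(bank, l2_normalize(feats), labels)
```

In this sketch a larger `momentum` makes the bank change more slowly, which keeps the stored representations stable across training batches.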
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
This work was supported by the National Natural Science Foundation of China under Grant 52072026 and Grant 62076022.
About this article
Cite this article
Tian, R., Shi, H. Momentum memory contrastive learning for transfer-based few-shot classification. Appl Intell 53, 864–878 (2023). https://doi.org/10.1007/s10489-022-03506-3
DOI: https://doi.org/10.1007/s10489-022-03506-3