Enhancing few-shot learning using targeted mixup

  • Published in: Applied Intelligence

Abstract

Despite the attention that long-tailed classification has received in recent years, performance on the tail classes still lags behind that on the remaining classes. We address this problem with a novel data augmentation technique called Targeted Mixup, which mixes class samples based on the model's performance on each class. Instances of two classes that are difficult to distinguish are randomly chosen and linearly interpolated to produce a new sample, directing the model's attention to those two classes. The expectation is that the model learns the distinguishing features and thereby improves classification of instances belonging to the respective classes. To demonstrate the effectiveness of our proposed method empirically, we performed experiments on the CIFAR-100-LT, Places-LT, and Speech Commands-LT datasets. The results show an improvement on the few-shot classes without sacrificing much of the model's performance on the many-shot and medium-shot classes; overall accuracy increases as well.
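The abstract describes the core idea but not the paper's exact algorithm, which this excerpt omits. As a minimal sketch of the mechanism it outlines, the fragment below selects the most-confused class pair from a confusion matrix and linearly interpolates a sample pair with a Beta-distributed coefficient, as in standard mixup; the function names, the Beta(α, α) choice, and the confusion-matrix selection rule are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np

def pick_confused_pair(confusion):
    """Return the (true, predicted) class pair with the largest
    off-diagonal entry, i.e. the pair the model confuses most.
    NOTE: an assumed selection rule, not the paper's exact one."""
    c = confusion.astype(float).copy()
    np.fill_diagonal(c, 0.0)               # ignore correct predictions
    return np.unravel_index(np.argmax(c), c.shape)

def targeted_mixup(x_a, y_a, x_b, y_b, num_classes, alpha=1.0, rng=None):
    """Mix one sample from each of two hard-to-distinguish classes.
    Returns a convex combination of the inputs and a matching soft label."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)           # mixing coefficient in (0, 1)
    x_mix = lam * x_a + (1.0 - lam) * x_b  # interpolate the inputs
    y_mix = np.zeros(num_classes)
    y_mix[y_a] += lam                      # soft label mirrors the mix
    y_mix[y_b] += 1.0 - lam
    return x_mix, y_mix
```

In a training loop, one would periodically recompute the confusion matrix on held-out data, call `pick_confused_pair`, and draw mixing pairs from those two classes rather than uniformly at random.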



Data Availability

Publicly available datasets were analyzed in this study. These data can be found here (accessed on 9 October 2024):

https://www.cs.toronto.edu/~kriz/cifar.html

https://liuziwei7.github.io/projects/LongTail.html

Speech Commands-LT, the long-tailed version of the Speech Commands dataset that we created, is available at the following repository (accessed on 1 April 2024):

https://github.com/yd00/Speech-Commands-LT
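This excerpt does not state how the long-tailed split of Speech Commands was constructed. A common "-LT" recipe (used, for instance, to build CIFAR-100-LT) subsamples each class so that per-class counts decay exponentially from the head to the tail; the sketch below follows that convention under the assumption that a similar scheme applies here, with illustrative parameter names.

```python
import numpy as np

def long_tail_counts(n_classes, n_max, imbalance_ratio):
    """Per-class sample counts decaying exponentially from n_max
    (head class) down to n_max / imbalance_ratio (tail class)."""
    # decay factor so that count[n_classes-1] = n_max / imbalance_ratio
    mu = (1.0 / imbalance_ratio) ** (1.0 / (n_classes - 1))
    return [int(n_max * mu ** i) for i in range(n_classes)]
```

For example, `long_tail_counts(10, 1000, 100)` keeps 1000 samples of the head class and 10 of the tail class, with the intermediate classes shrinking geometrically.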


Acknowledgements

The authors express gratitude to the Dongseo University Machine Learning/Deep Learning Research Lab members and the anonymous reviewers for their valuable insights and feedback on earlier versions of this paper.

Funding

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2022R1A2C2012243).

Author information


Contributions

The authors confirm their contributions to the paper as follows:

  • Conceptualization: Y. Darkwah Jnr., D.-K. Kang
  • Methodology: Y. Darkwah Jnr.
  • Software: Y. Darkwah Jnr.
  • Validation: D.-K. Kang
  • Formal analysis: Y. Darkwah Jnr.
  • Investigation: Y. Darkwah Jnr.
  • Resources: D.-K. Kang
  • Data curation: Y. Darkwah Jnr.
  • Writing – original draft preparation: Y. Darkwah Jnr.
  • Writing – review and editing: D.-K. Kang
  • Visualization: Y. Darkwah Jnr.
  • Supervision: D.-K. Kang
  • Project administration: D.-K. Kang
  • Funding acquisition: D.-K. Kang

All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Dae-Ki Kang.

Ethics declarations

Competing Interests

The author(s) declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Darkwah Jnr., Y., Kang, DK. Enhancing few-shot learning using targeted mixup. Appl Intell 55, 279 (2025). https://doi.org/10.1007/s10489-024-06157-8
