Zero-shot Automated Class Imbalanced Learning

  • Conference paper
Pattern Recognition (ICPR 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15324)

Abstract

Given a new class imbalanced dataset D and limited computational resources, the challenge is to select promising class imbalanced learning (CIL) pipelines, each of which combines a resampling method, a classification model, and their corresponding hyperparameters. To address this challenge, we study Zero-shot Automated Machine Learning and propose a new approach for class imbalanced data, called Zero-shot Automated Class Imbalanced Learning (ZAutoCIL). ZAutoCIL employs domain-independent meta-learning to develop a zero-shot surrogate model for automated class imbalanced learning. This model aims to recommend effective CIL pipelines for new, unseen imbalanced datasets without requiring additional search. Specifically, we meta-train a two-tower model, adapted from recommender systems, to serve as the surrogate model. Training uses a pairwise ranking loss on a meta-dataset built by collecting performance data across a wide range of CIL pipelines and a comprehensive repository of class imbalanced datasets. We perform extensive experiments on 100 datasets grouped into four parts by imbalance ratio. The experimental results demonstrate the efficacy of our approach in automating the recommendation of CIL pipelines for any target imbalanced dataset.
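
The abstract describes a two-tower surrogate trained with a pairwise ranking loss. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes each tower is a single linear layer, that the score is a dot product of the two embeddings, and that the pairwise loss is the logistic form. The names linear_tower, score, pairwise_logistic_loss, and rank_pipelines, as well as the toy feature vectors, are hypothetical and chosen only for illustration.

```python
import math


def linear_tower(weights, features):
    """Embed a feature vector with one linear layer (assumed tower form)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def score(dataset_emb, pipeline_emb):
    """Two-tower score: dot product of dataset and pipeline embeddings."""
    return sum(d * p for d, p in zip(dataset_emb, pipeline_emb))


def pairwise_logistic_loss(s_better, s_worse):
    """Small when the better pipeline is scored above the worse one."""
    return math.log1p(math.exp(-(s_better - s_worse)))


def rank_pipelines(w_data, w_pipe, meta_features, pipelines):
    """Zero-shot use: one forward pass per pipeline, no search on the dataset."""
    d = linear_tower(w_data, meta_features)
    scored = [(score(d, linear_tower(w_pipe, p)), i)
              for i, p in enumerate(pipelines)]
    return [i for _, i in sorted(scored, reverse=True)]


# Toy example with identity towers (hypothetical values).
identity = [[1.0, 0.0], [0.0, 1.0]]
new_dataset_meta_features = [1.0, 0.0]
candidate_pipelines = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]]
ranking = rank_pipelines(identity, identity,
                         new_dataset_meta_features, candidate_pipelines)
print(ranking)  # -> [1, 2, 0]
```

During meta-training, a loss of this kind would be summed over pairs of pipelines evaluated on the same meta-dataset entry, so that the surrogate learns the relative order of pipelines rather than their absolute performance. At recommendation time only the ranking step is needed, which is what makes the procedure zero-shot.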

Acknowledgements

This work is supported by the EPSRC Early Career Researchers International Collaboration Grants [EP/Y002539/1].

Author information

Corresponding author

Correspondence to Shuo Wang.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, Z., Wang, S. (2025). Zero-shot Automated Class Imbalanced Learning. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15324. Springer, Cham. https://doi.org/10.1007/978-3-031-78383-8_10

  • DOI: https://doi.org/10.1007/978-3-031-78383-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78382-1

  • Online ISBN: 978-3-031-78383-8

  • eBook Packages: Computer Science, Computer Science (R0)
