Zero-shot Automated Class Imbalanced Learning

Wang, Zhaoyang; Wang, Shuo

doi:10.1007/978-3-031-78383-8_10

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15324)

Included in the following conference series:

International Conference on Pattern Recognition

Abstract

Given a new class imbalanced dataset D and limited computational resources, the challenge is to select promising class imbalanced learning (CIL) pipelines, each comprising a resampling method, a classification model, and their corresponding hyperparameters. To address this challenge, we study Zero-shot Automated Machine Learning and propose a new approach targeting class imbalanced data, called Zero-shot Automated Class Imbalance Learning (ZAutoCIL). ZAutoCIL employs domain-independent meta-learning to develop a zero-shot surrogate model for automated class imbalanced learning. This model aims to recommend effective CIL pipelines for new, unseen imbalanced datasets without requiring additional search. Specifically, we meta-train a two-tower model, adapted from recommender systems, to serve as the surrogate model, using a pairwise ranking loss on a meta-dataset built by collecting performance data across a wide range of CIL pipelines and a comprehensive repository of class imbalanced datasets. We perform extensive experiments on 100 datasets grouped into 4 parts based on their imbalance ratio. The experimental results demonstrate the efficacy of our approach in automating the recommendation of CIL pipelines for a given target imbalanced dataset.
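
The two-tower scoring and pairwise ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the tower architecture, the meta-features, or the exact form of the loss, so everything below is an assumption made for illustration: single linear layers as towers, a dot-product score, a logistic pairwise loss, and hypothetical pipeline names and feature vectors. It is not the authors' implementation.

```python
import math

def linear_tower(weights, features):
    """Project a feature vector into the shared embedding space.
    Assumption: one linear layer per tower; the paper's towers may differ."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def score(dataset_emb, pipeline_emb):
    """Two-tower score: dot product of dataset and pipeline embeddings."""
    return sum(d * p for d, p in zip(dataset_emb, pipeline_emb))

def pairwise_ranking_loss(score_better, score_worse):
    """Logistic pairwise loss (an assumed choice): small when the pipeline
    that performed better on the meta-dataset also receives the higher score."""
    return math.log1p(math.exp(-(score_better - score_worse)))

def recommend(dataset_features, pipelines, w_dataset, w_pipeline):
    """Zero-shot recommendation: score every candidate pipeline for the new
    dataset and return the best one, without any further search."""
    d_emb = linear_tower(w_dataset, dataset_features)
    scores = {
        name: score(d_emb, linear_tower(w_pipeline, feats))
        for name, feats in pipelines.items()
    }
    return max(scores, key=scores.get)

# Hypothetical, hand-picked values; a real system would learn the weights
# by minimising the pairwise loss over the meta-dataset.
identity = [[1.0, 0.0], [0.0, 1.0]]
new_dataset = [1.0, 0.0]                      # assumed meta-features
candidates = {
    "smote+random_forest": [0.9, 0.1],        # assumed pipeline encoding
    "no_resampling+svm": [0.2, 0.8],
}

best = recommend(new_dataset, candidates, identity, identity)
loss_correct_order = pairwise_ranking_loss(0.9, 0.2)
loss_wrong_order = pairwise_ranking_loss(0.2, 0.9)
print(best, loss_correct_order, loss_wrong_order)
```

With these toy values the loss is lower when the better pipeline is scored higher, which is the property a pairwise ranking objective rewards during meta-training.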

Acknowledgements

This work is supported by the EPSRC Early Career Researchers International Collaboration Grants [EP/Y002539/1].

Author information

Authors and Affiliations

School of Computer Science, University of Birmingham, Birmingham, B15 2TT, UK
Zhaoyang Wang & Shuo Wang

Corresponding author

Correspondence to Shuo Wang.

Editor information

Editors and Affiliations

University of Salford, Salford, Lancashire, UK
Apostolos Antonacopoulos
Indian Institute of Technology Bombay, Mumbai, Maharashtra, India
Subhasis Chaudhuri
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa
Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu
IIT Kharagpur, Kharagpur, West Bengal, India
Saumik Bhattacharya
Indian Statistical Institute Kolkata, Kolkata, West Bengal, India
Umapada Pal

About this paper

Cite this paper

Wang, Z., Wang, S. (2025). Zero-shot Automated Class Imbalanced Learning. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15324. Springer, Cham. https://doi.org/10.1007/978-3-031-78383-8_10

DOI: https://doi.org/10.1007/978-3-031-78383-8_10
Published: 02 December 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78382-1
Online ISBN: 978-3-031-78383-8
eBook Packages: Computer Science (R0)
