ABSTRACT
Class imbalance is a commonly encountered problem in data analysis and machine learning. When the training dataset is imbalanced, machine learning models become biased toward the majority classes and learn inaccurate classifiers. A common remedy is to increase the volume of minority-class data, often by applying the synthetic minority oversampling technique (SMOTE). The use of generative adversarial networks (GANs) for data oversampling has also recently become more common. This research used a genetic algorithm to search for and optimize combinations of oversampling ratios based on the SMOTE and GAN techniques. The proposed method was compared with baselines in which a classifier was trained either on the imbalanced data or on data oversampled with a single technique. The results showed that the classifier trained on data oversampled with the ratios optimized by the proposed method achieved superior classification performance.
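The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: a genetic algorithm searching over a pair of oversampling ratios. The chromosome encoding `(smote_ratio, gan_ratio)`, the function name `optimize_ratios`, the operators (tournament selection, uniform crossover, Gaussian mutation, elitism), and all parameter values are assumptions made for illustration, not the authors' method. The `toy_fitness` function is a stand-in; in practice the fitness would presumably oversample the training data with the given ratios, train a classifier, and return a validation score.

```python
import random
from typing import Callable, Tuple

# Hypothetical encoding: one gene per oversampling technique.
Ratios = Tuple[float, float]  # (smote_ratio, gan_ratio)


def optimize_ratios(
    fitness: Callable[[Ratios], float],
    pop_size: int = 20,
    generations: int = 30,
    max_ratio: float = 1.0,
    mutation_sigma: float = 0.1,
    seed: int = 0,
) -> Ratios:
    """Return the best (smote_ratio, gan_ratio) pair found by a simple GA."""
    rng = random.Random(seed)

    def clip(value: float) -> float:
        # Keep every gene inside the allowed range [0, max_ratio].
        return min(max(value, 0.0), max_ratio)

    def tournament(population):
        # Binary tournament: pick two individuals, keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    population = [
        (rng.uniform(0.0, max_ratio), rng.uniform(0.0, max_ratio))
        for _ in range(pop_size)
    ]
    best = max(population, key=fitness)

    for _ in range(generations):
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            # Uniform crossover: each gene comes from one parent at random.
            child = tuple(rng.choice(genes) for genes in zip(p1, p2))
            # Gaussian mutation on every gene, clipped to the valid range.
            child = tuple(clip(g + rng.gauss(0.0, mutation_sigma)) for g in child)
            children.append(child)
        population = children
        best = max(population, key=fitness)

    return best


def toy_fitness(ratios: Ratios) -> float:
    # Placeholder objective with an arbitrary optimum at (0.6, 0.3).
    # A real fitness would score a classifier trained on oversampled data.
    smote_ratio, gan_ratio = ratios
    return -((smote_ratio - 0.6) ** 2 + (gan_ratio - 0.3) ** 2)


if __name__ == "__main__":
    print(optimize_ratios(toy_fitness))
```

Because each fitness evaluation in a real setting involves training a classifier, caching fitness values per individual would be a sensible refinement; the sketch recomputes them for brevity.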
Index Terms
- A genetic algorithm to optimize SMOTE and GAN ratios in class imbalanced datasets