A genetic algorithm to optimize SMOTE and GAN ratios in class imbalanced datasets

Authors:
Hwi-Yeon Cho, Kwangwoon University, Seoul, Republic of Korea
Yong-Hyuk Kim, Kwangwoon University, Seoul, Republic of Korea

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, July 2020, Pages 33–34
https://doi.org/10.1145/3377929.3398153

Published: 08 July 2020

ABSTRACT

Class imbalance is a problem commonly encountered in data analysis and machine learning. When the training dataset is imbalanced, machine learning models become biased and learn inaccurate classifiers. A common remedy is to increase the volume of minority-class data by applying the synthetic minority oversampling technique (SMOTE). The use of generative adversarial networks (GANs) for data oversampling has also recently become more common. This research used a genetic algorithm to search for and optimize combinations of oversampling ratios based on the SMOTE and GAN techniques. The proposed method was compared with training on the imbalanced data as-is and with training on data oversampled by a single technique. The results showed that the classifier trained on data oversampled at the ratios optimized by the proposed method achieved superior classification performance.
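
The abstract does not give the chromosome encoding, the genetic operators, or the fitness function, so the following is only a minimal sketch of the general idea: a genetic algorithm searching over a pair of oversampling ratios. Everything in it is an assumption made for illustration. That includes the reading of each ratio as a fraction of the majority/minority gap, the helper names `synthetic_counts`, `toy_fitness`, and `run_ga`, the choice of tournament selection, arithmetic crossover, and Gaussian mutation, and above all the toy fitness function, which stands in for training a classifier on the oversampled data and scoring it on validation data.

```python
import random


def synthetic_counts(n_majority, n_minority, smote_ratio, gan_ratio):
    """Map two ratios to sample counts (assumed encoding, not from the paper).

    Each ratio is read as the fraction of the majority/minority gap that the
    corresponding technique should fill with synthetic samples.
    """
    gap = n_majority - n_minority
    return round(smote_ratio * gap), round(gan_ratio * gap)


def toy_fitness(smote_ratio, gan_ratio):
    """Hypothetical stand-in for classifier validation performance.

    A real fitness function would oversample with both techniques, train a
    classifier, and return a validation score. This toy peaks at (0.6, 0.3).
    """
    return 1.0 - (smote_ratio - 0.6) ** 2 - (gan_ratio - 0.3) ** 2


def run_ga(fitness, pop_size=30, generations=40, sigma=0.1, seed=0):
    """Search [0, 1]^2 for the ratio pair that maximizes `fitness`."""
    rng = random.Random(seed)
    pop = [(rng.random(), rng.random()) for _ in range(pop_size)]

    def tournament():
        # Binary tournament: the fitter of two random individuals wins.
        return max(rng.sample(pop, 2), key=lambda ind: fitness(*ind))

    for _ in range(generations):
        # Elitism: carry the two best individuals over unchanged.
        next_pop = sorted(pop, key=lambda ind: fitness(*ind), reverse=True)[:2]
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            w = rng.random()
            child = []
            for g1, g2 in zip(p1, p2):
                gene = w * g1 + (1 - w) * g2            # arithmetic crossover
                gene += rng.gauss(0.0, sigma)           # Gaussian mutation
                child.append(min(1.0, max(0.0, gene)))  # keep within [0, 1]
            next_pop.append(tuple(child))
        pop = next_pop
    return max(pop, key=lambda ind: fitness(*ind))


best = run_ga(toy_fitness)
print("best ratios:", best)
print("synthetic samples (SMOTE, GAN):", synthetic_counts(1000, 100, *best))
```

In a realistic setting the fitness evaluation dominates the cost, since each candidate ratio pair requires oversampling and retraining a classifier, so population size and generation count would likely need to be chosen with that budget in mind.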

Supplemental Material

p33_cho_suppl.zip (286.4 KB)

Index Terms

Computing methodologies
  Machine learning
    Machine learning approaches
      Bio-inspired approaches
        Genetic algorithms

Published in
GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion
July 2020
1982 pages
ISBN: 9781450371278
DOI: 10.1145/3377929
General Chair: Carlos Artemio Coello Coello, CINVESTAV-IPN
Copyright © 2020 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
genetic algorithm
machine learning
Qualifiers
- abstract
Acceptance Rates
Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%