DOI: 10.1145/3377929.3398153

A genetic algorithm to optimize SMOTE and GAN ratios in class imbalanced datasets

Published: 8 July 2020

ABSTRACT

Class imbalance is a common problem in data analysis and machine learning. When the training dataset is imbalanced, machine learning models become biased and learn inaccurate classifiers. A common remedy is to increase the volume of minority-class data by applying the synthetic minority oversampling technique (SMOTE). More recently, generative adversarial networks (GANs) have also become common for data oversampling. This research used a genetic algorithm to search for and optimize combinations of oversampling ratios based on the SMOTE and GAN techniques. The proposed method was compared with training on the imbalanced data and with training on data oversampled using a single technique. The results showed that the classifier trained on data oversampled at the ratios optimized by the proposed method achieved superior classification performance.
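
The search described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the chromosome encoding, the genetic operators, or the fitness measure, so everything below is an assumption for illustration only: each gene is an oversampling ratio (for example, one SMOTE ratio and one GAN ratio), selection is by binary tournament, crossover is uniform, and mutation is Gaussian. The names `evolve_ratios` and `toy_fitness` are hypothetical. In a real experiment the fitness function would oversample the minority class with SMOTE and a GAN at the given ratios, train a classifier, and return a validation score; here a toy objective stands in for that step.

```python
import random
from typing import Callable, List, Sequence, Tuple

Chromosome = List[float]  # assumed encoding: one oversampling ratio per gene


def evolve_ratios(
    fitness: Callable[[Sequence[float]], float],
    n_genes: int,
    max_ratio: float = 5.0,
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> Tuple[Chromosome, float]:
    """Search ratios in [0, max_ratio] that maximize `fitness`."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(0.0, max_ratio) for _ in range(n_genes)]
        for _ in range(pop_size)
    ]

    def tournament(scored):
        # Binary tournament: keep the fitter of two random individuals.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    best, best_score = None, float("-inf")
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        top_score, top = max(scored, key=lambda s: s[0])
        if top_score > best_score:
            best, best_score = list(top), top_score

        children = [list(best)]  # elitism: carry the best-so-far forward
        while len(children) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            # Uniform crossover: each gene comes from either parent.
            child = [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2)]
            # Gaussian mutation, clipped to the allowed ratio range.
            child = [
                min(max_ratio, max(0.0, g + rng.gauss(0.0, 0.5)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        population = children
    return best, best_score


def toy_fitness(genes: Sequence[float]) -> float:
    # Placeholder objective (NOT a classifier score): peaks at an assumed
    # SMOTE ratio of 2.0 and GAN ratio of 1.0.
    target = [2.0, 1.0]
    return -sum((g - t) ** 2 for g, t in zip(genes, target))


best, score = evolve_ratios(toy_fitness, n_genes=2)
print("best ratios:", best, "fitness:", score)
```

Passing the fitness function in as a parameter keeps the search loop independent of any particular oversampling library or classifier, so the toy objective can be swapped for a real train-and-validate routine without changing the genetic algorithm itself.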

    • Published in

      GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion
      July 2020
      1982 pages
      ISBN: 9781450371278
      DOI: 10.1145/3377929

      Copyright © 2020 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • abstract

      Acceptance Rates

      Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
