DOI: 10.1145/2001576.2001756
research-article

Evolving ensembles in multi-objective genetic programming for classification with unbalanced data

Published: 12 July 2011

Abstract

Machine learning algorithms can suffer a performance bias when data sets are unbalanced. This paper proposes a multi-objective genetic programming approach that uses negative correlation learning to evolve accurate and diverse ensembles of non-dominated solutions, where members vote on class membership. We also compare two popular Pareto-based fitness schemes on the classification tasks. Using six unbalanced binary data sets, we show that the evolved ensembles achieve high accuracy on both classes, and that this performance is usually better than that of many of their individual members.
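
As a rough illustration of the idea of an ensemble built from non-dominated solutions that vote on class membership, the sketch below may help. It is not the paper's implementation. It assumes, for illustration only, that each classifier is scored on two objectives (minority-class and majority-class accuracy), that the ensemble is the whole Pareto front, and that ties in the vote go to class 0. The negative correlation learning term is omitted, and all names and numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumed objective pair: (minority-class accuracy, majority-class accuracy).
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(points: Sequence[Objectives]) -> List[int]:
    """Indices of points that no other point dominates (the Pareto front)."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def majority_vote(votes: Sequence[int]) -> int:
    """Return class 1 if more than half of the votes are 1, otherwise class 0."""
    return 1 if 2 * sum(votes) > len(votes) else 0


# Hypothetical classifiers: per-class accuracies and 0/1 predictions on four instances.
objectives = [(0.90, 0.60), (0.70, 0.85), (0.60, 0.80), (0.50, 0.95)]
predictions = [
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]

# Keep only the non-dominated classifiers, then vote instance by instance.
front = non_dominated(objectives)
ensemble = [predictions[i] for i in front]
ensemble_output = [majority_vote(column) for column in zip(*ensemble)]
```

Here the third classifier is dominated by the second, so it is left out of the ensemble. The remaining three trade off accuracy on the two classes differently, which is the kind of diversity a voting ensemble can exploit.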

    Published In

    GECCO '11: Proceedings of the 13th annual conference on Genetic and evolutionary computation
    July 2011
    2140 pages
    ISBN:9781450305570
    DOI:10.1145/2001576
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. class imbalance
    2. classification
    3. evolutionary multi-objective optimisation
    4. genetic programming

    Qualifiers

    • Research-article

    Conference

    GECCO '11

    Acceptance Rates

    Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 2
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 03 Mar 2025

    Citations

    Cited By

    • (2024) A Survey on Unbalanced Classification: How Can Evolutionary Computation Help? IEEE Transactions on Evolutionary Computation, 28:2 (353-373). DOI: 10.1109/TEVC.2023.3257230. Online publication date: Apr-2024
    • (2024) Enhancing continuous integration predictions: a hybrid LSTM-GRU deep learning framework with evolved DBSO algorithm. Computing, 107:1. DOI: 10.1007/s00607-024-01370-2. Online publication date: 26-Nov-2024
    • (2023) A survey of evolutionary algorithms for supervised ensemble learning. The Knowledge Engineering Review, 38. DOI: 10.1017/S0269888923000024. Online publication date: 1-Mar-2023
    • (2022) High-Dimensional Unbalanced Binary Classification by Genetic Programming with Multi-Criterion Fitness Evaluation and Selection. Evolutionary Computation, 30:1 (99-129). DOI: 10.1162/evco_a_00304. Online publication date: 1-Mar-2022
    • (2022) Detecting Continuous Integration Skip Commits Using Multi-Objective Evolutionary Search. IEEE Transactions on Software Engineering, 48:12 (4873-4891). DOI: 10.1109/TSE.2021.3129165. Online publication date: 1-Dec-2022
    • (2021) BF-detector: an automated tool for CI build failure detection. Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (1530-1534). DOI: 10.1145/3468264.3473115. Online publication date: 20-Aug-2021
    • (2021) Genetic programming for borderline instance detection in high-dimensional unbalanced classification. Proceedings of the Genetic and Evolutionary Computation Conference (349-357). DOI: 10.1145/3449639.3459284. Online publication date: 26-Jun-2021
    • (2021) Developing Interval-Based Cost-Sensitive Classifiers by Genetic Programming for Binary High-Dimensional Unbalanced Classification [Research Frontier]. IEEE Computational Intelligence Magazine, 16:1 (84-98). DOI: 10.1109/MCI.2020.3039070. Online publication date: 1-Feb-2021
    • (2021) CIS Publication Spotlight [Publication Spotlight]. IEEE Computational Intelligence Magazine, 16:1 (18-20). DOI: 10.1109/MCI.2020.3039065. Online publication date: 1-Feb-2021
    • (2021) The Wonders of Knowledge Transfer [Editor's Remarks]. IEEE Computational Intelligence Magazine, 16:1 (2-2). DOI: 10.1109/MCI.2020.3039022. Online publication date: 1-Feb-2021