DOI: 10.1145/3205651.3208245
Research article

Optimizing clustering to promote data diversity when generating an ensemble classifier

Published: 06 July 2018

ABSTRACT

In this paper, we propose a method for generating an optimized ensemble classifier. In the proposed method, a diverse input space is created by clustering the training data incrementally within a cycle, where a cycle is one complete round of clustering, training, and error calculation. In each cycle, a random upper bound on the number of clusters is chosen and data clusters are generated. A set of heterogeneous classifiers is trained on all generated clusters to promote structural diversity. An ensemble classifier is formed in each cycle and the generalization error of that ensemble is calculated. This process is optimized to find the set of classifiers with the lowest generalization error, and the optimization terminates when the generalization error can no longer be reduced. The cycle with the lowest error is then selected, and all trained classifiers of that cycle are passed to the next stage. Any classifier whose accuracy is below the average accuracy of the pool is discarded; the remaining classifiers form the proposed ensemble classifier. The proposed ensemble classifier is tested on classification benchmark datasets from the UCI repository, and the results are compared with existing state-of-the-art ensemble methods, including Bagging and Boosting. It is demonstrated that the proposed ensemble classifier performs better than the existing ensemble methods.
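The abstract describes the method only at a high level, so the following is a minimal sketch, in Python with scikit-learn, of how such a cycle could look. Everything specific here is an assumption rather than the authors' exact procedure: the run_cycle and build_ensemble helpers are hypothetical names, k-means is used for clustering, and the base-learner pool (decision tree, naive Bayes, k-NN), the hold-out error estimate, the majority vote, and the patience-based stopping rule are all illustrative choices.

    # Illustrative sketch only; not the authors' implementation.
    import numpy as np
    from sklearn.base import clone
    from sklearn.cluster import KMeans
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)

    # Heterogeneous pool of base learners (an assumption; the abstract does
    # not list the exact classifier types).
    BASE_LEARNERS = [
        DecisionTreeClassifier(random_state=0),
        GaussianNB(),
        KNeighborsClassifier(n_neighbors=3),
    ]

    def run_cycle(X_tr, y_tr, X_val, y_val, k_max):
        """One cycle: cluster incrementally (k = 2 .. k_max), train every
        base learner on every usable cluster, then measure the validation
        error of the majority-vote ensemble over the whole pool."""
        pool = []
        for k in range(2, k_max + 1):
            labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_tr)
            for c in range(k):
                Xc, yc = X_tr[labels == c], y_tr[labels == c]
                # Skip clusters too small or too pure to train on.
                if len(yc) < 5 or len(np.unique(yc)) < 2:
                    continue
                pool.extend(clone(b).fit(Xc, yc) for b in BASE_LEARNERS)
        if not pool:                      # degenerate cycle: nothing trained
            return pool, 1.0
        preds = np.array([m.predict(X_val) for m in pool])  # (n_models, n_val)
        vote = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
        return pool, float(np.mean(vote != y_val))

    def build_ensemble(X, y, max_cycles=20, patience=5):
        X_tr, X_val, y_tr, y_val = train_test_split(
            X, y, test_size=0.3, random_state=0, stratify=y)
        best_err, best_pool, stall = np.inf, [], 0
        for _ in range(max_cycles):
            k_max = int(rng.integers(2, 8))   # random clustering upper bound
            pool, err = run_cycle(X_tr, y_tr, X_val, y_val, k_max)
            if err < best_err:
                best_err, best_pool, stall = err, pool, 0
            else:
                stall += 1
            if stall >= patience:             # error no longer being reduced
                break
        # Final pruning: drop classifiers whose individual validation
        # accuracy falls below the pool average.
        accs = np.array([m.score(X_val, y_val) for m in best_pool])
        return [m for m, a in zip(best_pool, accs) if a >= accs.mean()]

    if __name__ == "__main__":
        X, y = load_iris(return_X_y=True)
        ensemble = build_ensemble(X, y)
        print(f"pruned ensemble size: {len(ensemble)}")

The pruning step at the end mirrors the abstract directly: once the best cycle is found, any classifier scoring below the pool's average accuracy is discarded and the survivors form the final ensemble.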

Published in

GECCO '18: Proceedings of the Genetic and Evolutionary Computation Conference Companion
July 2018, 1968 pages
ISBN: 9781450357647
DOI: 10.1145/3205651
Copyright © 2018 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 1,669 of 4,410 submissions, 38%
