
Symbiotic coevolutionary genetic programming: a benchmarking study under large attribute spaces

Published in Genetic Programming and Evolvable Machines.

Abstract

Classification under large attribute spaces represents a dual learning problem in which attribute subspaces must be identified at the same time as the classifier design is established. Embedded methodologies, as opposed to filter or wrapper methodologies, address both tasks simultaneously. The motivation for this work stems from the observation that team-based approaches to Genetic Programming (GP) have the potential to design multiple classifiers per class—each with a potentially unique attribute subspace—without recourse to filter- or wrapper-style preprocessing steps. Specifically, competitive coevolution provides the basis for scaling the algorithm to data sets with large instance counts, whereas cooperative coevolution provides a framework for problem decomposition under a bid-based model for establishing program context. Symbiosis is used to separate the task of team/ensemble composition from the design of specific team members. Team composition is specified in terms of a combinatorial search performed by a Genetic Algorithm (GA), whereas the properties of individual team members, and therefore subspace identification, are established under an independent GP population. Teaming implies that the members of the resulting ensemble of classifiers should have explicitly non-overlapping behaviour. Performance evaluation is conducted over data sets taken from the UCI repository with 649–102,660 attributes and 2–10 classes. The resulting teams identify attribute spaces 1–4 orders of magnitude smaller than the original data set. Moreover, team members generally consist of fewer than 10 instructions; thus, small attribute subspaces are not being traded for opaque models.


Notes

  1. By ‘dimension reduction’ we recognize two generic forms for reducing the attribute space as seen by the classification stage: attribute selection or attribute transform. Attribute selection attempts to select a subset of the original attributes using some measure of inter-attribute correlation, e.g., F-measure, Gini index. Conversely, attribute transforms apply an operator to the original attribute space to transform it to a new coordinate frame such that various orthogonality properties are satisfied, e.g., PCA. Naturally, attribute selection maintains an explicit link to the original attribute space, potentially retaining more insight into the application domain as the classifier builds a model relative to the original domain-specific attributes.
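The distinction can be made concrete with a minimal sketch (not the paper's method; the toy data, the use of absolute Pearson correlation as the selection score, and the choice of k = 5 are assumptions for illustration): selection keeps original attribute indices, whereas a transform such as PCA replaces them with linear combinations.

```python
import numpy as np

# Illustrative sketch: contrast attribute *selection* (rank original
# attributes by a correlation score) with an attribute *transform*
# (project onto principal components).  Toy data only.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))            # 100 exemplars, 20 attributes
y = (X[:, 3] + 0.1 * rng.normal(size=100) > 0).astype(float)

# Attribute selection: score each attribute by |Pearson correlation| with
# the label and keep the top k.  The original attribute indices survive,
# so the link to the application domain is retained.
scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
top_k = np.argsort(scores)[::-1][:5]      # indices into the original space

# Attribute transform (PCA via SVD): the new axes are linear combinations
# of *all* original attributes, so the explicit link is lost.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = Xc @ Vt[:5].T                     # first 5 principal components

print(top_k)        # original attribute indices
print(X_pca.shape)  # (100, 5): transformed coordinates, indices lost
```

In this toy setup attribute 3 drives the label, so selection recovers it directly; no single PCA axis corresponds to it.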

  2. Combining a simple subsampling heuristic with estimation of an AUC style fitness function may provide additional reinforcement for this tendency [8].

  3. Section 6.2 provides a synopsis of previous/current research in cooperative coevolution in general.

  4. For example, \(R[x] \leftarrow R[y] \langle op \rangle IP(z)\) where \(IP(z): z \in \{0, \ldots, A - 1\}\) denotes an attribute space index associated with the data set.
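A minimal sketch of executing instructions of this register-level form may help; the tuple encoding, register count, operator set, and the choice of register 0 as output are illustrative assumptions, not the paper's exact representation.

```python
import operator

# Illustrative sketch of the instruction form in the note:
#   R[x] <- R[y] <op> IP(z),  z in {0, ..., A-1}
# Encoding, register count, and operator set are assumptions.

OPS = [operator.add, operator.sub, operator.mul]

def execute(program, attributes, num_registers=4):
    """Run a linear GP program against one exemplar's attribute vector.

    Each instruction is a tuple (x, y, op_idx, z): target register x,
    source register y, operator index, and attribute index z.
    """
    R = [0.0] * num_registers
    for x, y, op_idx, z in program:
        R[x] = OPS[op_idx](R[y], attributes[z])   # R[x] <- R[y] <op> IP(z)
    return R[0]   # register 0 taken as the program output (assumption)

# Example: two instructions over a 5-attribute exemplar.
prog = [(0, 0, 0, 2),    # R[0] <- R[0] + IP(2)
        (0, 0, 2, 4)]    # R[0] <- R[0] * IP(4)
print(execute(prog, [1.0, 2.0, 3.0, 4.0, 5.0]))   # (0 + 3) * 5 = 15.0
```

Note how each program touches only the attribute indices it references, which is what makes the evolved attribute subspace explicit.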

  5. Should all points for a class already be present in the point population, the class label is reselected.

  6. Where this represents all the dominated points (should any exist) and enough of the non-dominated points to complete \(P_{gap}\). Prioritization of the latter is established by the ranking of non-dominated points under the sharing function.

  7. In this case \(N_i\) counts the number of points in \(P^t\)—as opposed to \(\mathcal{F}(P^t)\)—that make the same distinction as the ith entry.

  8. As discussed in related work, Sect. 6.1, this does not preclude the coevolution of cascaded (hierarchical) models, but this is outside the scope of this paper.

  9. http://web.cs.dal.ca/~mheywood/Code/SBB.

  10. We distinguish between the ‘NIPS’ partition of the document repository from the ‘Advances in Neural Information Processing (ANIPS)’ conference venue.

  11. The F-score filter and SVM implementation are available from: http://www.csie.ntu.edu.tw/~cjlin/.

  12. Attribute counts of zero appear; these indicate members within a team establishing a constant bid value against which other team members have learnt to establish their bidding policy. Such a characteristic might enable further post-training simplification of the team composition, but was not considered here.

  13. Feature selection (construction) was performed using a GA (GP), with an independent classification algorithm employed for constructing the classifier/validating the selected attributes.

  14. Tree structured GP.

  15. In terms of the total memory requirement, a million attribute data set with 91 exemplars is most similar to the NIPS data set as reported here (Table 1).

References

  1. A. Asuncion, D.J. Newman, UCI Repository of Machine Learning Databases, http://www.ics.uci.edu/~mlearn/mlrepository.html. University of California, Department of Information and Computer Science, Irvine, CA (2008)

  2. M. Brameier, W. Banzhaf, A comparison of linear genetic programming and neural networks in medical data mining. IEEE Trans. Evol. Comput. 5(1), 17–26 (2001)

  3. M. Brameier, W. Banzhaf, Evolving teams of predictors with linear genetic programming. Genetic Progr. Evol. Mach. 2(4), 381–407 (2001)

  4. J. Cartlidge, D. Ait-Boudaoud, Autonomous virulence adaptation improves coevolutionary optimization. IEEE Trans. Evol. Comput. 15(2), 215–229 (2011)

  5. A. Chandra, H. Chen, X. Yao, Trade-off Between Diversity and Accuracy in Ensemble Generation, chapter 19, pp. 429–464, (2006). In ([19])

  6. Y.-W. Chen, C.-J. Lin, Combining SVMs with various feature selection strategies, chapter 12, pp. 315–324, (2006). In ([15])

  7. E.D. de Jong, A monotonic archive for Pareto-coevolution. Evol. Comput. 15(1), 61–93 (2007)

  8. J. Doucette, M.I. Heywood, in GP classification under imbalanced data sets: active sub-sampling and AUC approximation, ed. by M. O’Neill et al. European Conference on Genetic Programming, volume 4971 of LNCS, pp. 266–277, (2008)

  9. J. Doucette, P. Lichodzijewski, M.I. Heywood, Evolving coevolutionary classifiers under large attribute spaces. In: R. Riolo, T. McConaghy, U.-M O’Reilly (eds) Genetic Programming Theory and Practice, volume VII, (Springer, Berlin, 2009) pp. 37–54.

  10. C. Drummond, Machine learning as an experimental science (revisited). AAAI Workshop on Evaluation Methods for Machine Learning ed. by C. Drummond, W. Elazmeh, and N. Japkowicz, pp. 1–5, (2006)

  11. P.G. Espejo, S. Ventura, F. Herrera, A survey on the application of genetic programming to classification. IEEE Trans. Syst. Man Cybern. Part C 40(2), 121–144 (2010)

  12. S.G. Ficici, J.B. Pollack, in Pareto optimality in coevolutionary learning, ed. by J. Kelemen, P. Sosik. Proceedings of the 6th European Conference on Advances in Artificial Life volume 2159 of LNAI, pp. 316–325 (2001)

  13. C. Gagné, M. Sebag, M. Schoenauer, M. Tomassini, in Ensemble learning for free with evolutionary algorithms?, ed. by D. Thierens et al. Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1782–1788 (2007)

  14. I. Guyon, S. Gunn, A.B. Hur, G. Dror, Design and Analysis of the NIPS2003 Challenge, chapter 9, pp. 237–263, (2006), In ([15])

  15. I. Guyon, S. Gunn, M. Nikravesh, L. Zadeh (eds), Feature Selection: Foundations and Applications, volume 207 of Studies in Fuzziness and Soft Computing. (Springer, Berlin, 2006)

  16. P. Haffner, Scaling large margin classifiers for spoken language understanding. Speech Commun. 48, 239–261 (2006)

  17. P. Haffner, S. Kanthak, Fast kernel learning with sparse inverted index. In: L. Bottou, O. Chapelle, D. DeCoste, J. Weston (eds) Large-scale Kernel Machines, (MIT Press, Cambridge, 2007) pp. 51–71.

  18. M.I. Heywood, P. Lichodzijewski, in Symbiogenesis as a mechanism for building complex adaptive systems: a review, ed. by C. Di Chio et al. EvoComplex, volume 6024 of LNCS, pp. 51–60, (2010)

  19. Y. Jin (ed), Multi-Objective Machine Learning, volume 16 of Studies in Computational Intelligence. (Springer, Berlin, 2006)

  20. K. Krawiec, Genetic programming-based construction of features for machine learning and knowledge discovery tasks. Genetic Progr. Evol. Mach. 3(4), 329–343 (2002)

  21. R. Kumar, A.H. Joshi, K.K. Banka, P.I. Rockett, in Evolution of hyperheuristics for the biobjective 0/1 knapsack problem by multiobjective genetic programming, ed. by M. Keijzer et al. Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1227–1234 (2008)

  22. T.N. Lal, O. Chapelle, J. Weston, A. Elisseeff, Embedded Methods, chapter 5, pp. 137–165, (2006). In ([15])

  23. W.B. Langdon, A.P. Harrison, GP on SPMD parallel graphics hardware for mega bioinformatics data mining. Soft. Comput. 12(12), 1169–1183 (2008)

  24. R. Law, The symbiotic phenotype: origins and evolution. In: L. Margulis, R. Fester (eds) Symbiosis as a Source of Evolutionary Innovation: Speciation and Morphogenesis, chapter 5, (MIT Press, Cambridge, 1991) pp. 57–71.

  25. P. Lichodzijewski, M.I. Heywood, Coevolutionary bid-based genetic programming for problem decomposition in classification. Genetic Progr. Evol. Mach. 9(4), 331–365 (2008)

  26. P. Lichodzijewski, M.I. Heywood, in Managing team-based problem solving with symbiotic bid-based genetic programming, ed. by M. Keijzer et al. ACM Proceedings of the Genetic and Evolutionary Computation Conference, pp. 363–370, (2008)

  27. P. Lichodzijewski, M.I. Heywood, The Rubik Cube and GP temporal sequence learning: an initial study. In: R. Riolo, T. Soule, B. Worzel (eds) Genetic Programming Theory and Practice, volume VIII, chapter 3, (Springer, Berlin, 2010) pp. 35–54.

  28. Y. Liu, X. Yao, T. Higuchi, Evolutionary ensembles with negative correlation learning. IEEE Trans. Evol. Comput. 4(4), 380–387 (2000)

  29. L. Margulis, R. Fester (eds), Symbiosis as a Source of Evolutionary Innovation. (MIT Press, Cambridge, 1991)

  30. J. Maynard Smith. A Darwinian View of Symbiosis, chapter 3, pp. 26–39, (1991). In ([29])

  31. A. McIntyre, M.I. Heywood, Pareto cooperative-competitive genetic programming: a classification benchmarking study. In: R. Riolo, T. Soule, B. Worzel (eds) Genetic Programming Theory and Practice, volume VI, chapter 4, (Springer, Berlin, 2008) pp. 41–60.

  32. A. McIntyre, M.I. Heywood, Classification as clustering: a Pareto cooperative-competitive GP approach. Evol. Comput. 19(1), 137 (2011)

  33. A.R. McIntyre, M.I. Heywood, in MOGE: GP classification problem decomposition using multi-objective optimization, eds by M. Keijzer et al. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), pp. 863–870 (2006)

  34. A.R. McIntyre, M.I. Heywood. in Cooperative problem decomposition in Pareto Competitive classifier models of coevolution, ed. by M. O’Neill. European Conference on Genetic Programming, volume 4971 of LNCS, pp. 289–300 (2008)

  35. D.E. Moriarty, R. Miikkulainen, Forming neural networks through efficient and adaptive coevolution. Evol. Comput. 5(4), 373–399 (1998)

  36. J. Noble, R. A. Watson. in Pareto coevolution: using performance against coevolved opponents in a game as dimensions for Pareto selection, eds. by L. Spector et al. Proceedings of the Genetic and Evolutionary Computation Conference, pp. 493–500 (2001)

  37. M. Potter, K. De Jong, Cooperative coevolution: an architecture for evolving coadapted subcomponents. Evol. Comput. 8(1), 1–29 (2000)

  38. C.D. Rosin, R.K. Belew, New methods for competitive coevolution. Evol. Comput. 5(1), 1–29 (1997)

  39. M.G. Smith, L. Bull, Genetic programming with a genetic algorithm for feature construction and selection. Genetic Progr. Evol. Mach. 6(3), 265–281 (2005)

  40. R. Thomason, T. Soule, in Novel ways of improving cooperation and performance in ensemble classifiers, eds. by D. Therens et al. Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1708–1715 (2007)

  41. G.M. Weiss, F. Provost, Learning when training data are costly: the effect of class distribution on tree induction. J. Artif. Intell. Res. 19, 315–354 (2003)

  42. T.H. Westerdale, Local reinforcement and recombination in classifier systems. Evol. Comput. 9(3), 259–281 (2001)

  43. D.H. Wolpert, Stacked generalization. Neural Netw. 5(2), 241–259 (1992)

  44. S. X. Wu and W. Banzhaf, in A hierarchical cooperative evolutionary algorithm, eds. by J. Branke et al. Proceedings of the Genetic and Evolutionary Computation Conference, pp. 233–240, (2010)

  45. Y. Zhang, P.I. Rockett, Feature extraction using multi-objective Genetic Programming, chapter 4, pp. 75–99, (2006). In ([19])

  46. Y. Zhang, P.I. Rockett, A generic multi-dimensional feature extraction method using multi-objective genetic programming. Evol. Comput. 17(1), 89–115 (2009)

Acknowledgments

The authors gratefully acknowledge the very useful comments provided by the anonymous reviewers during the authoring of this work. Scholarships were provided by MITACS and NSERC, and equipment was provided under the CFI New Opportunities program (Canada). This research was conducted while J. Doucette was an NSERC USRA at Dalhousie University.

Author information

Corresponding author

Correspondence to Malcolm I. Heywood.

About this article

Cite this article

Doucette, J.A., McIntyre, A.R., Lichodzijewski, P. et al. Symbiotic coevolutionary genetic programming: a benchmarking study under large attribute spaces. Genet Program Evolvable Mach 13, 71–101 (2012). https://doi.org/10.1007/s10710-011-9151-4

