Covariance matrix self-adaptation evolution strategies and other metaheuristic techniques for neural adaptive learning

Abstract

A covariance matrix self-adaptation evolution strategy (CMSA-ES) was compared with several metaheuristic techniques for multilayer perceptron (MLP)-based function approximation and classification. Function approximation was based on simulations of several 2D functions, and classification analysis was based on nine cancer DNA microarray data sets. Connection weight learning by MLPs was carried out using genetic algorithms (GA–MLP), covariance matrix self-adaptation evolution strategies (CMSA-ES–MLP), back-propagation gradient-based learning (MLP), particle swarm optimization (PSO–MLP), and ant colony optimization (ACO–MLP). During function approximation runs, the input-side activation functions evaluated included linear, logistic, tanh, Hermite, Laguerre, exponential, and radial basis functions, while the output-side function was always linear. For classification, the input-side activation function was always logistic, while the output-side function was always regularized softmax. Self-organizing maps (SOM) and unsupervised neural gas (NG) were used to reduce the dimensionality of the original gene expression input features used in classification. Results indicate that for function approximation, the use of Hermite polynomial activation functions at hidden nodes with CMSA-ES–MLP connection weight learning resulted in the greatest fitness levels. On average, the most elite chromosomes were observed for MLP (\(\mathrm{MSE} = 0.4977\)), CMSA-ES–MLP (0.6484), PSO–MLP (0.7472), ACO–MLP (1.3471), and GA–MLP (1.4845). For classification analysis, the overall average performance of the classifiers was 92.64% (CMSA-ES–MLP), 92.22% (PSO–MLP), 91.30% (ACO–MLP), 89.36% (MLP), and 60.72% (GA–MLP). We have shown that a reliable approach to function approximation can be achieved through MLP connection weight learning when the assumed function is unknown; in this scenario, the MLP architecture itself defines the equation used for solving the unknown parameters relating input and output target values. A major drawback of implementing CMSA-ES in an MLP is that when the number of MLP weights is large, the \(\mathcal{O}(N^3)\) Cholesky factorization becomes a performance bottleneck. As an alternative, feature reduction using SOM and NG can greatly enhance the performance of CMSA-ES–MLP by reducing \(N\). Future research on speeding up the Cholesky factorization in CMSA-ES will help overcome the time complexity problems associated with a large number of connection weights.
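
The abstract names CMSA-ES as the central optimizer but the page carries no code, so the following is a minimal sketch of CMSA-ES-style connection weight learning. It assumes common parameter recommendations (offspring count \(4N\), truncation ratio \(1/4\), \(\tau = 1/\sqrt{2N}\), \(\tau_c = 1 + N(N+1)/(2\mu)\)) rather than the paper's actual configuration, and the `fitness` argument is a hypothetical callable standing in for the MLP's mean squared error over a flattened weight vector. The `np.linalg.cholesky` call is the \(\mathcal{O}(N^3)\) factorization the abstract identifies as the bottleneck.

```python
import numpy as np

def cmsa_es(fitness, n, generations=200, lam=None, seed=0):
    """Minimize `fitness` over R^n by covariance matrix self-adaptation (sketch)."""
    rng = np.random.default_rng(seed)
    lam = lam or 4 * n                        # offspring per generation (assumed)
    mu = max(1, lam // 4)                     # parents kept by truncation selection
    tau = 1.0 / np.sqrt(2.0 * n)              # step-size mutation learning rate
    tau_c = 1.0 + n * (n + 1) / (2.0 * mu)    # covariance adaptation time constant
    y = rng.standard_normal(n)                # search point, e.g. flattened MLP weights
    sigma, C = 1.0, np.eye(n)

    for _ in range(generations):
        A = np.linalg.cholesky(C)             # O(n^3) step cited as the bottleneck
        sigmas = sigma * np.exp(tau * rng.standard_normal(lam))  # mutated step sizes
        S = rng.standard_normal((lam, n)) @ A.T    # correlated search directions A z
        Y = y + sigmas[:, None] * S                # offspring weight vectors
        f = np.array([fitness(w) for w in Y])
        best = np.argsort(f)[:mu]                  # indices of the mu fittest offspring
        y = Y[best].mean(axis=0)                   # intermediate recombination
        sigma = sigmas[best].mean()                # self-adapted global step size
        C = (1 - 1 / tau_c) * C + (S[best].T @ S[best]) / (mu * tau_c)  # rank-mu update
    return y

# Toy check: fit a 10-dimensional quadratic stand-in for an MLP's MSE surface.
target = np.linspace(-1.0, 1.0, 10)
w = cmsa_es(lambda v: float(np.mean((v - target) ** 2)), n=10)
```

Because each generation refactorizes \(C\), runtime grows cubically with the number of connection weights \(N\); this is why the abstract recommends shrinking \(N\) with SOM or NG feature reduction before applying CMSA-ES–MLP.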


Acknowledgments

We are grateful to Hans-Georg Beyer for many encouraging discussions on implementation of the Cholesky factorization-based CMSA-ES method.

Author information

Corresponding author

Correspondence to Leif E. Peterson.

Cite this article

Peterson, L.E. Covariance matrix self-adaptation evolution strategies and other metaheuristic techniques for neural adaptive learning. Soft Comput 15, 1483–1495 (2011). https://doi.org/10.1007/s00500-010-0598-7
