Abstract
The paper describes two schemes that follow the model of Lamarckian evolution and combine differential evolution (DE), a population-based stochastic global search method, with the conjugate gradient (CG) local optimization algorithm. In the first scheme, each offspring is fine-tuned by CG before competing with its parent. In the second, CG is used to improve both parents and offspring, in a manner that is completely seamless for individuals that survive more than one generation. The experiments involved training the weights of feed-forward neural networks on three synthetic and four real-life problems. In six of the seven cases the DE-CG hybrid, which preserves and exploits information about each solution's local optimization process, outperformed two recent variants of DE.
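For concreteness, the first scheme can be read as standard DE/rand/1/bin in which every trial vector passes through a short conjugate-gradient run before the usual one-to-one selection against its parent. The sketch below is illustrative only, not the authors' code: it substitutes SciPy's general-purpose CG minimizer for whatever CG variant the paper uses, and the names (lamarckian_de_cg, cg_steps) are assumptions introduced here.

```python
import numpy as np
from scipy.optimize import minimize

def lamarckian_de_cg(loss, grad, dim, pop_size=20, F=0.5, CR=0.9,
                     generations=100, cg_steps=5, seed=0):
    """Minimal sketch of the abstract's first scheme: each DE trial
    vector is refined by a few CG steps, and the refined weights (not
    the raw offspring) compete with the parent -- the Lamarckian step."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(scale=0.5, size=(pop_size, dim))
    fit = np.array([loss(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # DE/rand/1 mutation: three distinct individuals, none equal to i
            a, b, c = rng.choice([j for j in range(pop_size) if j != i],
                                 size=3, replace=False)
            mutant = pop[a] + F * (pop[b] - pop[c])
            # binomial crossover with one guaranteed mutant coordinate
            mask = rng.random(dim) < CR
            mask[rng.integers(dim)] = True
            trial = np.where(mask, mutant, pop[i])
            # Lamarckian step: a short CG run polishes the trial vector
            res = minimize(loss, trial, jac=grad, method='CG',
                           options={'maxiter': cg_steps})
            if res.fun < fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = res.x, res.fun
    best = int(np.argmin(fit))
    return pop[best], fit[best]
```

A faithful version of the second scheme would additionally carry each survivor's CG state (for example, its last search direction) alongside its weight vector, so that a parent's local optimization resumes seamlessly in later generations; that bookkeeping is omitted from this sketch.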
Cite this article
Bandurski, K., Kwedlo, W. A Lamarckian Hybrid of Differential Evolution and Conjugate Gradients for Neural Network Training. Neural Process Lett 32, 31–44 (2010). https://doi.org/10.1007/s11063-010-9141-1