Abstract
The use of multilayer perceptrons (MLPs) with threshold activation functions (binary step functions) greatly reduces the complexity of hardware implementations of neural networks, provides tolerance to noise, and makes the internal representations easier to interpret. In certain cases, such as when learning stationary tasks, it may be sufficient to find appropriate weights for an MLP with threshold activations by software simulation and then transfer the weight values to the hardware implementation. Efficient training of these networks is the subject of considerable ongoing research. Methods available in the literature mainly focus on two-state (threshold) nodes and attempt to train the networks either by approximating the gradient of the error function and appropriately modifying the gradient-descent update, or by progressively altering the shape of the activation functions. In this paper, we propose an evolution-motivated approach that is eminently suitable for networks with threshold functions, and we compare its performance with four other methods. The proposed evolutionary strategy does not need gradient-related information, is applicable when threshold activations are used from the beginning of training, as in “on-chip” training, and is able to train networks with integer weights.
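To make the idea concrete, the sketch below illustrates one way such gradient-free training can be set up: a standard differential-evolution loop over a small threshold-activation MLP, with mutants rounded and clipped so that weights stay in a small integer range. This is a hypothetical illustration of the general technique, not the authors' algorithm; the network size, weight range, and hyperparameters are assumptions chosen for the example.

```python
import numpy as np

def step(x):
    # Binary threshold (step) activation: 1 if x >= 0, else 0.
    return (x >= 0).astype(float)

def forward(weights, X, n_hidden):
    # Unpack a flat weight vector into two layers (biases folded in).
    n_in = X.shape[1]
    w1 = weights[:(n_in + 1) * n_hidden].reshape(n_in + 1, n_hidden)
    w2 = weights[(n_in + 1) * n_hidden:].reshape(n_hidden + 1, 1)
    h = step(np.hstack([X, np.ones((len(X), 1))]) @ w1)
    return step(np.hstack([h, np.ones((len(h), 1))]) @ w2)

def error(weights, X, y, n_hidden):
    # Sum-of-squares error over the training patterns.
    return np.sum((forward(weights, X, n_hidden).ravel() - y) ** 2)

def de_train(X, y, n_hidden=2, pop_size=30, gens=300, F=0.7, CR=0.9,
             wmin=-3, wmax=3, seed=0):
    # Differential evolution over integer weight vectors (no gradients).
    rng = np.random.default_rng(seed)
    dim = (X.shape[1] + 1) * n_hidden + (n_hidden + 1)
    pop = rng.integers(wmin, wmax + 1, size=(pop_size, dim)).astype(float)
    fit = np.array([error(p, X, y, n_hidden) for p in pop])
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = pop[rng.choice(
                [j for j in range(pop_size) if j != i], 3, replace=False)]
            # Round the mutant so candidate weights remain integers.
            mutant = np.clip(np.rint(a + F * (b - c)), wmin, wmax)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True  # force at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            f = error(trial, X, y, n_hidden)
            if f <= fit[i]:
                pop[i], fit[i] = trial, f
        if fit.min() == 0:
            break
    best = pop[np.argmin(fit)]
    return best, fit.min()
```

Because fitness is evaluated only through the forward pass, the non-differentiable step activation poses no difficulty, and restricting weights to a small integer range (here [-3, 3]) mirrors the kind of low-precision constraint a hardware realization imposes.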
Acknowledgements
The authors would like to thank the European Social Fund, Operational Program for Educational and Vocational Training II (EPEAEK II), and particularly the Program PYTHAGORAS for funding the above work. Dr V.P. Plagianakos and Prof. M.N. Vrahatis acknowledge the financial support of the University of Patras Research Committee through a “Karatheodoris” research grant. We also acknowledge the help of Prof. R.E. King of the Department of Electrical and Computer Engineering at the University of Patras, Greece, in the neuro-controller training experiment. The authors wish to thank the Editor and the referees for constructive comments and useful suggestions.
Cite this article
Plagianakos, V.P., Magoulas, G.D. & Vrahatis, M.N. Evolutionary training of hardware realizable multilayer perceptrons. Neural Comput & Applic 15, 33–40 (2006). https://doi.org/10.1007/s00521-005-0005-y