Two Frameworks for Improving Gradient-Based Learning Algorithms

Chapter in: Oppositional Concepts in Computational Intelligence

Part of the book series: Studies in Computational Intelligence (SCI, volume 155)

Summary

Backpropagation is the most popular algorithm for training neural networks. However, this gradient-based training method is prone to very long training times and convergence to local optima. Various methods have been proposed to alleviate these issues, including, but not limited to, alternative training algorithms, automatic architecture design and different transfer functions. In this chapter we continue the exploration of improving gradient-based learning algorithms through dynamic transfer function modification. We propose opposite transfer functions as a means to improve the numerical conditioning of neural networks, and derive two backpropagation-based learning algorithms. Our experimental results show an improvement in accuracy and generalization ability on common benchmark problems. The experiments examine the sensitivity of the approach to the learning parameters, the type of transfer function and the number of neurons in the network.
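
The summary does not define opposite transfer functions or the two learning algorithms, so the following Python sketch is illustrative only. It assumes the opposite of a transfer function φ(x) is taken to be φ(−x) (for the logistic sigmoid this equals 1 − φ(x)) and shows how selected hidden neurons of a one-hidden-layer feedforward network could be switched to the opposite function during a forward pass; the names opposite_sigmoid, forward and use_opposite are hypothetical.

    import numpy as np

    def sigmoid(x):
        """Standard logistic transfer function."""
        return 1.0 / (1.0 + np.exp(-x))

    def opposite_sigmoid(x):
        """Assumed opposite transfer function: phi(-x).

        For the logistic sigmoid, phi(-x) = 1 - phi(x), i.e. the neuron's
        activation is mirrored about 0.5.
        """
        return sigmoid(-x)

    def forward(x, W1, b1, W2, b2, use_opposite=None):
        """Forward pass of a one-hidden-layer network.

        use_opposite is a boolean mask (one entry per hidden neuron) that
        selects which neurons use the opposite transfer function; the
        weights themselves are left unchanged.
        """
        z = W1 @ x + b1
        if use_opposite is None:
            use_opposite = np.zeros(z.shape, dtype=bool)
        h = np.where(use_opposite, opposite_sigmoid(z), sigmoid(z))
        return W2 @ h + b2

    # Example: mirror the second of three hidden neurons.
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
    W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
    y = forward(np.array([0.5, -1.0]), W1, b1, W2, b2,
                use_opposite=np.array([False, True, False]))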

Editor information

Hamid R. Tizhoosh, Mario Ventresca

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Ventresca, M., Tizhoosh, H.R. (2008). Two Frameworks for Improving Gradient-Based Learning Algorithms. In: Tizhoosh, H.R., Ventresca, M. (eds) Oppositional Concepts in Computational Intelligence. Studies in Computational Intelligence, vol 155. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70829-2_12

  • DOI: https://doi.org/10.1007/978-3-540-70829-2_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70826-1

  • Online ISBN: 978-3-540-70829-2

  • eBook Packages: Engineering, Engineering (R0)
