How to Train Neural Networks

  • Chapter
Neural Networks: Tricks of the Trade

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1524)

Abstract

The purpose of this paper is to give guidance in neural network modeling. Starting with the preprocessing of the data, we discuss different types of network architectures and show how they can be combined effectively. We analyze several cost functions that avoid unstable learning caused by outliers and heteroscedasticity. The observer-observation dilemma is resolved by forcing the network to construct smooth approximation functions. Furthermore, we propose several pruning algorithms to optimize the network architecture. All of these techniques are combined into a complete and consistent training procedure (see figure 17.25 for an overview) so that the synergy of the methods is maximized.
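The role of a robust cost function can be illustrated with a small numerical sketch. The following Python snippet is not taken from the chapter; it assumes only NumPy and uses a Huber-style cost as one representative robust loss, showing how a single outlier inflates the squared error far more than it inflates the robust cost.

    # Hedged sketch (not from the chapter): squared error vs. a Huber-style
    # robust cost.  Large residuals contribute linearly rather than
    # quadratically, so a few outliers cannot dominate the training signal.
    import numpy as np

    def squared_error(y_true, y_pred):
        return 0.5 * np.mean((y_true - y_pred) ** 2)

    def huber(y_true, y_pred, delta=1.0):
        # Quadratic for small residuals, linear beyond `delta`.
        # `delta` is an illustrative threshold, not a value from the chapter.
        r = np.abs(y_true - y_pred)
        quadratic = 0.5 * r ** 2
        linear = delta * (r - 0.5 * delta)
        return np.mean(np.where(r <= delta, quadratic, linear))

    # Toy data: a clean signal with one gross outlier in the targets.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 3.0, 50)
    y_true = np.sin(x) + 0.05 * rng.standard_normal(50)
    y_true[10] += 10.0          # inject an outlier
    y_pred = np.sin(x)          # a reasonable model prediction

    print("squared error:", squared_error(y_true, y_pred))  # dominated by the outlier
    print("huber cost:   ", huber(y_true, y_pred))           # far less affected

The same intuition carries over to heteroscedastic data: costs that grow sub-quadratically in the residual keep a few high-variance observations from dominating the gradient.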




Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Neuneier, R., Zimmermann, H.G. (1998). How to Train Neural Networks. In: Orr, G.B., Müller, K.-R. (eds) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol 1524. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49430-8_18

  • DOI: https://doi.org/10.1007/3-540-49430-8_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65311-0

  • Online ISBN: 978-3-540-49430-0

  • eBook Packages: Springer Book Archive
