Abstract
This contribution presents an overview of the theoretical and practical aspects of the broad family of learning algorithms based on Stochastic Gradient Descent, including Perceptrons, Adalines, K-Means, LVQ, Multi-Layer Networks, and Graph Transformer Networks.
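As a rough illustration of the kind of algorithm the abstract refers to, the sketch below applies a stochastic gradient update to an Adaline-style linear unit with squared loss, processing one example at a time. It is not taken from the chapter: the function name `sgd_adaline`, the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration.

```python
import random

def sgd_adaline(samples, lr=0.05, epochs=100, seed=0):
    """Fit y ~ w.x + b by per-example squared-loss gradient steps (hypothetical helper)."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)              # visit the examples in random order
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = pred - y             # derivative of 0.5 * (pred - y)**2 w.r.t. pred
            # Step against the gradient estimated from this single example.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Toy, noise-free data generated from y = 2*x1 - x2 + 0.5 (an assumed target).
train = [((x1, x2), 2 * x1 - x2 + 0.5)
         for x1 in (-1.0, 0.0, 1.0)
         for x2 in (-1.0, 0.0, 1.0)]

w, b = sgd_adaline(train)
print(w, b)
```

On this noise-free toy problem the estimates should approach the generating coefficients; with noisy data a decreasing learning rate would typically be needed for convergence.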
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bottou, L. (2004). Stochastic Learning. In: Bousquet, O., von Luxburg, U., Rätsch, G. (eds) Advanced Lectures on Machine Learning. ML 2003. Lecture Notes in Computer Science, vol 3176. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28650-9_7
DOI: https://doi.org/10.1007/978-3-540-28650-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23122-6
Online ISBN: 978-3-540-28650-9
eBook Packages: Springer Book Archive