
Training of Sparsely Connected MLPs

  • Conference paper
Pattern Recognition (DAGM 2011)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6835)


Abstract

Sparsely connected Multi-Layer Perceptrons (MLPs) differ from conventional MLPs in that only a small fraction of the entries in their weight matrices are nonzero. Exploiting this with sparse matrix-vector multiplication algorithms reduces the computational cost of classification. Training of sparsely connected MLPs proceeds in two consecutive stages. In the first stage, initial values for the network's parameters are obtained by solving an unsupervised matrix factorization problem that minimizes the reconstruction error. In the second stage, a modified version of the supervised backpropagation algorithm optimizes the MLP's parameters with respect to the classification error. Experiments on the MNIST database of handwritten digits show that the proposed approach matches the classification performance of a densely connected MLP while speeding up classification by a factor of seven.
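The reported speed-up comes from evaluating the trained network with sparse matrix-vector products, whose cost scales with the number of nonzero weights rather than with the full layer dimensions. The following sketch (not the authors' implementation; the layer sizes, the 5% weight density, and the tanh activation are illustrative assumptions) shows such an inference pass using SciPy's compressed sparse row (CSR) matrices:

import numpy as np
from scipy import sparse

# Illustrative one-hidden-layer MLP for MNIST-sized inputs (784 = 28x28 pixels).
# Only ~5% of the weights are nonzero; CSR storage makes W @ x touch
# only the stored nonzeros, which is where the inference speed-up comes from.
W1 = sparse.random(300, 784, density=0.05, format="csr", random_state=0)
b1 = np.zeros(300)
W2 = sparse.random(10, 300, density=0.05, format="csr", random_state=0)
b2 = np.zeros(10)

def classify(x):
    # Forward pass: two sparse matrix-vector products plus dense bias/activation.
    h = np.tanh(W1 @ x + b1)       # hidden activations (tanh assumed here)
    scores = W2 @ h + b2           # one score per digit class
    return int(np.argmax(scores))

x = np.random.default_rng(0).random(784)  # stand-in for a flattened digit image
print(classify(x))

The two-stage training procedure itself (matrix factorization for initialization, then modified backpropagation) is not reproduced here; the sketch only illustrates why classification with such a network is cheap.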

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thom, M., Schweiger, R., Palm, G. (2011). Training of Sparsely Connected MLPs. In: Mester, R., Felsberg, M. (eds) Pattern Recognition. DAGM 2011. Lecture Notes in Computer Science, vol 6835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23123-0_36

  • DOI: https://doi.org/10.1007/978-3-642-23123-0_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23122-3

  • Online ISBN: 978-3-642-23123-0

  • eBook Packages: Computer Science, Computer Science (R0)
