Structuring Neural Networks through Bidirectional Clustering of Weights

Saito, Kazumi; Nakano, Ryohei

doi:10.1007/3-540-36182-0_19

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2534)

Included in the following conference series:

International Conference on Discovery Science

Abstract

We present a method for succinctly structuring neural networks that have a few thousand weights. Here, structuring means weight sharing: the weights in a network are divided into clusters, and weights within the same cluster are constrained to have the same value. Our method employs a newly developed weight sharing technique called bidirectional clustering of weights (BCW), together with second-order optimal criteria for both cluster merge and split. Our experiments using two artificial data sets showed that the BCW method works well to find a succinct network structure from an original network having about two thousand weights in both regression and classification problems.
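The notion of weight sharing described in the abstract can be illustrated with a small sketch. This is not the BCW algorithm itself: the abstract does not give the second-order merge and split criteria, so the example below assumes a naive rule (merge the two clusters whose shared values are closest) purely as a stand-in. The function names `share_weights` and `merge_closest` and the toy weights are hypothetical.

```python
from collections import defaultdict


def share_weights(weights, assignment):
    """Tie all weights in the same cluster to one common value.

    Assumption: the common value is the cluster mean. The paper may
    obtain shared values differently (e.g. by retraining).
    """
    members = defaultdict(list)
    for w, c in zip(weights, assignment):
        members[c].append(w)
    shared = {c: sum(ws) / len(ws) for c, ws in members.items()}
    return [shared[c] for c in assignment], shared


def merge_closest(assignment, shared):
    """Merge the two clusters whose shared values are closest.

    Assumption: a naive distance rule that stands in for the
    second-order merge criterion mentioned in the abstract.
    """
    ids = sorted(shared)
    pairs = [
        (abs(shared[a] - shared[b]), a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
    ]
    _, keep, drop = min(pairs)
    return [keep if c == drop else c for c in assignment]


# Toy example: five weights grouped into three clusters.
weights = [0.75, 1.25, -0.5, -1.0, 1.0]
assignment = [0, 0, 1, 1, 2]

tied, shared = share_weights(weights, assignment)
merged = merge_closest(assignment, shared)      # clusters 0 and 2 are merged
tied_after, _ = share_weights(weights, merged)  # two distinct values remain
```

After the merge, the five weights are described by only two free parameters, which is the sense in which the structured network is more succinct than the original.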

Author information

Authors and Affiliations

NTT Corporation, NTT Communication Science Laboratories, 2-4 Hikaridai, Seika, Soraku, 619-0237, Kyoto, Japan
Kazumi Saito
Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya 466-8555, Japan
Ryohei Nakano

Editor information

Editors and Affiliations

Deutsches Forschungszentrum für Künstliche Intelligenz, Stuhlsatzenhausweg 3, 66123, Saarbrücken, Germany
Steffen Lange
National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430, Tokyo, Japan
Ken Satoh
Department of Computer Science, University of Maryland, College Park, MD 20742, USA
Carl H. Smith

About this paper

Cite this paper

Saito, K., Nakano, R. (2002). Structuring Neural Networks through Bidirectional Clustering of Weights. In: Lange, S., Satoh, K., Smith, C.H. (eds) Discovery Science. DS 2002. Lecture Notes in Computer Science, vol 2534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36182-0_19

DOI: https://doi.org/10.1007/3-540-36182-0_19
Published: 08 November 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00188-1
Online ISBN: 978-3-540-36182-4
eBook Packages: Springer Book Archive
