Structuring Neural Networks through Bidirectional Clustering of Weights

  • Conference paper
Discovery Science (DS 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2534)

Abstract

We present a method for succinctly structuring neural networks that have a few thousand weights. Here, structuring means weight sharing: the weights in a network are divided into clusters, and weights within the same cluster are constrained to have the same value. Our method employs a newly developed weight-sharing technique called bidirectional clustering of weights (BCW), together with second-order optimal criteria for both cluster merge and split. Our experiments using two artificial data sets showed that the BCW method works well at finding a succinct network structure from an original network having about two thousand weights, in both regression and classification problems.
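
As a rough illustration of the weight-sharing idea described in the abstract, the sketch below ties every weight in a cluster to one shared value and then performs a single bottom-up merge. It is not the BCW algorithm itself: the second-order merge and split criteria are not given on this page, so the sketch assumes a much simpler stand-in (cluster means as shared values, and distance between means as the merge criterion). The function names `share_weights` and `merge_closest` are hypothetical.

```python
from collections import defaultdict


def share_weights(weights, assignment):
    """Hard weight sharing: replace each weight by the mean of its cluster.

    Assumption: the cluster mean is used as the shared value purely for
    illustration; the paper may determine shared values differently.
    """
    groups = defaultdict(list)
    for w, c in zip(weights, assignment):
        groups[c].append(w)
    means = {c: sum(ws) / len(ws) for c, ws in groups.items()}
    return [means[c] for c in assignment], means


def merge_closest(weights, assignment):
    """One merge step: join the two clusters whose means are closest.

    Assumption: distance between means stands in for the second-order
    merge criterion mentioned in the abstract.
    """
    _, means = share_weights(weights, assignment)
    labels = sorted(means)
    if len(labels) < 2:
        return list(assignment)
    pairs = ((p, q) for i, p in enumerate(labels) for q in labels[i + 1:])
    a, b = min(pairs, key=lambda pq: abs(means[pq[0]] - means[pq[1]]))
    # Relabel every member of cluster b as cluster a.
    return [a if c == b else c for c in assignment]


weights = [0.9, 1.1, -2.0, -1.8, 0.1]
assignment = [0, 0, 1, 1, 2]

shared, means = share_weights(weights, assignment)
merged = merge_closest(weights, assignment)
```

After `share_weights`, the five free parameters collapse to three shared values, one per cluster; `merge_closest` then reduces the cluster count by one. A split step, the other direction of a bidirectional scheme, would do the reverse by dividing one cluster into two.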

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Saito, K., Nakano, R. (2002). Structuring Neural Networks through Bidirectional Clustering of Weights. In: Lange, S., Satoh, K., Smith, C.H. (eds) Discovery Science. DS 2002. Lecture Notes in Computer Science, vol 2534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36182-0_19

  • DOI: https://doi.org/10.1007/3-540-36182-0_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00188-1

  • Online ISBN: 978-3-540-36182-4

  • eBook Packages: Springer Book Archive
