
Adaptive structure learning method of deep belief network using neuron generation–annihilation and layer generation

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Deep learning has recently been receiving renewed attention in the field of artificial intelligence. A deep belief network (DBN) has a deep network architecture that can represent multiple features of input patterns hierarchically, using pre-trained restricted Boltzmann machines (RBMs). Such deep architectures achieve very high classification accuracy on many tasks compared to earlier methods. However, determining the many parameters needed to design an effective deep network architecture is difficult even for experienced designers, since a traditional RBM or DBN cannot change its network structure during training. An adaptive structure learning method has previously been proposed for finding the optimal number of hidden neurons in multilayered neural networks; it employs a neuron generation–annihilation algorithm that observes the variance of the weight decays. We develop an adaptive structure learning method for RBMs and DBNs that uses a neuron generation–annihilation algorithm together with a layer generation algorithm, both driven by the observed variance of selected training parameters. The effectiveness of the proposed model was verified by tenfold cross-validation on the benchmark data sets CIFAR-10 and CIFAR-100. The adaptive DBN achieved the highest classification accuracy (97.4% on CIFAR-10, 81.2% on CIFAR-100) among several recent DBN- and CNN-based methods.
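The two structural mechanisms named in the abstract lend themselves to a compact sketch. The Python code below is a minimal illustration of the idea, not the authors' implementation: it assumes a Bernoulli–Bernoulli RBM trained with one-step contrastive divergence (CD-1), uses the variance of recent per-neuron gradient updates as a stand-in for the paper's walking-distance criterion on the weight decays, and the thresholds theta_g, theta_a and theta_l are hypothetical placeholders.

```python
# Minimal sketch of adaptive structure learning for an RBM/DBN
# (illustrative only, NOT the authors' implementation).
# Assumptions: Bernoulli-Bernoulli RBM trained with CD-1; gradient
# variance per hidden neuron stands in for the walking-distance
# criterion; theta_g, theta_a, theta_l are hypothetical thresholds.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class AdaptiveRBM:
    """RBM whose hidden layer can grow (generation) and shrink (annihilation)."""

    def __init__(self, n_vis, n_hid, lr=0.1):
        self.W = 0.01 * rng.standard_normal((n_vis, n_hid))
        self.b = np.zeros(n_vis)                 # visible bias
        self.c = np.zeros(n_hid)                 # hidden bias
        self.lr = lr
        self.dW_hist, self.dc_hist = [], []      # recent per-neuron update sizes

    def cd1(self, v0):
        """One contrastive-divergence (CD-1) update on a batch v0."""
        h0 = sigmoid(v0 @ self.W + self.c)
        hs = (rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(hs @ self.W.T + self.b)
        h1 = sigmoid(v1 @ self.W + self.c)
        dW = (v0.T @ h0 - v1.T @ h1) / len(v0)
        dc = (h0 - h1).mean(axis=0)
        self.W += self.lr * dW
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * dc
        self.dW_hist.append(np.linalg.norm(dW, axis=0))  # one value per hidden unit
        self.dc_hist.append(dc)
        self.dW_hist, self.dc_hist = self.dW_hist[-20:], self.dc_hist[-20:]

    def adapt(self, X, theta_g=1e-6, theta_a=0.05):
        """Generate a neuron where update variance stays high (the unit keeps
        oscillating, i.e. it cannot represent its inputs alone), then
        annihilate units whose mean activation over X is near zero."""
        var = (np.var(np.stack(self.dW_hist), axis=0)
               * np.var(np.stack(self.dc_hist), axis=0))
        for j in np.where(var > theta_g)[0]:
            w_new = self.W[:, [j]] + 0.01 * rng.standard_normal((len(self.b), 1))
            self.W = np.hstack([self.W, w_new])  # insert a perturbed copy of unit j
            self.c = np.append(self.c, self.c[j])
        keep = sigmoid(X @ self.W + self.c).mean(axis=0) > theta_a
        self.W, self.c = self.W[:, keep], self.c[keep]
        self.dW_hist, self.dc_hist = [], []      # reset after structural change

def train_adaptive_dbn(X, max_layers=5, epochs=50, theta_l=1e-8):
    """Layer generation: stack adaptive RBMs and stop once the newest
    layer's update variance has settled (hypothetical criterion)."""
    layers, inp = [], X
    for _ in range(max_layers):
        rbm = AdaptiveRBM(inp.shape[1], n_hid=8)
        for e in range(epochs):
            rbm.cd1(inp)
            if (e + 1) % 10 == 0 and e + 1 < epochs:
                rbm.adapt(inp)
        layers.append(rbm)
        if np.var(np.stack(rbm.dW_hist)) < theta_l:
            break                                # training is stable: stop stacking
        inp = sigmoid(inp @ rbm.W + rbm.c)       # hidden probabilities feed next layer
    return layers

# Toy usage: binary data; each layer settles on its own hidden size.
X = (rng.random((200, 64)) < 0.3).astype(float)
dbn = train_adaptive_dbn(X)
print([layer.W.shape[1] for layer in dbn])
```

In this sketch a hidden unit is split (a perturbed copy is appended) when its updates keep fluctuating, on the reasoning that a single neuron cannot express the variety of inputs it receives, and units that are almost never active are pruned; a new RBM layer is stacked only while the trained layer's updates remain unstable, which mirrors the layer generation condition described in the abstract.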



Funding

This study was funded by JAPAN MIC SCOPE (Grant Number 162308002), Artificial Intelligence Research Promotion Foundation, and JSPS KAKENHI (Grant Number JP17J11178).

Author information


Corresponding author

Correspondence to Shin Kamada.

Ethics declarations

Conflict of interest

Author Takumi Ichimura has received research grants from JAPAN MIC SCOPE and Artificial Intelligence Research Promotion Foundation. Author Shin Kamada has received a research grant from JSPS KAKENHI.


About this article


Cite this article

Kamada, S., Ichimura, T., Hara, A. et al. Adaptive structure learning method of deep belief network using neuron generation–annihilation and layer generation. Neural Comput & Applic 31, 8035–8049 (2019). https://doi.org/10.1007/s00521-018-3622-y


