ABSTRACT
The deep Boltzmann machine (DBM) is a powerful deep probabilistic model that learns a hierarchical representation of the data. However, choosing the size of each hidden layer of a DBM is difficult, since the appropriate model size varies from task to task; choosing a proper model size is an essential model-selection problem for latent-variable graphical models. This paper proposes a new variant of the DBM, called the infinite deep Boltzmann machine (iDBM), which can freely change the number of hidden units participating in the energy function of each layer. A greedy training method is proposed to pre-train the model, after which the size of each layer is fixed and the model is converted into an ordinary DBM. Experimental results on MNIST and CalTech101 Silhouettes indicate that the iDBM learns generative and discriminative models as good as those of the original DBM, while eliminating the need for model selection over the hidden layer sizes of DBMs.
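The core mechanism the abstract describes, a hidden layer whose effective number of units can grow during pre-training and is then frozen, can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `GrowableLayer`, the near-zero initialization of appended units, and the activity-based growth heuristic in `maybe_grow` are all illustrative assumptions, loosely in the spirit of the infinite RBM of Côté and Larochelle.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GrowableLayer:
    """A hidden layer whose unit count can grow during training,
    sketching the iDBM idea of an unbounded pool of hidden units of
    which only the first K participate in the energy function."""

    def __init__(self, n_visible, n_hidden_init=1, rng=None):
        self.rng = rng or np.random.default_rng(0)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden_init))
        self.b = np.zeros(n_hidden_init)

    @property
    def n_hidden(self):
        return self.W.shape[1]

    def grow(self, k=1):
        # Append k fresh hidden units with near-zero weights so the
        # energy function is almost unchanged at the moment of growth.
        n_visible = self.W.shape[0]
        self.W = np.hstack(
            [self.W, 0.01 * self.rng.standard_normal((n_visible, k))]
        )
        self.b = np.concatenate([self.b, np.zeros(k)])

    def hidden_probs(self, v):
        # Mean-field activation of the hidden units given visible input v.
        return sigmoid(v @ self.W + self.b)

    def maybe_grow(self, v, threshold=0.1):
        # Heuristic growth rule (an assumption, not the paper's exact
        # criterion): if the newest unit is used noticeably on a batch,
        # make room for another one.
        if self.hidden_probs(v)[:, -1].mean() > threshold:
            self.grow()
```

After greedy pre-training, freezing the layer sizes and stacking such layers yields an ordinary DBM with the discovered architecture, which matches the conversion step described in the abstract.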