
Two-Layer Contractive Encodings with Shortcuts for Semi-supervised Learning

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8226)

Abstract

Supervised training of multi-layer perceptrons (MLPs) with only a few labeled examples is prone to overfitting. Pretraining an MLP with unlabeled samples of the input distribution may achieve better generalization. Usually, pretraining is done in a layer-wise, greedy fashion, which limits the complexity of the learnable features. To overcome this limitation, two-layer contractive encodings have been proposed recently [15]; these, however, pose a more difficult optimization problem. On the other hand, linear transformations of perceptrons have been proposed to make the optimization of deep networks easier [10]. In this paper, we propose to combine these two approaches. Experiments on handwritten digit recognition show the benefits of our combined approach to semi-supervised learning.
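To make the combination concrete, the following is a minimal sketch, assuming PyTorch, of a two-layer encoder with a linear shortcut from the input to the code (in the spirit of the transformed perceptrons of Raiko et al. [10]) trained with a contractive penalty on the encoder Jacobian (following Rifai et al. [14]). It is not the authors' implementation: the layer sizes, the penalty weight lam, and all names are illustrative assumptions, and the stochastic Jacobian estimate is one possible way to keep the penalty cheap.

    import torch
    import torch.nn as nn

    class TwoLayerShortcutEncoder(nn.Module):
        def __init__(self, n_in=784, n_hid=256, n_code=64):
            super().__init__()
            self.fc1 = nn.Linear(n_in, n_hid)
            self.fc2 = nn.Linear(n_hid, n_code)
            # Linear shortcut past the nonlinearity; the kind of linear
            # transformation proposed to ease deep-network optimization [10].
            self.shortcut = nn.Linear(n_in, n_code, bias=False)

        def forward(self, x):
            return self.fc2(torch.tanh(self.fc1(x))) + self.shortcut(x)

    def contractive_penalty(encoder, x):
        # Hutchinson estimate of the squared Frobenius norm of the encoder
        # Jacobian: for v ~ N(0, I), E||J^T v||^2 equals ||J||_F^2.
        x = x.detach().requires_grad_(True)
        code = encoder(x)
        v = torch.randn_like(code)  # random probe vector
        (g,) = torch.autograd.grad(code, x, grad_outputs=v,
                                   create_graph=True)
        return (g ** 2).sum(dim=1).mean()

    # One unsupervised pretraining step on a stand-in unlabeled batch.
    enc = TwoLayerShortcutEncoder()
    dec = nn.Linear(64, 784)        # decoder; weight tying omitted for brevity
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()),
                           lr=1e-3)
    lam = 0.1                       # contraction strength (assumed value)
    x = torch.rand(32, 784)         # placeholder for an MNIST batch
    loss = ((dec(enc(x)) - x) ** 2).mean() + lam * contractive_penalty(enc, x)
    opt.zero_grad()
    loss.backward()
    opt.step()

The random-probe estimate keeps the penalty at one extra backward pass per batch, whereas the exact Jacobian of a two-layer encoder would cost one backward pass per code unit; this is a sketch-level choice, not necessarily the paper's exact objective.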



References

  1. Bengio, Y.: Learning deep architectures for AI. Foundations and Trends in Machine Learning 2(1), 1–127 (2009)


  2. Bengio, Y., Courville, A., Vincent, P.: Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(8), 1798–1828 (2013)


  3. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: Schölkopf, B., Platt, J., Hoffman, T. (eds.) Advances in Neural Information Processing Systems, vol. 19, pp. 153–160. MIT Press, Cambridge (2007)


  4. Bergstra, J., Bardenet, R., Bengio, Y., Kégl, B.: Algorithms for hyper-parameter optimization. In: Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., Weinberger, K. (eds.) Advances in Neural Information Processing Systems, vol. 24, pp. 2546–2554 (2011)


  5. Chapelle, O., Schölkopf, B., Zien, A. (eds.): Semi-Supervised Learning. MIT Press, Cambridge (2006)


  6. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS), JMLR Workshop and Conference Proceedings, vol. 9, pp. 249–256. JMLR W&CP (2010)


  7. Hinton, G., Salakhutdinov, R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)


  8. Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are universal approximators. Neural Networks 2(5), 359–366 (1989)


  9. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)


  10. Raiko, T., Valpola, H., LeCun, Y.: Deep learning made easier by linear transformations in perceptrons. In: Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS), JMLR Workshop and Conference Proceedings, vol. 22, pp. 924–932. JMLR W&CP (April 2012)


  11. Ranzato, M., Huang, F.J., Boureau, Y.L., LeCun, Y.: Unsupervised learning of invariant feature hierarchies with applications to object recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8 (2007)


  12. Ranzato, M., Poultney, C., Chopra, S., LeCun, Y.: Efficient learning of sparse representations with an energy-based model. In: Schölkopf, B., Platt, J., Hoffman, T. (eds.) Advances in Neural Information Processing Systems 19, pp. 1137–1144. MIT Press, Cambridge (2007)


  13. Rifai, S., Mesnil, G., Vincent, P., Muller, X., Bengio, Y., Dauphin, Y., Glorot, X.: Higher order contractive auto-encoder. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds.) ECML PKDD 2011, Part II. LNCS, vol. 6912, pp. 645–660. Springer, Heidelberg (2011)


  14. Rifai, S., Vincent, P., Muller, X., Glorot, X., Bengio, Y.: Contractive auto-encoders: Explicit invariance during feature extraction. In: Getoor, L., Scheffer, T. (eds.) Proceedings of the 28th International Conference on Machine Learning (ICML), pp. 833–840. ACM, New York (2011)


  15. Schulz, H., Behnke, S.: Learning two-layer contractive encodings. In: Villa, A.E.P., Duch, W., Érdi, P., Masulli, F., Palm, G. (eds.) ICANN 2012, Part I. LNCS, vol. 7552, pp. 620–628. Springer, Heidelberg (2012)


  16. Vatanen, T., Raiko, T., Valpola, H., LeCun, Y.: Pushing stochastic gradient towards second-order methods – backpropagation learning with transformations in nonlinearities. arXiv:1301.3476 [cs.LG] (May 2013)





Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schulz, H., Cho, K., Raiko, T., Behnke, S. (2013). Two-Layer Contractive Encodings with Shortcuts for Semi-supervised Learning. In: Lee, M., Hirose, A., Hou, Z.-G., Kil, R.M. (eds) Neural Information Processing. ICONIP 2013. Lecture Notes in Computer Science, vol 8226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42054-2_56


  • DOI: https://doi.org/10.1007/978-3-642-42054-2_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-42053-5

  • Online ISBN: 978-3-642-42054-2

  • eBook Packages: Computer Science; Computer Science (R0)
