Deep Feature Representation via Multiple Stack Auto-Encoders

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9314)

Abstract

Recently, deep architectures such as stacked auto-encoders (SAEs) have been used to learn features from unlabeled data. However, it is difficult to obtain multi-level visual information from traditional deep architectures such as SAEs. In this paper, a feature representation method that concatenates Multiple Different Stack Auto-Encoders (MDSAEs) is presented. The proposed method tries to imitate the human visual cortex, which recognizes objects from different views. The output of the last hidden layer of each SAE can be regarded as one kind of feature. Several such features are concatenated to form the final representation according to their weights (the outputs of deeper architectures are assigned higher weights, and vice versa). In this way, the hierarchical structure of the human brain cortex can be simulated. Experimental results for classification on the MNIST and CIFAR-10 datasets demonstrate the superior performance of the method.
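The weighted concatenation described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each SAE's last-hidden-layer activations are already computed, and the depth-proportional weighting is a hypothetical choice, since the exact weighting scheme is not given here.

```python
import numpy as np

def mdsae_representation(features, depths):
    """Concatenate per-SAE feature vectors into one representation,
    scaling each block by a weight that grows with the SAE's depth
    (deeper SAE -> higher weight, as the abstract describes).

    features: list of 1-D arrays, the last-hidden-layer output of each SAE
    depths:   list of ints, the number of hidden layers in each SAE
    """
    # Hypothetical weighting: normalize depths so the weights sum to 1.
    weights = np.asarray(depths, dtype=float)
    weights /= weights.sum()
    # Scale each feature block by its weight, then concatenate.
    return np.concatenate([w * f for w, f in zip(weights, features)])

# Example: three SAEs of depth 1, 2, and 3, each yielding a 4-D feature.
feats = [np.ones(4), np.ones(4), np.ones(4)]
rep = mdsae_representation(feats, depths=[1, 2, 3])
```

The resulting vector (here 12-dimensional) would then be fed to a classifier; any normalization of the individual SAE outputs before weighting is likewise a design choice left open by the abstract.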



Acknowledgements

The research was supported by the National Natural Science Foundation of China (Nos. 61231015, 61172173, 61303114, 61170023), the National High Technology Research and Development Program of China (863 Program, No. 2015AA016306), the Technology Research Program of the Ministry of Public Security (No. 2014JSYJA016), the EU FP7 QUICK project under Grant Agreement No. PIRSES-GA-2013-612652, the Major Science and Technology Innovation Plan of Hubei Province (No. 2013AAA020), the Internet of Things Development Funding Project of the Ministry of Industry in 2013 (No. 25), the China Postdoctoral Science Foundation (Nos. 2013M530350, 2014M562058), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130141120024), the Natural Science Foundation of Hubei Province (No. 2014CFB712), the Fundamental Research Funds for the Central Universities (Nos. 2042014kf0025, 2042014kf0250, 2014211020203), and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry ([2014]1685).

Author information

Correspondence to Jun Chen.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Xiong, M. et al. (2015). Deep Feature Representation via Multiple Stack Auto-Encoders. In: Ho, Y.-S., Sang, J., Ro, Y., Kim, J., Wu, F. (eds) Advances in Multimedia Information Processing -- PCM 2015. Lecture Notes in Computer Science, vol. 9314. Springer, Cham. https://doi.org/10.1007/978-3-319-24075-6_27

  • DOI: https://doi.org/10.1007/978-3-319-24075-6_27

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24074-9

  • Online ISBN: 978-3-319-24075-6

  • eBook Packages: Computer Science (R0)
