Abstract
Recently, deep architectures such as stacked auto-encoders (SAEs) have been used to learn features from unlabeled data. However, it is difficult to obtain multi-level visual information from a single traditional deep architecture such as an SAE. In this paper, a feature representation method that concatenates Multiple Different Stacked Auto-Encoders (MDSAEs) is presented. The proposed method imitates the human visual cortex, which recognizes objects from different views. The output of the last hidden layer of each SAE is regarded as one kind of feature, and several such features are concatenated into a final representation according to their weights (the outputs of deeper architectures are assigned higher weights, and vice versa). In this way, the hierarchical structure of the human brain cortex is simulated. Experimental results for classification on the MNIST and CIFAR-10 datasets demonstrate superior performance.
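The weighted-concatenation step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoder weights below are random stand-ins (in the paper each SAE would be trained layer-wise on unlabeled data), and the per-SAE weights `alphas` are illustrative values chosen so that deeper architectures receive higher weight.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sae(layer_sizes):
    """Random stand-in for a trained stacked auto-encoder:
    a list of (W, b) pairs, one per hidden layer."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def encode(sae, x):
    """Forward pass; the last hidden layer's activation is the feature."""
    h = x
    for W, b in sae:
        h = 1.0 / (1.0 + np.exp(-(h @ W + b)))  # sigmoid units
    return h

def mdsae_features(saes, alphas, x):
    """Concatenate each SAE's last-hidden-layer output, scaled by its
    weight (deeper SAEs get higher weights in the paper's scheme)."""
    return np.concatenate(
        [a * encode(s, x) for s, a in zip(saes, alphas)], axis=-1)

# Three SAEs of different depths over 784-dim inputs (e.g. MNIST pixels).
saes = [make_sae([784, 256]),           # shallow
        make_sae([784, 256, 128]),      # deeper
        make_sae([784, 256, 128, 64])]  # deepest
alphas = [0.2, 0.3, 0.5]                # illustrative: deeper -> higher weight

x = rng.standard_normal((5, 784))       # a batch of 5 fake images
f = mdsae_features(saes, alphas, x)
print(f.shape)                          # (5, 448): 256 + 128 + 64 features
```

The final 448-dimensional vector would then be fed to a classifier; the weighting keeps the contribution of shallow, low-level features without letting them dominate the deeper, more abstract ones.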
Acknowledgements
The research was supported by the National Natural Science Foundation of China (Nos. 61231015, 61172173, 61303114, 61170023); the National High Technology Research and Development Program of China (863 Program, No. 2015AA016306); the Technology Research Program of the Ministry of Public Security (No. 2014JSYJA016); the EU FP7 QUICK project under Grant Agreement No. PIRSES-GA-2013-612652; the Major Science and Technology Innovation Plan of Hubei Province (No. 2013AAA020); the Internet of Things Development Funding Project of the Ministry of Industry in 2013 (No. 25); the China Postdoctoral Science Foundation (Nos. 2013M530350, 2014M562058); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130141120024); the Natural Science Foundation of Hubei Province (No. 2014CFB712); the Fundamental Research Funds for the Central Universities (Nos. 2042014kf0025, 2042014kf0250, 2014211020203); and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry ([2014]1685).
© 2015 Springer International Publishing Switzerland
Cite this paper
Xiong, M. et al. (2015). Deep Feature Representation via Multiple Stack Auto-Encoders. In: Ho, Y.S., Sang, J., Ro, Y., Kim, J., Wu, F. (eds) Advances in Multimedia Information Processing – PCM 2015. Lecture Notes in Computer Science, vol. 9314. Springer, Cham. https://doi.org/10.1007/978-3-319-24075-6_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24074-9
Online ISBN: 978-3-319-24075-6