Abstract
Recently, deep architectures such as stacked auto-encoders (SAEs) have been used to learn features from unlabeled data. However, it is difficult to obtain multi-level visual information from a single traditional deep architecture such as an SAE. In this paper, a feature representation method that concatenates Multiple Different Stacked Auto-Encoders (MDSAEs) is presented. The proposed method imitates the human visual cortex, which recognizes objects from different views. The output of the last hidden layer of each SAE is regarded as one kind of feature, and several such features are concatenated into a final representation according to their weights (the outputs of deeper architectures are assigned higher weights, and vice versa). In this way, the hierarchical structure of the human brain cortex is simulated. Experimental results for classification on the MNIST and CIFAR-10 datasets demonstrate superior performance.
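The weighted-concatenation step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoder weights below are random stand-ins (in the paper each SAE would be trained layer-wise on unlabeled data), and the per-SAE weights `alphas` are illustrative values chosen so that deeper architectures receive higher weight.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sae(layer_sizes):
    """Random stand-in for a trained stacked auto-encoder:
    a list of (W, b) pairs, one per hidden layer."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def encode(sae, x):
    """Forward pass; the last hidden layer's activation is the feature."""
    h = x
    for W, b in sae:
        h = 1.0 / (1.0 + np.exp(-(h @ W + b)))  # sigmoid units
    return h

def mdsae_features(saes, alphas, x):
    """Concatenate each SAE's last-hidden-layer output, scaled by its
    weight (deeper SAEs get higher weights in the paper's scheme)."""
    return np.concatenate(
        [a * encode(s, x) for s, a in zip(saes, alphas)], axis=-1)

# Three SAEs of different depths over 784-dim inputs (e.g. MNIST pixels).
saes = [make_sae([784, 256]),           # shallow
        make_sae([784, 256, 128]),      # deeper
        make_sae([784, 256, 128, 64])]  # deepest
alphas = [0.2, 0.3, 0.5]                # illustrative: deeper -> higher weight

x = rng.standard_normal((5, 784))       # a batch of 5 fake images
f = mdsae_features(saes, alphas, x)
print(f.shape)                          # (5, 448): 256 + 128 + 64 features
```

The final 448-dimensional vector would then be fed to a classifier; the weighting keeps the contribution of shallow, low-level features without letting them dominate the deeper, more abstract ones.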
Acknowledgements
The research was supported by the National Natural Science Foundation of China (Nos. 61231015, 61172173, 61303114, 61170023); the National High Technology Research and Development Program of China (863 Program, No. 2015AA016306); the Technology Research Program of the Ministry of Public Security (No. 2014JSYJA016); the EU FP7 QUICK project under Grant Agreement No. PIRSES-GA-2013-612652; the Major Science and Technology Innovation Plan of Hubei Province (No. 2013AAA020); the Internet of Things Development Funding Project of the Ministry of Industry in 2013 (No. 25); the China Postdoctoral Science Foundation (Nos. 2013M530350, 2014M562058); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130141120024); the Natural Science Foundation of Hubei Province (No. 2014CFB712); the Fundamental Research Funds for the Central Universities (Nos. 2042014kf0025, 2042014kf0250, 2014211020203); and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry ([2014]1685).
© 2015 Springer International Publishing Switzerland
Cite this paper
Xiong, M. et al. (2015). Deep Feature Representation via Multiple Stack Auto-Encoders. In: Ho, Y.S., Sang, J., Ro, Y., Kim, J., Wu, F. (eds) Advances in Multimedia Information Processing – PCM 2015. Lecture Notes in Computer Science, vol. 9314. Springer, Cham. https://doi.org/10.1007/978-3-319-24075-6_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24074-9
Online ISBN: 978-3-319-24075-6