End-to-End Trained Sparse Coding Network with Spatial Pyramid Pooling for Image Classification

Chen, Boheng; Wang, Yige; Wei, Gang; Li, Jie; Ma, Biyun

doi:10.1007/s11063-018-9967-5

Published: 22 January 2019

Neural Processing Letters, Volume 50, pages 2021–2036 (2019)



Abstract

Spatial pyramid matching using sparse coding (ScSPM) has become an efficient method and a benchmark in image classification. However, because its dictionary is learned without supervision, the trained dictionary may be suboptimal for classification. To further improve classification accuracy, we propose a sparse coding network with spatial pyramid pooling that is trained end to end. In the proposed system, the minimization problem in sparse coding is modeled as a feed-forward neural network, and image features are extracted by a deep convolutional network. By minimizing the final classifier loss with end-to-end deep learning, the sparse coding network can be trained in a supervised way. The proposed model is tested on three image databases, where it significantly outperforms ScSPM in classification accuracy. Compared with other image classification approaches based on deep learning, it also achieves a noticeable improvement.
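
The two ideas named in the abstract, sparse coding expressed as a fixed number of feed-forward steps and spatial pyramid pooling of the resulting codes, can be illustrated with a small sketch. It is not the authors' implementation: the full text is not available here, so the sketch assumes a plain unrolled ISTA iteration for the coding step and ScSPM-style max pooling of absolute values over a 1x1 and 2x2 grid. All function names, the toy dictionary D, and the parameter values are hypothetical, and nothing is trained (in an end-to-end setting the dictionary and thresholds would be learned from the classifier loss).

```python
# Illustrative sketch only: unrolled ISTA plus spatial pyramid max pooling.
# All names and sizes are hypothetical and are not taken from the paper.

def soft_threshold(v, theta):
    """Shrink each entry of v toward zero by theta (proximal step for L1)."""
    return [max(abs(x) - theta, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def unrolled_ista(x, D, lam, step, n_layers):
    """Approximate argmin_z 0.5*||x - D z||^2 + lam*||z||_1 with a fixed
    number of ISTA iterations; each iteration plays the role of one layer.
    Assumes step <= 1 / (largest eigenvalue of D^T D)."""
    Dt = [list(col) for col in zip(*D)]  # transpose of the dictionary
    z = [0.0] * len(Dt)
    for _ in range(n_layers):
        residual = [xi - ri for xi, ri in zip(x, matvec(D, z))]
        grad_step = [zi + step * gi for zi, gi in zip(z, matvec(Dt, residual))]
        z = soft_threshold(grad_step, step * lam)
    return z


def spatial_pyramid_max_pool(codes, levels=(1, 2)):
    """Max-pool absolute values of an H x W grid of code vectors over n x n
    cells for each n in levels, then concatenate into one fixed-length vector.
    Assumes H and W are at least max(levels), so no cell is empty."""
    H, W, K = len(codes), len(codes[0]), len(codes[0][0])
    pooled = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                r0, r1 = i * H // n, (i + 1) * H // n
                c0, c1 = j * W // n, (j + 1) * W // n
                cell = [codes[r][c] for r in range(r0, r1) for c in range(c0, c1)]
                pooled.extend(max(abs(v[k]) for v in cell) for k in range(K))
    return pooled


# Toy usage: a 2 x 3 dictionary (3 atoms), one code per location of a 2 x 2 grid.
D = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5]]
grid = [[unrolled_ista([1.0, 0.2], D, lam=0.1, step=0.5, n_layers=10)
         for _ in range(2)] for _ in range(2)]
feature = spatial_pyramid_max_pool(grid)
print(len(feature))  # 15 = 3 atoms * (1 + 4) pyramid cells
```

The pooled vector has the same length regardless of the grid size, which is what allows a single classifier to sit on top of images of different resolutions.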



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61302055, 61401160, 61327005), the Science and Technology Planning Project of Guangdong Province (No. 2017A020214011), the Funds for the Central Universities (No. 2017MS039), the Guangdong Provincial Key Laboratory of Short-Range Wireless Detection and Communication (Nos. 2014B030301010, 2017B030314003), the Science and Technology Program of Guangzhou (No. 201804020079), and the Project sponsored by SRF for ROCS, SEM.

Author information

Authors and Affiliations

National Engineering Technology Research Center for Mobile Ultrasonic Detection, School of Electronics and Information Engineering, South China University of Technology, Guangzhou, People’s Republic of China
Boheng Chen, Yige Wang, Gang Wei, Jie Li & Biyun Ma

Corresponding author

Correspondence to Boheng Chen.

About this article

Cite this article

Chen, B., Wang, Y., Wei, G. et al. End-to-End Trained Sparse Coding Network with Spatial Pyramid Pooling for Image Classification. Neural Process Lett 50, 2021–2036 (2019). https://doi.org/10.1007/s11063-018-9967-5

Published: 22 January 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11063-018-9967-5
