Learning a Discriminative Dictionary with CNN for Image Classification

  • Conference paper
Neural Information Processing (ICONIP 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9948)

Abstract

In this paper, we propose a novel framework for image recognition based on an extended sparse model. First, inspired by the impressive results of CNNs across different computer vision tasks, we use CNN models pre-trained on large datasets to generate features. Then we propose an extended sparse model that learns a dictionary from the CNN features by incorporating a reconstruction residual term and a coefficients adjustment term. Minimizing the reconstruction residual term ensures that each class-specific sub-dictionary has good representation power for samples from the corresponding class, while minimizing the coefficients adjustment term encourages samples from different classes to be reconstructed by different class-specific sub-dictionaries. With this learned dictionary, both the representation residual and the representation coefficients are discriminative. Finally, a metric incorporating this discriminative information is introduced for image classification. Experiments on the Caltech101 and PASCAL VOC 2012 datasets show the effectiveness of the proposed method for image classification.
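
The abstract does not state the objective function or the classification metric, so the following is only a minimal illustrative sketch of the general idea of classifying with class-specific sub-dictionaries. Everything in it is an assumption rather than the authors' method: the sparse coder is plain ISTA on an l1-regularized least-squares problem, the score combines a per-class reconstruction residual with a penalty on coefficients that fall outside that class, and the names `sparse_code`, `classify`, `lam`, and `w` are hypothetical. In practice the input vector would be a CNN feature; here it is just a NumPy array.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_code(D, x, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 with ISTA.

    Assumption: a generic sparse coder, not the solver used in the paper.
    """
    # Step size 1/L, where L is the largest eigenvalue of D^T D.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - step * grad, step * lam)
    return a


def classify(D, atom_labels, x, lam=0.1, w=0.5):
    """Return the class whose sub-dictionary best explains x.

    Hypothetical score for class c:
        ||x - D_c a_c||^2 + w * ||coefficients on atoms of other classes||_1
    so both the residual and the coefficients influence the decision.
    """
    a = sparse_code(D, x, lam)
    classes = np.unique(atom_labels)
    scores = []
    for c in classes:
        mask = atom_labels == c
        residual = np.sum((x - D[:, mask] @ a[mask]) ** 2)
        off_class = np.sum(np.abs(a[~mask]))
        scores.append(residual + w * off_class)
    return classes[int(np.argmin(scores))]


if __name__ == "__main__":
    # Toy dictionary: four atoms, two per class (columns are atoms).
    D = np.eye(4)
    atom_labels = np.array([0, 0, 1, 1])
    x = np.array([1.0, 0.5, 0.0, 0.0])  # stand-in for a CNN feature vector
    print(classify(D, atom_labels, x))
```

The weight `w` trades off the two sources of evidence; with `w = 0` the rule reduces to a purely residual-based decision.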

Acknowledgments

This research is partly supported by the 973 Plan, China (No. 2015CB856004) and NSFC, China (No. 61572315).

Author information

Corresponding author

Correspondence to Jie Yang.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Yu, S., Zhang, T., Ma, C., Zhou, L., Yang, J., He, X. (2016). Learning a Discriminative Dictionary with CNN for Image Classification. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9948. Springer, Cham. https://doi.org/10.1007/978-3-319-46672-9_22

  • DOI: https://doi.org/10.1007/978-3-319-46672-9_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46671-2

  • Online ISBN: 978-3-319-46672-9

  • eBook Packages: Computer Science, Computer Science (R0)
