Image classification and annotation based on robust regularized coding

  • Original Paper
  • Signal, Image and Video Processing

Abstract

In recent years, sparse coding has been widely applied to construct high-level image representations in computer vision applications. However, one of the major deficiencies of sparse coding is that it fails to capture spatial context in the data: similar descriptors may be quantized into different visual words during the feature quantization process. In this paper, we propose a novel coding scheme called robust regularized coding (RRC), which exploits the geometrical information among local descriptors to boost the discriminating capability of the resultant features. More specifically, both a locality constraint term and a smoothness constraint term on the RRC codes are incorporated into the objective function to preserve the local invariance of the codes. In addition, to scale up to larger databases, a novel online learning algorithm that requires no hyperparameter tuning is proposed to incrementally update the codebook. The resulting RRC codes are then used to represent images for classification and annotation tasks in our experiments. We also propose an effective reconstruction-based image annotation algorithm that propagates the labels of training images to a test image by multi-label linear embedding. Experimental results on several benchmark datasets demonstrate that our approach achieves significant performance improvements over the state of the art.
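
As a rough illustration of how a locality term and a smoothness term can sit together in one coding objective, the sketch below minimizes a reconstruction error plus a distance-weighted penalty on each code plus a graph penalty that pulls the codes of neighbouring descriptors together. It is not the RRC formulation, which this preview does not reproduce. The objective itself, the exponential locality weights, the affinity matrix `W` (assumed symmetric, non-negative, zero diagonal), the fixed codebook `B`, and the parameters `lam`, `mu`, and `sigma` are all assumptions made for the example.

```python
import numpy as np

def locality_penalty(X, B, sigma=1.0):
    """Per-descriptor penalty weights in (0, 1]; larger for distant atoms.

    X: (N, D) descriptors, B: (K, D) codebook atoms (hypothetical layout).
    """
    dist = np.linalg.norm(X[:, None, :] - B[None, :, :], axis=2)  # (N, K)
    return np.exp((dist - dist.max(axis=1, keepdims=True)) / sigma)

def objective(X, B, C, D, W, lam, mu):
    """Illustrative objective: reconstruction + locality + smoothness."""
    recon = np.sum((X - C @ B) ** 2)
    local = lam * np.sum((D * C) ** 2)
    diff = C[:, None, :] - C[None, :, :]                # (N, N, K)
    smooth = 0.5 * mu * np.sum(W * np.sum(diff ** 2, axis=2))
    return recon + local + smooth

def encode(X, B, W, lam=0.1, mu=0.1, sigma=1.0, sweeps=10):
    """Block coordinate descent: solve for one code at a time.

    With all other codes fixed, the objective is a strictly convex
    quadratic in c_i, so each update is an exact linear solve.
    """
    N, K = X.shape[0], B.shape[0]
    D = locality_penalty(X, B, sigma)
    G = B @ B.T                  # (K, K) Gram matrix of the codebook
    BX = X @ B.T                 # (N, K); row i equals B @ x_i
    deg = W.sum(axis=1)          # weighted degree of each descriptor
    C = np.zeros((N, K))
    for _ in range(sweeps):
        for i in range(N):
            A = G + np.diag(lam * D[i] ** 2 + mu * deg[i])
            rhs = BX[i] + mu * (W[i] @ C)   # neighbours pull c_i toward them
            C[i] = np.linalg.solve(A, rhs)
    return C

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 8))            # toy descriptors
    B = rng.normal(size=(5, 8))             # toy codebook
    W = rng.random((20, 20))
    W = 0.5 * (W + W.T)                     # symmetric affinities
    np.fill_diagonal(W, 0.0)                # no self-edges
    C = encode(X, B, W)
    print(C.shape)
```

Setting `mu=0` decouples the descriptors, so each code reduces to an independent regularized least-squares solve. A positive `mu` trades reconstruction accuracy for codes that vary smoothly over the affinity graph.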

Author information

Corresponding author

Correspondence to Haixia Zheng.

About this article

Cite this article

Zheng, H., Ip, H.H.S. Image classification and annotation based on robust regularized coding. SIViP 10, 55–64 (2016). https://doi.org/10.1007/s11760-014-0701-0

  • DOI: https://doi.org/10.1007/s11760-014-0701-0
