Image classification and annotation based on robust regularized coding

Zheng, Haixia; Ip, Horace H. S.

doi:10.1007/s11760-014-0701-0

Original Paper
Published: 10 October 2014

Volume 10, pages 55–64 (2016)

Signal, Image and Video Processing

Haixia Zheng¹ & Horace H. S. Ip¹

Abstract

In recent years, sparse coding has been widely applied to construct high-level image representations in computer vision applications. However, one of the major deficiencies of sparse coding is that it fails to capture spatial context in the data: similar descriptors may be quantized into different visual words during the feature quantization process. In this paper, we propose a novel coding scheme called robust regularized coding (RRC), which exploits the geometrical information among local descriptors to boost the discriminating capability of the resultant features. More specifically, both locality constraint and smoothness constraint terms with respect to the RRC codes are incorporated into the objective function to preserve the local invariance of the RRC codes. In addition, to scale up to larger databases, a novel online learning algorithm with no hyperparameter tuning is proposed to incrementally update the codebook. The obtained RRC codes are then employed to represent images for classification and annotation tasks in our experiments. We also propose an effective reconstruction-based image annotation algorithm that propagates the labels of training images to a test image by multi-label linear embedding. Experimental results on several benchmark datasets demonstrate that our approach achieves significant performance improvements over the state of the art.
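
The abstract does not state the RRC objective itself, so the following is only a rough illustration of how a locality term and a smoothness term can be combined in one coding step. It assumes a squared-error reconstruction term, a locality penalty weighted by the squared distance from the descriptor to each codebook atom, and a weighted penalty pulling the code toward the codes of neighbouring descriptors. The function names, the parameters `lam` and `beta`, and the exact form of each term are assumptions made for this sketch, not the paper's formulation. Under these assumptions the objective is quadratic, so the code has a closed-form solution.

```python
def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def regularized_code(x, atoms, neighbor_codes, neighbor_weights,
                     lam=0.1, beta=0.1):
    """Hypothetical coding step (not the paper's exact objective).

    Minimises over the code c:
        ||x - sum_i c_i * atoms[i]||^2        (reconstruction)
      + lam  * sum_i d2_i * c_i^2             (locality, d2_i = ||x - atoms[i]||^2)
      + beta * sum_j w_j * ||c - c_j||^2      (smoothness toward neighbour codes)
    """
    k = len(atoms)
    # Locality weights: squared distance from x to each atom.
    d2 = [sum((a_i - x_i) ** 2 for a_i, x_i in zip(a, x)) for a in atoms]
    w_sum = sum(neighbor_weights)
    # Normal equations: (G + lam*diag(d2) + beta*w_sum*I) c = B^T x + beta*sum_j w_j c_j
    A = [[sum(p * q for p, q in zip(atoms[i], atoms[j])) for j in range(k)]
         for i in range(k)]
    for i in range(k):
        A[i][i] += lam * d2[i] + beta * w_sum
    rhs = [sum(p * q for p, q in zip(atoms[i], x))
           + beta * sum(w * c[i] for w, c in zip(neighbor_weights, neighbor_codes))
           for i in range(k)]
    return solve_linear(A, rhs)
```

In this sketch, raising `lam` concentrates the code on atoms close to the descriptor, while raising `beta` makes the codes of neighbouring descriptors agree, which is the kind of local invariance the abstract describes. With both set to zero and an orthonormal codebook, the code reduces to the plain projection of the descriptor onto the atoms.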

Author information

Authors and Affiliations

Department of Computer Science, Centre for Innovative Applications of Internet and Multimedia Technologies (AIMtech Centre), City University of Hong Kong, Kowloon, Hong Kong
Haixia Zheng & Horace H. S. Ip

Corresponding author

Correspondence to Haixia Zheng.

About this article

Cite this article

Zheng, H., Ip, H.H.S. Image classification and annotation based on robust regularized coding. SIViP 10, 55–64 (2016). https://doi.org/10.1007/s11760-014-0701-0

Received: 19 September 2013
Revised: 12 August 2014
Accepted: 13 September 2014
Published: 10 October 2014
Issue Date: January 2016
DOI: https://doi.org/10.1007/s11760-014-0701-0
