
Tagging image by merging multiple features in a integrated manner

Journal of Intelligent Information Systems

Abstract

Image tagging automatically assigns semantic keywords, called tags, to a query image, which significantly facilitates image search and organization. Because tags and visual content are represented in different feature spaces, an important problem is how to merge these features according to their correlations when tagging the query image. However, most existing approaches merge the features with a relatively simple mechanism rather than fully exploiting the correlations between them. In this paper, we propose a new approach that fuses different features and their correlations simultaneously for image tagging. Specifically, we employ a Feature Correlation Graph, which takes features as nodes and their correlations as edges, to capture the correlations between different features in an integrated manner. A revised probabilistic model based on Markov Random Fields then describes the graph and evaluates each tag's relevance to the query image. Based on this model, we design an image tagging algorithm for large-scale web image datasets. We evaluate our approach on two large real-life corpora collected from Flickr, and the experimental results indicate that it outperforms state-of-the-art techniques.
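The abstract does not give the graph construction or the potential functions, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes discrete feature tokens, a toy corpus, smoothed co-occurrence estimates as potentials, and a log-linear combination of node and edge terms in the style of MRF retrieval models. The names `CORPUS`, `node_potential`, `edge_potential`, `score_tag`, and `rank_tags`, and the weights `w_node` and `w_edge`, are all hypothetical.

```python
import math
from itertools import combinations

# Hypothetical toy corpus: each image is a set of discrete feature tokens
# (for example quantized visual words) paired with its user-supplied tags.
CORPUS = [
    ({"blue", "wave", "sand"}, {"beach", "sea"}),
    ({"blue", "wave"}, {"sea"}),
    ({"green", "leaf", "trunk"}, {"forest"}),
    ({"green", "leaf"}, {"forest", "park"}),
    ({"blue", "sand"}, {"beach"}),
]


def node_potential(tag, feature, eps=0.1):
    """Smoothed estimate of P(tag | feature): one node of the graph."""
    with_f = [tags for feats, tags in CORPUS if feature in feats]
    hits = sum(tag in tags for tags in with_f)
    return (hits + eps) / (len(with_f) + 2 * eps)


def edge_potential(tag, f, g, eps=0.1):
    """Smoothed estimate of P(tag | f and g co-occur): one edge of the graph."""
    with_fg = [tags for feats, tags in CORPUS if f in feats and g in feats]
    hits = sum(tag in tags for tags in with_fg)
    return (hits + eps) / (len(with_fg) + 2 * eps)


def score_tag(tag, query_features, w_node=0.7, w_edge=0.3):
    """Log-linear score: weighted sum of log node and log edge potentials."""
    feats = sorted(query_features)
    node_term = sum(math.log(node_potential(tag, f)) for f in feats)
    edge_term = sum(
        math.log(edge_potential(tag, f, g)) for f, g in combinations(feats, 2)
    )
    return w_node * node_term + w_edge * edge_term


def rank_tags(query_features, top_k=3):
    """Rank every tag in the corpus vocabulary for the query's features."""
    vocab = sorted({t for _, tags in CORPUS for t in tags})
    ranked = sorted(vocab, key=lambda t: score_tag(t, query_features), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    print(rank_tags({"blue", "wave"}))
```

In this sketch the edge term is what separates correlated feature pairs from independent evidence: a tag that co-occurs with both `blue` and `wave` together is rewarded beyond what the two node terms contribute on their own. The weights are fixed by hand here; a real system would have to learn or tune them.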



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61170189 and 61003111) and the fund of the State Key Laboratory of Software Development Environment (No. SKLSDE-2011ZX-03). The authors would like to thank the editors and the anonymous reviewers for their valuable comments and remarks on the previous versions of this paper.

Author information

Corresponding author

Correspondence to Xiaoming Zhang.

About this article

Cite this article

Zhang, X., Li, Z. & Chao, W. Tagging image by merging multiple features in a integrated manner. J Intell Inf Syst 39, 87–107 (2012). https://doi.org/10.1007/s10844-011-0184-1

  • DOI: https://doi.org/10.1007/s10844-011-0184-1
