Abstract
Image tagging automatically assigns semantic keywords, called tags, to a query image, which significantly facilitates image search and organization. Since tags and visual content are represented in different feature spaces, an important problem is how to merge these features according to their correlations when tagging the query image. However, most existing approaches merge the features with a relatively simple mechanism rather than fully exploiting the correlations between them. In this paper, we propose a new approach to image tagging that fuses different features and their correlations simultaneously. Specifically, we employ a Feature Correlation Graph, which takes features as nodes and their correlations as edges, to capture the correlations between different features in an integrated manner. A revised probabilistic model based on Markov Random Fields is then used to describe the graph and to evaluate a tag’s relevance to the query image. Based on this model, we design an image tagging algorithm for large-scale web image datasets. We evaluate our approach on two large real-life corpora collected from Flickr, and the experimental results indicate that it outperforms state-of-the-art techniques.
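The abstract does not give the form of the revised model, so the following is only a minimal sketch of the general idea, assuming a generic log-linear score over a graph whose nodes are features and whose edges carry correlation weights. The function names, the weights `lam_node` and `lam_edge`, the feature names, and all numbers are hypothetical illustrations, not the authors' algorithm.

```python
# Illustrative sketch only: a generic graph-based score, NOT the paper's model.
# Nodes are features; edges carry assumed correlation weights between features.

def tag_score(node_relevance, edge_correlation, lam_node=1.0, lam_edge=0.5):
    """Unnormalised score of one candidate tag for a query image.

    node_relevance:   {feature_name: relevance of the tag under that feature}
    edge_correlation: {(feature_a, feature_b): correlation weight}
    lam_node, lam_edge: hypothetical weights for node and edge terms.
    """
    # Node term: evidence from each feature taken on its own.
    node_term = sum(node_relevance.values())
    # Edge term: correlated features that agree reinforce each other.
    edge_term = sum(
        weight * node_relevance[a] * node_relevance[b]
        for (a, b), weight in edge_correlation.items()
    )
    return lam_node * node_term + lam_edge * edge_term


def rank_tags(candidates, edge_correlation, top_k=3):
    """Return the top_k candidate tags ordered by descending score."""
    scored = {
        tag: tag_score(relevance, edge_correlation)
        for tag, relevance in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:top_k]


# Made-up per-feature relevance values for three candidate tags.
candidates = {
    "beach": {"color": 0.8, "texture": 0.6, "text": 0.5},
    "car": {"color": 0.2, "texture": 0.3, "text": 0.1},
    "sunset": {"color": 0.9, "texture": 0.2, "text": 0.4},
}
# Made-up correlation weights on the edges of the feature graph.
edges = {
    ("color", "texture"): 0.5,
    ("color", "text"): 0.2,
    ("texture", "text"): 0.1,
}

ranking = rank_tags(candidates, edges)
print(ranking)
```

In this toy setting, a tag supported by several mutually correlated features scores higher than one supported by a single feature, which is the intuition behind fusing features together with their correlations rather than summing them independently.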
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61170189 and 61003111) and the fund of the State Key Laboratory of Software Development Environment (No. SKLSDE-2011ZX-03). The authors would like to thank the editors and the anonymous reviewers for their valuable comments and remarks on previous versions of this paper.
Cite this article
Zhang, X., Li, Z. & Chao, W. Tagging image by merging multiple features in a integrated manner. J Intell Inf Syst 39, 87–107 (2012). https://doi.org/10.1007/s10844-011-0184-1
DOI: https://doi.org/10.1007/s10844-011-0184-1