Abstract
With the explosive growth of social multimedia communities, social images have become ubiquitous in daily life. The labels associated with these images are a valuable resource for automatic image annotation, but they tend to be unreliable. In this paper, we address the problem of image annotation from real-world community-contributed images whose associated labels are incorrect, insufficient, and personalized. We present SNTag, a novel semantic neighborhood learning method with which the image annotation task can be carried out efficiently in real-world scenarios. First, we use image-associated labels as supervising information to guide the replenishment of training images, which makes the labels of the training images both more complete and more accurate. Then, a “semantic balanced neighborhood” is generated for each image, which allows more rare labels to appear in the image's label list. Furthermore, we generate a “semantic consistent neighborhood” within the corresponding “semantic balanced neighborhood”, so that the retrieved neighbor images are not only visually alike but also semantically related. In contrast to earlier work, these neighbors are retrieved from the same subspace by integrating metric learning embedded in multiple labels with sparse reconstruction. Based on the neighbor set, we propose a novel algorithm that assigns the optimal labels to the image and is more robust to noise. We conduct extensive experiments on several standard real-world benchmark data sets downloaded from community websites. The experimental results demonstrate that SNTag outperforms the current state-of-the-art methods.
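The neighborhood idea summarized above can be illustrated with a minimal sketch. This is not the SNTag algorithm itself: it assumes plain Euclidean distance in place of the learned metric and sparse reconstruction, and an exponential distance weighting for the votes. The function names `balanced_neighborhood` and `vote_labels`, the parameters `k` and `top`, and the toy data are all hypothetical.

```python
import numpy as np

def balanced_neighborhood(query, features, labels, k=2):
    """Pick the k nearest training images for every label, so that
    rare labels are still represented in the neighbor set.
    (Assumption: Euclidean distance stands in for a learned metric.)"""
    dists = np.linalg.norm(features - query, axis=1)
    chosen = set()
    for label in range(labels.shape[1]):
        members = np.flatnonzero(labels[:, label])          # images tagged with this label
        nearest = members[np.argsort(dists[members])[:k]]   # k closest among them
        chosen.update(nearest.tolist())
    return np.array(sorted(chosen)), dists

def vote_labels(query, features, labels, k=2, top=2):
    """Score each label by distance-weighted votes from the neighbor set
    (assumed weighting: exp(-distance)) and return the top-scoring labels."""
    idx, dists = balanced_neighborhood(query, features, labels, k)
    weights = np.exp(-dists[idx])
    scores = weights @ labels[idx]
    return np.argsort(-scores)[:top], scores

# Toy data: five training images, 2-D features, three labels (hypothetical).
features = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.2, 0.1]])
labels = np.array([[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1], [0, 1, 0]])
query = np.array([0.0, 0.05])

top_labels, scores = vote_labels(query, features, labels, k=1, top=2)
print(top_labels, scores)
```

Selecting neighbors per label rather than globally is what keeps infrequent labels in play; the weighted vote then lets visually close neighbors dominate the final ranking.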
Acknowledgements
Special thanks go to the collaborators in the Lab for Media Search at the National University of Singapore for their instructive advice and useful suggestions on this work. I am deeply grateful for their help in completing it. This work is supported by the Natural Science Foundation of China (Nos. 61502094, 61402099) and the Natural Science Foundation of Heilongjiang Province of China (Nos. F2016002, F2015020).
About this article
Cite this article
Tian, F., Shen, X. & Shang, F. Automatic image annotation with real-world community contributed data set. Multimedia Systems 25, 463–474 (2019). https://doi.org/10.1007/s00530-017-0548-7
DOI: https://doi.org/10.1007/s00530-017-0548-7