Abstract
Image annotation aims to predict labels that accurately describe the semantic content of images. Many methods have been proposed for this problem in the past few years, but the labels they predict are usually incomplete, insufficient and noisy. In this paper, we propose a new method, denoted 2PKNN-GSR (Group Sparse Reconstruction), for image annotation and label refinement. First, we predict labels for the test images with an existing method, 2PKNN, a two-step variant of the classical K-nearest neighbor algorithm. Then, according to the predicted labels, we divide the K nearest neighbors of each image among the training images into several groups. Finally, we use a group sparse reconstruction algorithm to refine the labels obtained in the first step. Experimental results on three standard datasets, i.e., Corel 5K, IAPR TC12 and ESP Game, show that the proposed method outperforms state-of-the-art methods.
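The refinement step can be illustrated with a minimal sketch. The abstract does not state the objective function, so the code below assumes a generic group-lasso formulation, minimising 0.5 * ||x - D w||^2 + lam * sum_g ||w_g||_2 by proximal gradient descent; it is not the authors' exact method. The function names, the parameters `lam` and `n_iter`, the toy data, and the score-aggregation rule in `refine_label_scores` are all hypothetical.

```python
import numpy as np


def group_soft_threshold(w, groups, tau):
    """Block soft-thresholding: shrink each group's coefficients toward zero."""
    out = np.zeros_like(w)
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        if norm > tau:  # groups with a small norm are zeroed out entirely
            out[idx] = (1.0 - tau / norm) * w[idx]
    return out


def group_sparse_reconstruct(x, D, groups, lam=0.1, n_iter=500):
    """Assumed objective: 0.5 * ||x - D w||^2 + lam * sum_g ||w_g||_2.

    x      : (d,) feature vector of the test image
    D      : (d, K) features of its K nearest training neighbours, one per column
    groups : list of index arrays partitioning the K neighbours
    """
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    w = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ w - x)
        w = group_soft_threshold(w - step * grad, groups, step * lam)
    return w


def refine_label_scores(w, Y):
    """Hypothetical aggregation: weight each neighbour's binary labels by its coefficient."""
    return np.clip(w, 0.0, None) @ Y


# Toy data (assumed): six neighbours in two groups, two candidate labels.
D = np.eye(6)
x = np.array([1.0, 0.5, 0.8, 0.03, -0.02, 0.04])  # mostly explained by group 0
groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]
Y = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])

w = group_sparse_reconstruct(x, D, groups, lam=0.1)
scores = refine_label_scores(w, Y)
print(w, scores)
```

In this toy case the second group contributes little to the reconstruction, so the group penalty sets its coefficients to zero and the label carried only by that group receives a zero score. This is the kind of noise suppression a group-sparse penalty is meant to provide.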
Cite this article
Ji, Q., Zhang, L., Shu, X. et al. Image annotation refinement via 2P-KNN based group sparse reconstruction. Multimed Tools Appl 78, 13213–13225 (2019). https://doi.org/10.1007/s11042-018-5925-5
DOI: https://doi.org/10.1007/s11042-018-5925-5