Abstract
Automatic image annotation has become an active research area because it effectively narrows the semantic gap between images and their semantic meanings. We present a model, referred to as weight-KNN, which first introduces CNN features to address the problem that traditional models work well only with carefully hand-designed feature representations. Additionally, to retain the simplicity and generality of the KNN-based model for annotation, the proposed model incorporates a multi-label linear discriminant approach to compute weights that improve the accuracy of the subsequent distance calculation. Moreover, we take advantage of the KNN-based model to acquire the test image’s k-nearest neighbors in each label category and predict the image’s labels according to the contributions of its neighbors. Finally, experiments on three typical image data sets, Corel 5K, ESP Game and IAPR TC-12, verify the effectiveness of the proposed model.
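The pipeline outlined in the abstract (feature weighting, per-label nearest neighbors, neighbor-based scoring) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the Fisher-style ratio used for the weights, the exponential neighbor contribution, and the names `fisher_weights` and `annotate` are all assumptions made for illustration. Feature vectors are assumed to be precomputed (for example by a CNN), and the toy data are invented.

```python
import numpy as np

def fisher_weights(X, Y, eps=1e-8):
    """Per-dimension weights from a Fisher-style ratio.

    Assumption: a simple stand-in for a multi-label linear
    discriminant weighting, not the paper's formula.
    X: (n, d) features, Y: (n, L) binary label matrix.
    """
    mean_all = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for j in range(Y.shape[1]):
        members = X[Y[:, j] == 1]          # images carrying label j
        if len(members) == 0:
            continue
        mean_j = members.mean(axis=0)
        between += len(members) * (mean_j - mean_all) ** 2
        within += ((members - mean_j) ** 2).sum(axis=0)
    w = between / (within + eps)
    return w / w.sum()                     # normalise to sum to 1

def annotate(x, X, Y, w, k=2, n_labels=1):
    """Score each label from its k nearest label-bearing neighbours."""
    # Weighted Euclidean distance from x to every training image.
    dist = np.sqrt((((X - x) ** 2) * w).sum(axis=1))
    scores = np.zeros(Y.shape[1])
    for j in range(Y.shape[1]):
        nearest = np.sort(dist[Y[:, j] == 1])[:k]
        # Assumed contribution: closer neighbours count more.
        scores[j] = np.exp(-nearest).sum()
    top = np.argsort(-scores)[:n_labels]   # highest-scoring labels
    return top, scores

# Toy data (invented): four images, two features, two labels.
X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 4.0]])
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
w = fisher_weights(X, Y)
top, scores = annotate(np.array([0.5, 0.8]), X, Y, w)
print(top, scores)
```

Searching for neighbors separately within each label category, rather than once over the whole training set, gives rare labels the same number of voting neighbors as frequent ones, which is the usual motivation for this kind of per-label scheme.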




Acknowledgements
This work is partially supported by the Natural Science Foundation of China (Grant No. 61602353), the Natural Science Foundation of Hubei Province (Grant No. 2017CFB505) and the Fundamental Research Funds for the Central Universities (WUT:2017YB028).
Cite this article
Ma, Y., Xie, Q., Liu, Y. et al. A weighted KNN-based automatic image annotation method. Neural Comput & Applic 32, 6559–6570 (2020). https://doi.org/10.1007/s00521-019-04114-y
DOI: https://doi.org/10.1007/s00521-019-04114-y