A weighted KNN-based automatic image annotation method

  • Multi-Source Data Understanding (MSDU)
Neural Computing and Applications

Abstract

Automatic image annotation has become an active research area because it helps narrow the semantic gap between images and their semantic meanings. We present a model, referred to as weight-KNN, which first introduces CNN features to address the problem that traditional models only work well with well-designed manual feature representations. Additionally, to exploit the simplicity and generality of KNN-based models for annotation, the proposed model incorporates a multi-label linear discriminant approach to compute feature weights, which improves accuracy in the subsequent distance calculation. Moreover, we take advantage of the KNN-based model to acquire the test image’s k-nearest neighbors in each label category and predict the image’s labels according to the contributions of its neighbors. Finally, experiments on three typical image data sets, Corel 5K, ESP Game and IAPR TC-12, verify the effectiveness of the proposed model.
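The per-label neighbor search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the weighted Euclidean distance, the `exp(-d)` neighbor contribution, and the names `weighted_distance` and `annotate` are all assumptions made for illustration. The feature vectors stand in for CNN features, and the `weights` argument stands in for the weights that the paper derives from a multi-label linear discriminant approach.

```python
import math
from collections import defaultdict


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance (assumed form; one non-negative weight per feature)."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def annotate(query, train_feats, train_labels, weights, k=2, n_tags=2):
    """Rank labels for `query` using its k nearest neighbors inside each label category."""
    # Group training feature vectors by every label they carry.
    by_label = defaultdict(list)
    for feats, labels in zip(train_feats, train_labels):
        for label in labels:
            by_label[label].append(feats)

    scores = {}
    for label, members in by_label.items():
        # Keep only the k closest training images that carry this label.
        nearest = sorted(weighted_distance(query, m, weights) for m in members)[:k]
        # Assumed contribution: closer neighbors add more to the label score.
        scores[label] = sum(math.exp(-d) for d in nearest)

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda label: (-scores[label], label))
    return ranked[:n_tags], scores


if __name__ == "__main__":
    # Toy 2-D "features" standing in for CNN descriptors (hypothetical data).
    feats = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
    labels = [["sky"], ["sky", "sea"], ["tiger"], ["tiger", "grass"]]
    tags, _ = annotate([0.05, 0.0], feats, labels, weights=[1.0, 1.0])
    print(tags)
```

Searching for neighbors separately within each label category gives rare labels a neighborhood of their own, so they are not crowded out by frequent labels in a single global neighbor list. Setting a feature's weight to zero removes it from the distance, which is the effect a learned weighting would have on an uninformative feature.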


Acknowledgements

This work is partially supported by the Natural Science Foundation of China (Grant No. 61602353), the Natural Science Foundation of Hubei Province (Grant No. 2017CFB505) and the Fundamental Research Funds for the Central Universities (WUT:2017YB028).

Author information

Corresponding author

Correspondence to Qing Xie.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ma, Y., Xie, Q., Liu, Y. et al. A weighted KNN-based automatic image annotation method. Neural Comput & Applic 32, 6559–6570 (2020). https://doi.org/10.1007/s00521-019-04114-y

  • DOI: https://doi.org/10.1007/s00521-019-04114-y
