A review on visual content-based and users’ tags-based image annotation: methods and techniques

Multimedia Tools and Applications

Abstract

In the current era of digital communication, the use of images is growing exponentially, since images are among the most effective ways of expressing, sharing and memorizing knowledge. They are used in a wide range of real-world domains, such as biology, medical diagnosis, space research and remote sensing. However, finding the images most relevant to a user’s needs is challenging, especially when the search runs over very large image collections. This has led to the emergence of many image retrieval studies over the past two decades. Research in this area has typically focused on Content-based Image Retrieval (CBIR). However, extensive research has shown that there is a ‘semantic gap’ between the visual information captured by imaging devices and the image semantics understood by humans. As an alternative, researchers have turned to Text-based Image Retrieval (TBIR), which helps bridge the ‘semantic gap’ between low-level image features and high-level image semantics. TBIR associates textual descriptions with images, and search queries are then matched against these descriptions. In this paper, we analyze two families of image annotation methods: visual content-based and users’ tags-based image annotation. In particular, we focus on visual content-based image annotation techniques, since they are currently one of the most active research fields.

Notes

  1. http://kspace.cdvp.dcu.ie/public/interactive-segmentation/index.html

  2. http://www.ubio.org/index.php?pagename=xml_services

  3. http://www.dbpedia.org/

References

  1. Abdel-Hamid O, Mohamed AR, Jiang H, Deng L, Penn G, Yu D (2014) Convolutional neural networks for speech recognition. IEEE/ACM transactions on audio, speech, and language processing. IEEE/ACM 22(10):1533–1545

  2. Abioui H, Idarrou A, Bouzit A, Mammass D: Review: Automatic Image Annotation for Semantic Image Retrieval. In: Proceedings of the 6th International Conference on Image and Signal Processing (ICISP), pp. 129-137. Springer, Cherbourg, France (2018)

  3. Abo-Zahhad M, Gharieb RR, Ahmed SM, Donkol AAEB (2014) Edge detection with a preprocessing approach. Journal of Signal and Information Processing (JSIP) 5(4):123–134

  4. Adebayo S, McLeod K, Tudose I, Osumi-Sutherland D, Burdett T, Baldock R, Parkinson H (2016) PhenoImageShare: an image annotation and query infrastructure. Journal of Biomedical Semantics 7(1):35–44

  5. Ajala Funmilola A, Oke OA, Adedeji TO, Alade OM, Adewusi EA (2012) Fuzzy k-means clustering algorithm for medical image segmentation. Journal of Information Engineering and Applications 2(6):21–32

  6. Akbulut Y, Sengur A, Guo Y, Smarandache F (2017) NS-k-NN: Neutrosophic set-based k-nearest Neighbors classifier. Symmetry 9(9):179

  7. Alham N K, Li M, Liu Y, Hammoud S, Ponraj M: A distributed SVM for scalable image annotation. In: Proceedings of the 8th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), pp. 2655-2658. IEEE, Shanghai, China (2011)

  8. Anees V M, Kumar G S, Sreeraj M: Automatic image annotation using SURF descriptors. In: Proceedings of the 2012 Annual IEEE India Conference (INDICON), pp. 920-924. IEEE, Kochi, India (2012)

  9. Aneja J, Deshpande A, Schwing A G: Convolutional image captioning. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5561–5570. IEEE, Honolulu, HI, USA (2017)

  10. Angelina S, Suresh L P, Veni S K: Image segmentation based on genetic algorithm for region growth and region merging. In: Proceedings of the 2012 IEEE International Conference on Computing, Electronics and Electrical Technologies (ICCEET), pp. 970-974. IEEE, Kumaracoil, India (2012)

  11. Anjna EA, Er RK (2017) Review of image segmentation technique. Int J Adv Res Comput Sci 8(4):36–39

  12. Appels R, Nystrom-Persson J, Keeble-Gagnere G (2014) Advances in genome studies in plants and animals. Functional & Integrative Genomics Springer 14(1):1–9

  13. Arellano G, Sucar L E, Morales E F: Automatic image annotation using multiple grid segmentation. In: Proceedings of the Mexican International Conference on Artificial Intelligence (MICAI), pp. 278-289. Springer, Pachuca (2010)

  14. Pujari A K: Data mining techniques (a reference book), pp. 114-147 (2013)

  15. Atlam HF, Attiya G, El-Fishawy N (2017) Integration of color and texture features in CBIR system. Int J Comput Appl 164(3):23–29

  16. Ayadi Y, Amous I, Gargouri F (2013) Toward an automatic annotation approach based on ontological enrichment for advanced research. International Journal of Engineering & Technology (IJET-IJENS) 13(2):80–89

  17. Badrinarayanan V, Kendall A, Cipolla R: Segnet: A deep convolutional encoder-decoder architecture for image segmentation. CoRR, abs/1511.00561 (2015)

  18. Bay H, Tuytelaars T, Van Gool L: Surf: Speeded up robust features. In: Proceedings of the 9th European Conference on Computer Vision (ECCV), pp. 404–417. Springer, Graz, Austria (2006)

  19. Belkhatir M (2009) An operational model based on knowledge representation for querying the image content with concepts and relations. Multimedia Tools and Applications Springer 43(1):1–23

  20. Bell S, Upchurch P, Snavely N, Bala K: Material recognition in the wild with the materials in context database. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3479-3487. IEEE, Boston, MA, USA (2015)

  21. Bergeaud F, Mallat S: Matching pursuit of images. In: Proceedings of the 1995 IEEE International Conference on Image Processing (ICIP), pp. 53-56. IEEE, Washington, DC, USA (1995)

  22. Bhatt H S, Bharadwaj S, Singh R, Vatsa M: On matching sketches with digital face images. In: Proceedings of the 4th International Conference on Biometrics Theory Applications and Systems (BTAS), pp. 1-7. IEEE, Washington, DC, USA (2010)

  23. Bhende P, Cheran AN: Content based image retrieval in Medical Imaging. International Journal of Computational Engineering and Research (IJCER) 3(8), 10-15 (2013)

  24. Blei D M, Jordan M I: Modeling annotated data. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 127-134. ACM, Toronto, Canada (2003)

  25. Bobade KB, Jagtap SV (2014) Automatic image annotation by classification using SIFT features. International Journal of Scientific Research Engineering & Technology 3(3):713–720

  26. Bouchakwa M, Ayadi Y, Amous I: Modeling the semantic content of the socio-tagged images based on the extended conceptual graphs formalism. In: Proceedings of the 14th International Conference on Advances in Mobile Computing and MultiMedia (MOMM), pp. 35-39. ACM, Singapore (2016)

  27. Bouchakwa M, Ayadi Y, Amous I: Semantic Pattern-based Automatic Annotation Process of Images Shared on Social Networks. In: Proceedings of the 30th IBIMA Conference (IBIMA), pp. 19. Madrid, Spain (2017)

  28. Bouchakwa M, Ayadi Y, Amous I: Multi-level diversification approach of semantic-based image retrieval results. Progress in Artificial Intelligence (PAI). 1-30 (2019)

  29. Boutell MR, Luo J, Shen X, Brown CM (2004) Learning multi-label scene classification. Pattern recognition Elsevier science 37(9):1757–1771

  30. Bovik AC, Clark M, Geisler WS (1990) Multichannel texture analysis using localized spatial filters. IEEE transactions on pattern analysis and machine intelligence (TPAMI). IEEE 12(1):55–73

  31. Boykov Y Y, Jolly M P: Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In: Proceedings of the 8th IEEE International Conference on Computer Vision (ICCV), pp. 105-112. IEEE, Vancouver, Canada (2001)

  32. Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Chapman&Hall (Wadsworth). Monterey, California, USA

  33. Cannon RL, Dave JV, Bezdek JC, Trivedi MM (1986) Segmentation of a thematic mapper image using the fuzzy c-means clustering algorithm. IEEE transactions on geoscience and remote sensing (TGRS). IEEE 24(3):400–408

  34. Carson C, Belongie S, Greenspan H, Malik J (2002) Blobworld: image segmentation using expectation-maximization and its application to image querying. IEEE Transactions on Pattern Analysis and Machine Intelligence IEEE 24(8):1026–1038

  35. Chakraborty A, Duncan JS (1999) Game-theoretic integration for image segmentation. IEEE transactions on pattern analysis and machine intelligence (PAMI). IEEE 21(1):12–30

  36. Chan TF, Vese LA (2001) Active contours without edges. IEEE transactions on image processing (TIP). IEEE 10(2):266–277

  37. Chang T, Kuo CC (1993) Texture analysis and classification with tree-structured wavelet transform. IEEE transactions on image processing (TIP). IEEE 2(4):429–441

  38. Chapelle O, Haffner P, Vapnik VN (1999) Support vector machines for histogram-based image classification. IEEE Transactions on Neural Networks IEEE 10(5):1055–1064

  39. Chathurani N W U D, Geva S, Chandran V, Cynthujah V: An effective content based image retrieval system based on global representation and multi-level searching. In: Proceedings of the 10th International Conference on Industrial and Information Systems (ICIIS), pp. 158-163. IEEE, Peradeniya, Sri Lanka (2015)

  40. Chaudhuri BB, Sarkar N (1995) Texture segmentation using fractal dimension. IEEE transactions on pattern analysis and machine intelligence (TPAMI). IEEE 17(1):72–77

  41. Chen Y, Wang JZ (2002) A region-based fuzzy feature matching approach to content based image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence IEEE 24(9):1252–1267

  42. Chen Y, Wang JZ (2004) Image categorization by learning and reasoning with regions. The Journal of Machine Learning Research (JMLR) ACM 5:913–939

  43. Chen X, Zitnick C L: Mind’s eye: A recurrent visual representation for image caption generation. In: Proceedings of the 25th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2422–2431. IEEE, Boston, MA, USA (2015)

  44. Chen X, Yuan X, Yan S, Tang J, Rui Y, Chua T S: Towards multi-semantic image annotation with graph regularized exclusive group lasso. In: Proceedings of the 19th ACM International Conference on Multimedia (MM), pp. 263-272. ACM, Scottsdale, AZ, USA (2011)

  45. Chen L C, Papandreou G, Kokkinos I, Murphy K, Yuille A L: Semantic image segmentation with deep convolutional nets and fully connected crfs. CoRR, abs/1412.7062 (2014)

  46. Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2018) Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. In: IEEE transactions on pattern analysis and machine intelligence (TPAMI). IEEE 40(4):834–848

  47. Cheng Q, Zhang Q, Fu P, Tu C, Li S (2018) A survey and analysis on automatic image annotation. Pattern Recogn 79:242–259

  48. Chengjian S, Zhu S, Shi Z: Image annotation via deep neural network. In: Proceedings of the 14th IAPR International Conference on Machine Vision Applications (MVA), pp. 518-521. IEEE, Tokyo, Japan (2015)

  49. Choi D, Kim P: Automatic image annotation using semantic text analysis. In: Proceedings of the 7th International Conference on Availability, Reliability, and Security (ARES), pp. 479-487. Springer, Prague, Czech Republic (2012)

  50. Clerc M, Kennedy J (2002) The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE transactions on evolutionary computation (TEVC). IEEE 6(1):58–73

  51. Cooper L, Walls RL, Elser J, Gandolfo MA, Stevenson DW, Smith B, Hiss M (2012) The plant ontology as a tool for comparative plant anatomy and genomic analyses. Plant Cell Physiol 54(2):1–23

  52. Cross GR, Jain AK (1983) Markov random field texture models. IEEE transactions on pattern analysis and machine intelligence (TPAMI). IEEE 5(1):25–39

  53. Cusano C, Ciocca G, Schettini R: Image annotation using SVM. In: International Society for Optics and Photonics (SPIE), pp. 330-339 (2003)

  54. Dai J, Li Y, He K, Sun J: R-fcn: Object detection via region-based fully convolutional networks. In: Proceedings of the 30th Advances in Neural Information Processing Systems (NIPS), pp. 379-387. Barcelona, Spain (2016)

  55. Dai B, Fidler S, Urtasun R, Lin D: Towards Diverse and Natural Image Descriptions via a Conditional GAN. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2989–2998. IEEE, Honolulu, HI, USA (2017)

  56. Dalal N, Triggs B: Histograms of Oriented Gradients for Human Detection. In: Proceedings of the 15th Computer Vision and Pattern Recognition (CVPR), pp. 886-893. IEEE, San Diego, CA, USA (2005)

  57. Daugman JG (1985) Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America A (JOSA A) 2(7):1160–1169

  58. Deng Y, Manjunath BS (2001) Unsupervised segmentation of color-texture regions in images and video. IEEE transactions on pattern analysis and machine intelligence (TPAMI). IEEE 23(8):800–810

  59. Deng J, Dong W, Socher R, Li L J, Li K, Fei-Fei L: Imagenet: A large-scale hierarchical image database. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248-255. IEEE, Miami, FL, USA (2009)

  60. Derin H, Elliott H, Cristi R, Geman D (1984) Bayes smoothing algorithms for segmentation of binary images modeled by Markov random fields. IEEE transactions on pattern analysis and machine intelligence (PAMI). IEEE 6(6):707–720

  61. Dharani T, Aroquiaraj I L: A survey on content based image retrieval. In: Proceedings of the 2013 IEEE International Conference on Pattern Recognition, Informatics and Mobile Engineering (PRIME), pp. 485-490. IEEE, Tamilnadu, India (2013)

  62. Dimitrovski I, Kocev D, Loskovska S, Dzeroski S: Detection of Visual Concepts and Annotation of Images Using Predictive Clustering Trees. In: CLEF (Notebook Papers/LABs/Workshops), pp. 1-10 (2010)

  63. Domingos P, Pazzani M (1997) On the optimality of the simple Bayesian classifier under zero-one loss. Machine learning Springer 29(2-3):103–130

  64. Erhan D, Szegedy C, Toshev A, Anguelov D: Scalable object detection using deep neural networks. In: Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2147-2154. IEEE, Columbus, OH, USA (2014)

  65. Fan J, Gao Y, Luo H, Xu G: Automatic image annotation by using concept-sensitive salient objects for image content representation. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 361-368. ACM, Sheffield, United Kingdom (2004)

  66. Fang H, Gupta S, Iandola F, Srivastava R K, Deng L, Dollár P, Lawrence Zitnick C: From captions to visual concepts and back. In: Proceedings of the 25th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1473-1482. IEEE, Boston, MA, USA (2015)

  67. Farhadi A, Hejrati M, Sadeghi M A, Young P, Rashtchian C, Hockenmaier J, Forsyth D: Every picture tells a story: Generating sentences from images. In: Proceedings of the 11th European Conference on Computer Vision (ECCV), pp. 15-29. Springer, Heraklion, Crete, Greece (2010)

  68. Feng H, Chua T S: A bootstrapping approach to annotating large image collection. In: Proceedings of the 5th ACM SIGMM International Workshop on Multimedia Information Retrieval, pp. 55-62. ACM, Berkeley, California (2003)

  69. Feng S L, Manmatha R, Lavrenko V: Multiple Bernoulli relevance models for image and video annotation. In: Proceedings of the 2004 IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1002-1009. IEEE, Washington, DC, USA (2004)

  70. Figueiredo J C, Neto F G M, de Paula I C: Contour-based feature extraction for image classification and retrieval. In: Proceedings of the 35th International Conference of the Chilean Computer Science Society (SCCC), pp. 1-7. IEEE, Valparaiso, Chile (2016)

  71. Franco-Lopez H, Ek AR, Bauer ME (2001) Estimation and mapping of forest stand density, volume, and cover type using the k-nearest neighbors method. Remote sensing of Environment Elsevier science 77(3):251–274

  72. Fu C Y, Liu W, Ranga A, Tyagi A, Berg A C: Dssd: Deconvolutional single shot detector. arXiv preprint arXiv:1701.06659 (2017)

  73. Gan C, Gan Z, He X, Gao J, Deng L: Stylenet: Generating attractive visual captions with styles. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3137–3146. IEEE, Honolulu, HI, USA (2017)

  74. Gao YY, Yi-Xin YIN, Uozumi T (2010) A hierarchical image annotation method based on SVM and semi-supervised EM. Acta Automatica Sinica Elsevier science 36(7):960–967

  75. Garcia-Garcia A, Orts-Escolano S, Oprea S, Villena-Martinez V, Garcia-Rodriguez J: A review on deep learning techniques applied to semantic segmentation. CoRR, abs/1704.06857 (2017)

  76. Geman S, Geman D: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). IEEE 6(6), 721-741 (1984)

  77. Ghahabi O, Hernando Pericás FJ (2018) Restricted Boltzmann machines for vector representation of speech in speaker recognition. Computer Speech and Language Elsevier science 47:16–29

  78. Ghoshal A, Ircing P, Khudanpur S: Hidden Markov models for automatic annotation and content-based retrieval of images and video. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 544-551. ACM, Salvador, Brazil (2005)

  79. Girshick R: Fast r-cnn. In: Proceedings of the 15th IEEE International Conference on Computer Vision (ICCV), pp. 1440-1448. IEEE, Santiago, Chile (2015)

  80. Girshick R, Donahue J, Darrell T, Malik J: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 580-587. IEEE, Columbus, OH, USA (2014)

  81. Goh K S, Chang E Y, Li B: Using one-class and two-class SVMs for multiclass image annotation. IEEE Transactions on Knowledge and Data Engineering (TKDE). IEEE 17(10), 1333-1346 (2005)

  82. Göksu Ö, Aptoula E: Content based image retrieval of remote sensing images based on deep features. In: Proceedings of the 26th Signal Processing and Communications Applications Conference (SIU), pp. 1-4. IEEE, Izmir, Turkey (2018)

  83. Gong T, Li S, Tan C L: A semantic similarity language model to improve automatic image annotation. In: Proceedings of the 22nd International Conference on Tools with Artificial Intelligence (ICTAI), pp. 197-203. IEEE, Arras, France (2010)

  84. Gong Y, Jia Y, Leung T, Toshev A, Ioffe S: Deep convolutional ranking for multilabel image annotation. CoRR, abs/1402.1128 (2013)

  85. Gong Y, Wang L, Hodosh M, Hockenmaier J, Lazebnik S: Improving image-sentence embeddings using large weakly annotated photo collections. In: Proceedings of the 13th European Conference on Computer Vision (ECCV), pp. 529-545. Springer, Zurich, Switzerland (2014)

  86. Grady L: Random walks for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). IEEE 28(11), 1768-1783 (2006)

  87. Grady L, Schwartz E L: Isoperimetric graph partitioning for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). IEEE 28(3), 469-475 (2006)

  88. Gu J, Wang G, Cai J, Chen T: An empirical study of language cnn for image captioning. In: Proceedings of the 16th IEEE International Conference on Computer Vision (ICCV), pp. 1231–1240. IEEE, Venice, Italy (2017)

  89. Guha S, Rastogi R, Shim K (1998) CURE: an efficient clustering algorithm for large databases. ACM Sigmod Record ACM 27(2):73–84

  90. Guillaumin M, Mensink T, Verbeek J, Schmid C: Tagprop: Discriminative metric learning in nearest neighbor models for image auto-annotation. In: Proceedings of the 12th International Conference on Computer Vision (ICCV), pp. 309-316. IEEE, Kyoto, Japan (2009)

  91. Guru D S, Sharath Y H, Manjunath S: Texture features and KNN in classification of flower images. International Journal of Computer Applications (IJCA), Special Issue on Recent Trends in Image Processing and Pattern Recognition. (1), 21-29 (2010)

  92. Halaschek-Wiener C, Golbeck J, Schain A, Grove M, Parsia B, Hendler J: Photostuff: An image annotation tool for the semantic web. In: Proceedings of the 4th International Semantic Web Conference (ISWC), pp. 6-10. Springer, Galway, Ireland (2005)

  93. Hambali H A, Abdullah S L S, Jamil N, Harun H: Fruit Classification using Neural Network Model. Journal of Telecommunication, Electronic and Computer Engineering (JTEC). 9(1-2), 43-46 (2017)

  94. Han Y, Qi X: A complementary svms-based image annotation system. In: Proceedings of the 2005 IEEE International Conference on Image Processing (ICIP), pp. 1185-1188. IEEE, Genoa, Italy (2005)

  95. Hanbury A: A survey of methods for image annotation. Journal of Visual Languages & Computing (JVLC). Elsevier science 19(5), 617-627 (2008)

  96. Haralick RM (1979) Statistical and structural approaches to texture. Proceedings of the IEEE IEEE 67(5):786–804

  97. Harzallah H, Jurie F, Schmid C: Combining efficient object localization and image classification. In: Proceedings of the 12th IEEE International Conference on Computer Vision (ICCV), pp. 237-244. IEEE, Kyoto, Japan (2009)

  98. Hastings S, Oster S, Langella S, Kurc TM, Pan T, Catalyurek UV, Saltz JH (2005) A grid-based image archival and analysis system. Journal of the American medical informatics association (JAMIA). Elsevier science 12(3):286–295

  99. He X J, Zhang Y, Lok T M, Lyu M R: A new feature of uniformity of image texture directions coinciding with the human eyes perception. In: Proceedings of the 2nd International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), pp. 727-730. Springer, Changsha, China (2005)

  100. He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916

  101. He K, Gkioxari G, Dollár P, Girshick R: Mask r-cnn. In: Proceedings of the 16th IEEE International Conference on Computer Vision (ICCV), pp. 2980-2988. IEEE, Venice, Italy (2017)

  102. Hermanto A, Adji T B, Setiawan N A: Recurrent neural network language model for English-Indonesian Machine Translation: Experimental study. In: Proceedings of the 2015 International Conference on Science in Information Technology (ICSITech), pp. 132-136. IEEE, Yogyakarta, Indonesia (2015)

  103. Hiremath P S, Pujari J: Content based image retrieval using color, texture and shape features. In: Proceedings of the 15th International Conference on Advance Computing and Communications (ADCOM), pp. 780-784. IEEE, Guwahati, Assam (2007)

  104. Hodosh M, Young P, Hockenmaier J (2013) Framing image description as a ranking task: data, models and evaluation metrics. J Artif Intell Res 47(1):853–899

  105. Hollink L, Schreiber A T, Wielemaker J, Wielinga B J: Semantic annotation of image collections. p. 8 (2003)

  106. Hollink L, Nguyen G, Schreiber G, Wielemaker J, Wielinga B, Worring M: Adding spatial semantics to image annotations. In: Proceedings of the 4th International Workshop on Knowledge Markup and Semantic Annotation at ISWC, pp. 31-40. Hiroshima, Japan (2004)

  107. Horvat M, Grbin A, Gledec G (2013) Labeling and retrieval of emotionally-annotated images using WordNet. International Journal of Knowledge-based and Intelligent Engineering Systems ACM 17(2):157–166

  108. Hossain MD, Sohel F, Shiratuddin MF, Laga H (2019) A comprehensive survey of deep learning for image captioning. ACM Computing Surveys (CSUR) 51(6):118–154

  109. Huang Y F, Lu H Y: Automatic image annotation using multi-object identification. In: Proceedings of the 4th Pacific-Rim Symposium on Image and Video Technology (PSIVT), pp. 386-392. IEEE, Singapore (2010)

  110. Huang J, Kumar S R, Mitra M, Zhu W J, Zabih R: Image indexing using color correlograms. In: Proceedings of the 1997 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 762-768. IEEE, San Juan, Puerto Rico, USA (1997)

  111. Huang J, Liu H, Shen J, Yan S: Towards efficient sparse coding for scalable image annotation. In: Proceedings of the 21st ACM International Conference on Multimedia (MM), pp. 947-956. ACM, Barcelona, Spain (2013)

  112. Im D H, Park G D: STAG: semantic image annotation using relationships between tags. In: Proceedings of the 2013 International Conference on Information Science and Applications (ICISA), pp. 1-2. IEEE, Suwon, South Korea (2013)

  113. Im DH, Park GD (2015) Linked tag: image annotation using semantic relationships between image tags. Multimedia Tools and Applications Springer 74(7):2273–2287

  114. Islam M M, Zhang D, Lu G: A geometric method to compute directionality features for texture images. In: Proceedings of the 2008 IEEE International Conference on Multimedia and Expo (ICME), pp. 1521–1524. IEEE, Hannover, Germany (2008)

  115. Islam M M, Zhang D, Lu G: Automatic categorization of image regions using dominant color based vector quantization. In: Proceedings of the 2008 IEEE Digital Image Computing: Techniques and Applications (DICTA), pp. 191–198. IEEE, Canberra, Australia (2008)

  116. Jaderberg M, Simonyan K, Zisserman A: Spatial transformer networks. In: Proceedings of the Advances in Neural Information Processing Systems (NIPS), pp. 2017-2025. Montréal, Canada (2015)

  117. Jain AK, Vailaya A (1996) Image retrieval using color and shape. Pattern recognition Elsevier science 29(8):1233–1244

  118. Jau-Ling S, Ling-Hwei C: Color image retrieval based on primitives of color moments. In: Proceedings of the 5th International Conference on Advances in Visual Information Systems (VISUAL), pp. 88-94. Springer, Hsin Chu, Taiwan (2002)

  119. Jeon J, Lavrenko V, Manmatha R: Automatic image annotation and retrieval using cross-media relevance models. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 119-126. ACM, Toronto, Canada (2003)

  120. Jeong J W, Hong H K, Lee D H: i-TagRanker: an efficient tag ranking system for image sharing and retrieval using the semantic relationships between tags. Multimedia Tools and Applications. Springer 62(2), 451-478 (2013)

  121. Ji Q, Zhang L, Li Z: KNN-based Image Annotation by Collectively Mining Visual and Semantic Similarities. Transactions on Internet & Information Systems (KSII). 11(9), 4476-4490 (2017)

  122. Jia X, Gavves E, Fernando B, Tuytelaars T: Guiding the long-short term memory model for image caption generation. In: Proceedings of the 15th IEEE International Conference on Computer Vision (ICCV), pp. 2407–2415. IEEE, Santiago, Chile (2015)

  123. Jiang Z, He J, Guo P: Feature data optimization with LVQ technique in semantic image annotation. In: Proceedings of the 10th International Conference on Intelligent Systems Design and Applications (ISDA), pp. 906-911. IEEE, Cairo, Egypt (2010)

  124. Han J, Kamber M: Data mining: concepts and techniques (a reference book), pp. 383-422

  125. Jin Y, Khan L, Wang L, Awad M: Image annotations by combining multiple evidence & WordNet. In: Proceedings of the 13th Annual ACM International Conference on Multimedia (MM), pp. 706-715. ACM, Singapore (2005)

  126. Jin J, Fu K, Cui R, Sha F, Zhang C: Aligning where to see and what to tell: image caption with region-based attention and scene factorization. arXiv preprint arXiv:1506.06272 (2015)

  127. Jing F, Li M, Zhang L, Zhang H J, Zhang B: Learning in region-based image retrieval. In: Proceedings of the 2nd International Conference on Image and Video Retrieval (CIVR), pp. 206-215. Springer, Urbana-Champaign, IL, USA (2003)

  128. Joachims T: Optimizing search engines using clickthrough data. In: Proceedings of the 8th International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 133-142. ACM, Edmonton, Alberta, Canada (2002)

  129. John G H, Langley P: Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 338-345. ACM, Montréal, Canada (1995)

  130. Johnson J, Karpathy A, Fei-Fei L: Densecap: Fully convolutional localization networks for dense captioning. In: Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4565-4574. IEEE, Las Vegas, NV, USA (2016)

  131. Kalafi EY, Tan WB, Town C, Dhillon SK (2016) Automated identification of monogeneans using digital image processing and K-nearest neighbor approaches. BMC bioinformatics 17(19):511

  132. Kamdi S, Krishna R K: Image segmentation and region growing algorithm. International Journal of Computer Technology and Electronics Engineering (IJCTEE). 2(1), 103-107 (2012)

  133. Karoui I, Fablet R, Boucher JM, Augustin JM (2010) Variational region-based segmentation using multiple texture statistics. IEEE Transactions on Image Processing (TIP) 19(12):3146–3156

  134. Karpathy A, Fei-Fei L: Deep visual-semantic alignments for generating image descriptions. In: Proceedings of the 25th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3128–3137. IEEE, Boston, MA, USA (2015)

  135. Karpathy A, Joulin A, Fei-Fei L: Deep fragment embeddings for bidirectional image sentence mapping. In: Proceedings of the 29th Advances in Neural Information Processing Systems (NIPS), pp. 1889–1897. Montreal, Quebec, Canada (2014)

  136. Karypis G, Han EH, Kumar V (1999) Chameleon: hierarchical clustering using dynamic modeling. Computer IEEE 32(8):68–75

  137. Kass M, Witkin A, Terzopoulos D (1988) Snakes: active contour models. International Journal of Computer Vision Springer 1(4):321–331

  138. Kaya Y, Kayci L (2014) Application of artificial neural network for automatic detection of butterfly species using color and texture features. The Visual Computer Elsevier science 30(1):71–79

  139. Kendall A, Badrinarayanan V, Cipolla R: Bayesian segnet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding. CoRR, abs/1511.02680 (2015)

  140. Kennedy J, Eberhart R.: Particle swarm optimization. In: Proceedings of the 5th IEEE International Conference on Neural Networks (ICANN), pp. 1942-1948. IEEE, Paris, France (1995)

  141. Khan A, Deep S, Li J P, Kumar K, Shaikh R A, Hasan F: Vision prehension with CBIR for cloud robo. In: Proceedings of the 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 293-296. IEEE, China, Sichuan Province (2014)

  142. Kiros, R., Szepesvári, C.: Deep representations and codes for image auto-annotation. In: Proceedings of 26th Annual Conference on Neural Information Processing Systems (NIPS), pp. 908-916. Lake Tahoe, Nevada, USA (2012)

  143. Kiros R, Salakhutdinov R, Zemel R: Multimodal neural language models. In: Proceedings of the 31st International Conference on Machine Learning (ICML), pp. 595–603. Beijing, China (2014)

  144. Kiros J R, Salakhutdinov R, Zemel R: Unifying visual-semantic embeddings with multimodal neural language models. In: Proceedings of the 28th Workshop on Neural Information Processing Systems (NIPS). Montreal, Quebec, Canada (2014)

  145. Krishnan KB, Ranga SP, Guptha N (2017) A survey on different edge detection techniques for image segmentation. Indian Journal of Science and Technology 10(4):1–8

  146. Krizhevsky A, Sutskever I, Hinton G E: Imagenet classification with deep convolutional neural networks. In: Proceedings of the Advances in Neural Information Processing Systems (NIPS), pp. 1097-1105 (2012)

  147. Ksibi A, Ammar A B, Amar C B: Effective concept detection using second order co-occurence flickr context similarity measure socfcs. In: Proceedings of the 10th International Workshop on Content-Based Multimedia Indexing (CBMI), pp. 1-6. IEEE, Annecy, France (2012)

  148. Kulkarni G, Premraj V, Dhar S, Li S, Choi Y, Berg A C, Berg T L.: Baby talk: Understanding and generating image descriptions. In: Proceedings of the 24th Computer Vision and Pattern Recognition (CVPR), pp. 1601-1608. IEEE, Colorado Springs, CO, USA (2011)

  149. Kumar K K: CBIR: Content based image retrieval. In: Proceedings of the 2010 National Conference on Recent Trends in information/ Network Security (NCRTNS), pp. 36-43 (2010)

  150. Kuroda K, Hagiwara M (2002) An image retrieval system by impression words and specific object names–IRIS. Neurocomputing Elsevier science 43(1-4):259–276

  151. Kurtz C, Rubin D L: Utilisation de relations ontologiques pour la comparaison d’images décrites par des annotations sémantiques. In: Proceedings of the 14th Conference on Knowledge Extraction and Management (EGC), pp. 609-614. Rennes (2014)

  152. Kwitt, R., Vasconcelos, N., Rasiwasia, N., Uhl, A., Davis, B., Häfner, M., Wrba, F.: Endoscopic image analysis in semantic space. Medical Image Analysis (MIA). 16(7), 1415-1422 (2012)

  153. Laine A, Fan J: Texture classification by wavelet packet signatures. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). IEEE 15(11), 1186-1191 (1993)

  154. Lavrenko V, Manmatha R, Jeon J: A model for learning the semantics of pictures. In: Proceedings of the 16th International Conference on Neural Information Processing Systems (NIPS), pp. 553-560. ACM, Whistler, British Columbia, Canada (2003)

  155. Law H, Deng J: Cornernet: Detecting objects as paired keypoints. In: Proceedings of the 15th European Conference on Computer Vision (ECCV), pp. 734-750. Springer, Munich, Germany (2018)

  156. Lazebnik S, Schmid C, Ponce J: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2169-2178. IEEE, New York, NY, USA (2006)

  157. Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. WordNet: An electronic lexical database ACM 49(2):265–283

  158. Lei Y, Wong W, Liu W, Bennamoun M: An HMM-SVM-based automatic image annotation approach. In: Proceedings of the 10th Asian Conference on Computer Vision (ACCV), pp. 115-126. Springer, Queenstown, New Zealand (2010)

  159. Levine M: Vision in Man and Machine, McGraw-Hill (1985)

  160. Lew M S, Sebe N, Djeraba C, Jain R: Content-based multimedia information retrieval: state of the art and challenges. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM). ACM 2(1), 1–19 (2006)

  161. Li B, Goh K: Confidence-based dynamic ensemble for image annotation and semantics discovery. In: Proceedings of the 11th ACM International Conference on Multimedia (MM), pp. 195-206. ACM, Berkeley, CA, USA (2003)

  162. Li J, Wang J Z, Wiederhold G: IRM: Integrated region matching for image retrieval. In: Proceedings of the 8th ACM international conference on Multimedia, pp. 147-156. ACM, Marina del Rey, California, USA (2000)

  163. Li S, Kulkarni G, Berg T L, Berg A C, Choi Y: Composing simple image descriptions using web-scale n-grams. In: Proceedings of the 15th Conference on Computational Natural Language Learning (CoNLL), pp. 220-228. ACM, Portland, Oregon (2011)

  164. Li T, Cheng B, Ni B, Liu G, Yan S: Multitask low-rank affinity graph for image segmentation and image annotation. ACM Transactions on Intelligent Systems and Technology (TIST). 7(4), 1-18 (2016)

  165. Li Y D, Hao Z B, Lei H: Survey of convolutional neural network. International Journal of Computer Applications (IJCA). 36(9), 2508-2515 (2016)

  166. Lin D: An information-theoretic definition of similarity. In: Proceedings of the 15th International Conference on Machine Learning (ICML), pp. 296-304. ACM, San Francisco, CA, USA (1998)

  167. Lingutla NT, Preece J, Todorovic S, Cooper L, Moore L, Jaiswal P (2014) AISO: annotation of image segments with ontologies. Journal of Biomedical Semantics Springer 5(1):50–54

  168. Liu Y, Zhang D, Lu G, Ma W Y: Region-based image retrieval with perceptual colors. In: Proceedings of the 5th Pacific-Rim Conference on Multimedia (PCM), pp. 931-938. Springer, Tokyo, Japan (2004)

  169. Liu Y, Zhang D, Lu G, Ma WY (2007) A survey of content-based image retrieval with high-level semantics. Pattern Recognition Elsevier science 40(1):262–282

  170. Liu D, Hua X S, Wang M, Zhang H J: Image retagging. In: Proceedings of the 18th ACM International Conference on Multimedia (MM), pp. 491-500. ACM, Firenze, Italy (2010)

  171. Liu W, Ji R, Li S: Towards 3d object detection with bimodal deep boltzmann machines over rgbd imagery. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3013-3021. IEEE, Boston, MA, USA (2015)

  172. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y, Berg A C: Ssd: Single shot multibox detector. In: Proceedings of the 14th European Conference on Computer Vision (ECCV), pp. 21-37. Springer, Cham (2016)

  173. Long F, Zhang H, Feng D D: Fundamentals of content-based image retrieval. In: Proceedings of 2003 International Conference on Multimedia Information Retrieval and Management (MIRM), pp. 1-26. Springer, Berlin, Heidelberg (2003)

  174. Long J, Shelhamer E, Darrell T: Fully convolutional networks for semantic segmentation. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431-3440. IEEE, Boston, MA, USA (2015)

  175. Long J, Shelhamer E, Darrell T: Fully convolutional networks for semantic segmentation. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431-3440. IEEE, Las Vegas, NV, USA (2015)

  176. Low W C, Chua T S: Colour-based relevance feedback for image retrieval. In: Proceedings of the 1998 IEEE International Workshop on Multi-Media Database Management Systems, pp. 116-123. IEEE, Dayton, OH, USA (1998)

  177. Lowe D G: Object recognition from local scale-invariant features. In: Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV), pp. 1150–1157. IEEE, Kerkyra, Corfu, Greece (1999)

  178. Lu CS, Chung PC, Chen CF (1997) Unsupervised texture segmentation via wavelet transform. Pattern Recognition Elsevier science 30(5):729–742

  179. Lu H, Zheng Y, Xue X, Zhang Y: Content and context-based multi-label image annotation. In: Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR), pp. 61-68. IEEE, Miami, FL, USA (2009)

  180. Lu J, Xiong C, Parikh D, Socher R: Knowing when to look: Adaptive attention via A visual sentinel for image captioning. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3242–3250. IEEE, Honolulu, HI, USA (2017)

  181. Magesh N, Thangaraj P: Semantic image retrieval based on ontology and SPARQL query. In: Proceedings of the 2nd International Conference on Advanced Computer Technology (ICACT), pp. 12-16. IEEE, Gangwon-Do, Korea (2011)

  182. Makadia A, Pavlovic V, Kumar S: A new baseline for image annotation. In: Proceedings of the 10th European Conference on Computer Vision (ECCV), pp. 316-329. Springer, Marseille, France (2008)

  183. Mallat S G: Multifrequency channel decompositions of images and wavelet models. IEEE Transactions on Acoustics, Speech, and Signal Processing. IEEE 37(12), 2091-2110 (1989)

  184. Mallat S, Zhang Z: Matching pursuit with time-frequency dictionaries. IEEE Transactions on Signal Processing (TSP). IEEE 41(12), 3397-3415 (1993)

  185. Manjunath B S, Ohm J R, Vasudevan V V, Yamada A: Color and texture descriptors. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT). IEEE 11(6), 703-715 (2001)

  186. Manjunath BS, Salembier P, Sikora T (2002) Introduction to MPEG-7: multimedia content description interface. John Wiley & Sons

  187. Manning CD, Schütze H (1999) Foundations of statistical natural language processing. MIT press, Cambridge, MA, USA

  188. Mao J, Xu W, Yang Y, Wang J, Yuille A L: Explain images with multimodal recurrent neural networks. arXiv preprint arXiv:1410.1090 (2014)

  189. Mao J, Xu W, Yang Y, Wang J, Huang Z, Yuille A: Deep captioning with multimodal recurrent neural networks (m-rnn). In: Proceedings of the 3rd International Conference on Learning Representations (ICLR). San Diego, CA, USA (2015)

  190. Maree R, Geurts P, Piater J, Wehenkel L: Random subwindows for robust image classification. In: Proceedings of the 2005 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 34-40. IEEE, San Diego, CA, USA (2005)

  191. Materka A, Strzelecki M: Texture analysis methods–a review. Technical university of lodz, institute of electronics, COST B11 report, Brussels, 9-11 (1998)

  192. Mathews A P, Xie L, He X: SentiCap: Generating Image Descriptions with Sentiments. In: Proceedings of the 30th Association for the Advancement of Artificial Intelligence (AAAI), pp. 3574–3580. Phoenix, Arizona, USA (2016)

  193. Mayhew M B, Chen B, Ni K S: Assessing semantic information in convolutional neural network representations of images via image annotation. In: Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), pp. 2266-2270. IEEE, Phoenix, AZ, USA (2016)

  194. Mezaris V, Kompatsiaris I, Strintzis M G: An ontology approach to object-based image retrieval. In: Proceedings of the 2003 IEEE International Conference on Image Processing (ICIP), pp. 511-514. IEEE, Barcelona, Spain (2003)

  195. Mezaris V, Kompatsiaris I, Strintzis MG (2004) Region-based image retrieval using an object ontology and relevance feedback. EURASIP Journal on Advances in Signal Processing Springer 2004(6):886–901

  196. Mitran M, Mihalcea R, Cabanac G, Boughanem M: Landmark image annotation using textual and geolocation metadata. In: Proceedings of the 10th Conference on Open Research Areas in Information Retrieval (OAIR), pp. 65-68. ACM, Lisbon, Portugal (2013)

  197. Miyamori H, Iisaku S I: Video annotation for content-based retrieval using human behavior analysis and domain knowledge. In: Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition (FG), pp. 320-325. IEEE, Grenoble, France (2000)

  198. Mori Y, Takahashi H, Oka R: Image-to-word transformation based on dividing and vector quantizing images with words. In: Proceedings of the 1st International Workshop on Multimedia Intelligent Storage and Retrieval Management (MISRM), pp. 1-9. ACM, Orlando, Florida (1999)

  199. Mousselly-Sergieh H, Egyed-Zsigmond E, Gianini G, Döller M, Kosch H, Pinon J M: Tag similarity in folksonomies. In: Proceedings of the XXXI INFORSID congress, pp. 319-334 (2013)

  200. Muda Z, Lewis P H, Payne T R, Weal M J: Enhanced image annotations based on spatial information extraction and ontologies. In: Proceedings of the 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), pp. 173-178. IEEE, Kuala Lumpur, Malaysia (2009)

  201. Murthy V N, Can E F, Manmatha R: A hybrid model for automatic image annotation. In: Proceedings of the 4th International Conference on Multimedia Retrieval (ICMR), pp. 369. ACM, Glasgow, UK (2014)

  202. Murthy V N, Maji S, Manmatha R: Automatic image annotation using deep learning representations. In: Proceedings of the 5th ACM International Conference on Multimedia Retrieval (ICMR), pp. 603-606. ACM, Shanghai, China (2015)

  203. Naik D., Shah P.: A review on image segmentation clustering algorithms. International Journal of Computer Science and Information Technologies (JCSIT). 5(3), 3289-3289 (2014)

  204. Najafabadi MM, Villanustre F, Khoshgoftaar TM, Seliya N, Wald R, Muharemagic E (2015) Deep learning applications and challenges in big data analytics. Journal of Big Data Springer 2(1):21

  205. Nanda P. K, Ponacha P G, Desai U B: A Supervised Image Segmentation scheme using MRF Model and Homotopy Continuation Method. In: Proceedings of the Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), pp. 15-20. Delhi, India (1998)

  206. Natsev A, Rastogi R, Shim K: WALRUS: A similarity retrieval algorithm for image databases. In: Proceedings of the 1999 International Conference on Management of Data (ACM SIGMOD Record), pp. 395-406. ACM, Philadelphia, Pennsylvania, USA (1999)

  207. Nguyen T V, Zhao Q, Yan S: Attentive systems: A survey. International Journal of Computer Vision (IJCV). 126(1), 86-110 (2018)

  208. Niles I, Pease A: Towards a standard upper ontology. In: Proceedings of the 2001 International Conference on Formal Ontology in Information Systems, pp. 2-9. ACM, Ogunquit, Maine, USA (2001)

  209. Oberoi A, Singh M (2012) Content-based image retrieval system for medical data bases (CBIR-MD)-lucratively tested on endoscopy, dental and skull images. International Journal of Computer Science Issues (IJCSI) 9(3):300–306

  210. Ojha U, Adhikari U, Singh D K: Image annotation using deep learning: A review. In: 2017 Proceedings of the International Conference on Intelligent Computing and Control (I2C2), pp. 1-5. IEEE, Coimbatore, India (2017)

  211. Oliva D, Cuevas E: An Introduction to Machine Learning. Advances and Applications of Optimized Algorithms in Image Processing, pp.1–11. Springer Vol. 117 (2017)

  212. Oliva A, Torralba A (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope. Int J Comput Vis 42(3):145–175

  213. Ordonez V, Kulkarni G, Berg T L: Im2text: Describing images using 1 million captioned photographs. In: Proceedings of the 25th Advances in Neural Information Processing Systems (NIPS), pp. 1143-1151. Granada, Spain (2011)

  214. Panda S: Unsupervised Color Image Segmentation using MRF Models to Preserve Weak Edges. International Journal of Computer & Mathematical Sciences (IJCMS). 5(6), 73-81 (2016)

  215. Pandey S, Khanna P: A hierarchical clustering approach for image datasets. In: Proceedings of the 9th International Conference on Industrial and Information Systems (ICIIS), pp. 1-6. IEEE, Gwalior, India (2014)

  216. Park SB, Lee JW, Kim SK (2004) Content-based image classification using a neural network. Pattern Recognition Letters Elsevier science 25(3):287–300

  217. Pass G, Zabih R: Histogram refinement for content-based image retrieval. In: Proceedings of the 3rd IEEE Workshop on Applications of Computer Vision (WACV), pp. 96-102. IEEE, Sarasota, FL, USA (1996)

  218. Pass G, Zabih R (1999) Comparing images using joint histograms. Multimedia systems Springer 7(3):234–240

  219. Patil MP, Kolhe SR (2012) Automatic image categorization and annotation using K-NN for COREL dataset. Advances in Computational Research 4(1):108–112

  220. Patil M P, Kolhe S R: Automatic Image Annotation Using Decision Trees and Rough Sets. International Journal of Computer Science & Applications (IJCSA). 11(2), 38-49 (2014)

  221. Pawlak Z (1982) Rough sets. International Journal of Computer & Information Sciences Springer 11(5):341–356

  222. Peleg S, Naor J, Hartley R, Avnir D: Multiple resolution texture analysis and classification. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). IEEE 6(4), 518-523 (1984)

  223. Perronnin F, Sánchez J, Mensink T: Improving the fisher kernel for large-scale image classification. In: Proceedings of the 11th European Conference on Computer Vision (ECCV), pp. 143-156. Crete, Greece (2010)

  224. Petridis K, Anastasopoulos D, Saathoff C, Timmermann N, Kompatsiaris Y, Staab S: M-OntoMat-Annotizer: Image annotation linking ontologies and multimedia low-level features. In: Proceedings of the 10th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES), pp. 633-640. Springer, Bournemouth, UK (2006)

  225. Ping Tian D: A review on image feature extraction and representation techniques. International Journal of Multimedia and Ubiquitous Engineering (IJMUE). 8(4), 385-396 (2013)

  226. Pinheiro, P. O., Collobert, R., Dollár, P.: Learning to segment object candidates. In: Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS), pp. 1990-1998. IEEE, Montreal, Canada (2015)

  227. Preece J, Elser J, Jaiswal P, Kvilekval K, Fedorov D, Manjunath BS, Kitchen R, Xu X, Trigkakis D, Todorovic S, Carbon S (2016) Plant image segmentation and annotation with ontologies in BisQue. In: Proceedings of the 7th Joint International Conference on Biological Ontology and BioCreative (ICBO/BioCreative). Corvallis, Oregon

  228. Qi X, Han Y (2007) Incorporating multiple SVMs for automatic image annotation. Pattern Recognition Elsevier science 40(2):728–741

  229. Qian Y, Zhou W, Yan J, Li W, Han L (2015) Comparing machine learning classifiers for object-based land cover classification using very high resolution imagery. Remote sensing of Environment Elsevier science 7(1):153–168

  230. Qiu B: A refined SVM applied in medical image annotation. In: Proceedings of the Workshop of the Cross-Language Evaluation Forum for European Languages, pp. 690-693. Springer, Alicante, Spain (2006)

  231. Quattrone G, Ferrara E, De Meo P, Capra L: Measuring similarity in large-scale folksonomies. In: Proceedings of the 23rd International Conference on Software Engineering and Knowledge Engineering (SEKE), pp. 385-391. Miami Beach, USA (2012)

  232. Quinlan JR (1986) Induction of decision trees. Machine learning Springer 1(1):81–106

  233. Quinlan J R: C4.5: Programs for Machine Learning, Morgan Kaufmann, Los Altos, California, USA (1993)

  234. Redmon J, Farhadi A: YOLO9000: better, faster, stronger. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7263-7271. IEEE, Honolulu, HI, USA (2017)

  235. Redmon J, Divvala S, Girshick R, Farhadi A: You only look once: Unified, real-time object detection. In: Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779-788. IEEE, Las Vegas, NV, USA (2016)

  236. Ren S, He K, Girshick R, Sun J: Faster r-cnn: Towards real-time object detection with region proposal networks. In: Proceedings of the 28th Advances in Neural Information Processing Systems (NIPS), pp. 91-99. Montreal, Quebec, Canada (2015)

  237. Ren Z, Wang X, Zhang N, Lv X, Li L J: Deep reinforcement learning-based image captioning with embedding reward. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 290-298. IEEE, Honolulu, HI, USA (2017)

  238. Rennie S J, Marcheret E, Mroueh Y, Ross J, Goel V: Self-critical sequence training for image captioning. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1179–1195. IEEE, Honolulu, HI, USA (2017)

  239. Rosenfeld A, Weszka J S: Picture recognition. Digital Pattern Recognition. Springer, pp. 135-166 (1980)

  240. Rubner, Y., Tomasi, C., Guibas, L. J.: The earth mover's distance as a metric for image retrieval. International Journal of Computer Vision (IJCV). Springer 40(2), 99-121 (2000)

  241. Rui Y, Huang T S, Ortega M, Mehrotra S: Relevance feedback: a power tool for interactive content-based image retrieval. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT). IEEE 8(5), 644-655 (1998)

  242. Rui Y, Huang T S, Chang S F: Image retrieval: Current techniques, promising directions, and open issues. Journal of Visual Communication and Image Representation (JVCI). Elsevier science 10(1), 39-62 (1999)

  243. Rui S, Jin W, Chua T S: A novel approach to auto image annotation based on pairwise constrained clustering and semi-naïve Bayesian model. In: Proceedings of the 11th International Conference on Multimedia Modelling (MMM), pp. 322–327. IEEE, Melbourne, Australia (2005)

  244. Russell BC, Torralba A, Murphy KP, Freeman WT (2008) LabelMe: a database and web-based tool for image annotation. International Journal of Computer Vision Springer 77(1-3):157–173

  245. Sak H, Senior A, Beaufays F: Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition. CoRR, abs/1402.1128 (2014)

  246. Sami M, El-Bendary N, Hassanien A E: Automatic image annotation via incorporating Naive Bayes with particle swarm optimization. In: Proceedings of the World Congress on Information and Communication Technologies (WICT), pp. 790-794. IEEE, Trivandrum, India (2012)

  247. Senthilkumar R, Prakash T S: Image Retrieval System by Automatic Annotation. International Journal on Engineering Technology and Sciences (IJETS). 1(8), 286-290 (2014)

  248. Senthilkumaran N, Vaithegi S: Image segmentation by using thresholding techniques for medical images. International Journal of Computer Science and Engineering (IJCSE). 6(1), 1-13 (2016)

  249. Serrano N, Savakis A, Luo A: A computationally efficient approach to indoor/outdoor scene classification. In: Proceedings of the 16th International Conference on Pattern Recognition (ICPR), pp. 146-149. IEEE, Quebec City, Quebec, Canada (2002)

  250. Sethi I K, Coman I L, Stan D: Mining association rules between low-level image features and high-level concepts. In: International Society for Optics and Photonics (SPIE). Vol. 4384, pp. 279-291 (2001)

  251. Shen J, Wang M, Yan S, Hua X S: Multimedia tagging: past, present and future. In: Proceedings of the 19th ACM International Conference on Multimedia (MM), pp. 639-640. ACM, Scottsdale, AZ, USA (2011)

  252. Shen Z, Liu Z, Li J, Jiang Y G, Chen Y, Xue X: Dsod: Learning deeply supervised object detectors from scratch. In: Proceedings of the 16th IEEE International Conference on Computer Vision (ICCV), pp. 1919-1927. IEEE, Venice, Italy (2017)

  253. Shetty R, Rohrbach M, Anne Hendricks L, Fritz M, Schiele B.: Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training. In: Proceedings of the 16th IEEE International Conference on Computer Vision (ICCV), pp. 4155–4164. IEEE, Venice, Italy (2017)

  254. Shi J, Malik J: Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). IEEE 22(8), 888-905 (2000)

  255. Shi R, Feng H, Chua T S, Lee C H: An adaptive image content representation and segmentation approach to automatic image annotation. In: Proceedings of the 3rd International Conference on Image and Video Retrieval (CIVR), pp. 545-554. Springer, Dublin, Ireland (2004)

  256. Shimpi S, Patil V: Hidden Markov model as classifier: a survey. In: Proceedings of the 2013 International Conference on Computer Science and Engineering (COMPSE), pp. 13530-13533 (2013)

  257. Shitole A, Godase U: Survey on Content Based Image Retrieval. International Journal of Computer-Aided Technologies (IJCAx). 1(1), 21-29 (2014)

  258. Shukla T, Mishra N, Sharma S (2013) Automatic image annotation using SURF features. Int J Comput Appl 68(4):17–24

  259. Shyu C R: Relevance feedback decision trees in content-based image retrieval. In: Proceedings of the 2000 IEEE Workshop on Content-based Access of Image and Video Libraries, pp. 68-72. IEEE, Hilton Head Island, SC, USA (2000)

  260. Simonyan K, Zisserman A: Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556 (2014)

  261. Smeulders AW, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence IEEE 22(12):1349–1380

  262. Socher R, Perelygin A, Wu J, Chuang J, Manning C D, Ng A, Potts C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1631-1642. Washington, USA (2013)

  263. Sreedhar Kumar S, Shilpa S.: A new approach for image feature vector classification using unsupervised clustering method. International Journal of Advance Research in Science And Engineering (IJARSE). 3(6), 108-117 (2014)

  264. Stanchev PL, Green D Jr, Dimitrov B (2003) Level color similarity retrieval. International Journal of Information Theories & Application 10(3):363–369

  265. Steggink J, Snoek CG (2011) Adding semantics to image-region annotations with the name-it-game. Multimedia Systems Springer 17(5):367–378

  266. Stührenberg M (2013) What, when, where? Spatial and temporal annotations with XStandoff. In Balisage, The Markup Conference. Montréal, Canada

  267. Sugano Y, Bulling A: Seeing with humans: Gaze-assisted neural image captioning. arXiv preprint arXiv:1608.05203 (2016)

  268. Sun C, Gan C, Nevatia R.: Automatic concept discovery from parallel text and visual corpora. In: Proceedings of the 15th IEEE International Conference on Computer Vision (ICCV), pp. 2596–2604. IEEE, Santiago, Chile (2015)

  269. Swain M J, Ballard D H: Color indexing. International Journal of Computer Vision (IJCV). Springer 7(1), 11-32 (1991)

  270. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Rabinovich A: Going deeper with convolutions. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-9. IEEE, Boston, MA, USA (2015)

  271. Tabb M, Ahuja N: Multiscale image segmentation by integrated edge and region detection. IEEE Transactions on Image Processing (TIP). IEEE 6(5), 642-655 (1997)

  272. Tallapragada V S, Reddy D M, Kiran P S, Reddy D V: A Novel Medical Image Segmentation and Classification using Combined Feature Set and Decision Tree Classifier. International Journal of Research in Engineering and Technology (IJRET). 4(9), 83-86 (2016)

  273. Tamura H, Mori S, Yamawaki T: Textural features corresponding to visual perception. IEEE Transactions on Systems, Man, and Cybernetics. IEEE 8(6), 460-473 (1978)

  274. Tan, W., Wang, X., Zhang, Y., Zhou, B., Chen, X.: A conceptual prototype for digital media cloud. In: Proceedings of the 8th ChinaGrid Annual Conference (ChinaGrid), pp. 103-108. IEEE, Changchun, China (2013)

  275. Tang J, Hong R, Yan S, Chua TS, Qi GJ, Jain R (2011) Image annotation by k nn-sparse graph-based label propagation over noisily tagged web images. ACM Transactions on Intelligent Systems and Technology (TIST) 2(2):1–15

  276. Tang J, Chen Q, Wang M, Yan S, Chua TS, Jain R (2013) Towards optimizing human labeling for interactive image tagging. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 9(4):1–18

  277. Tang J, Yan S, Zhao C, Chua TS, Jain R (2013) Label-specific training set construction from web resource for image annotation. Signal Processing (SP) 93(8):2199–2204

  278. Tian D: Support vector machine for automatic image annotation. International Journal of Hybrid Information Technology (IJHIT). 8(11), 435-446 (2015)

  279. Tian Z, Shen C, Chen H, He T.: FCOS: Fully Convolutional One-Stage Object Detection. arXiv preprint arXiv:1904.01355 (2019)

  280. Yao T, Pan Y, Li Y, Qiu Z, Mei T: Boosting image captioning with attributes. In: Proceedings of the 16th IEEE International Conference on Computer Vision (ICCV), pp. 4904–4912. IEEE, Venice, Italy (2017)

  281. Torralba A, Russell BC, Yuen J (2010) Labelme: online image annotation and applications. Proc IEEE 98(8):1467–1484

  282. Town C, Sinclair D (2000) Content based image retrieval using semantic visual categories. Society of Manufacturing Engineers

  283. Tran K, He X, Zhang L, Sun J: Rich image captioning in the wild. In: Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 49–56. IEEE, Las Vegas, NV, USA (2016)

  284. Trelea IC (2003) The particle swarm optimization algorithm: convergence analysis and parameter selection. Information processing letters Elsevier science 85(6):317–325

  285. Tsai C F, McGarry K, Tait J: CLAIRE: A modular support vector image indexing and classification system. ACM Transactions on Information Systems (TOIS). ACM 24(3), 353-379 (2006)

  286. Tuceryan M, Jain A K: Texture analysis. In: Handbook of Pattern Recognition and Computer Vision, pp. 235-276 (1993)

  287. Tunga S, Jayadevappa D, Gururaj C: A comparative study of content based image retrieval trends and approaches. International Journal of Image Processing (IJIP). 9(3), 127-155 (2015)

  288. Tyagi V: Content-Based Image Retrieval Techniques: A Review. In: Proceeding of the 2017 Content-Based Image Retrieval, pp. 29-48. Springer, Singapore (2017)

  289. Ugarriza L G, Saber E, Vantaram S R, Amuso V, Shaw M, Bhaskar R: Automatic image segmentation by dynamic region growth and multiresolution merging. IEEE Transactions on Image Processing (TIP). IEEE 18(10), 2275-2288 (2009)

  290. Uijlings JR, Van De Sande KE, Gevers T, Smeulders AW (2013) Selective search for object recognition. International Journal of Computer Vision (IJCV) 104(2):154–171

  291. Vedaldi A, Gulshan V, Varma M, Zisserman A: Multiple kernels for object detection. In: Proceedings of the 12th IEEE International Conference on Computer Vision (ICCV), pp. 606-613. IEEE, Kyoto, Japan (2009)

  292. Vega F, Pérez W, Tello A, Saquicela V, Espinoza M, Vidal M, La Cruzc A: WebMedSA: a web-based framework for segmenting and annotating medical images using biomedical ontologies. In: Proceedings of the 11th International Symposium on Medical Information Processing and Analysis (SIPAIM), pp. 134-146, Cuenca, Ecuador (2015)

  293. Venugopalan S, Hendricks L A, Rohrbach M, Mooney R, Darrell T, Saenko K: Captioning images with diverse objects. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1170–1178. IEEE, Honolulu, HI, USA (2017)

  294. Verma Y, Jawahar C V: Image annotation using metric learning in semantic neighbourhoods. In: Proceedings of the 12th European Conference on Computer Vision (ECCV), pp. 836-849. Springer, Firenze, Italy (2012)

  295. Vincent L, Soille P: Watersheds in digital spaces: an efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis & Machine Intelligence (TPAMI). IEEE 13(6), 583-598 (1991)

  296. Visa A, Valkealahti K, Simula O: Cloud detection based on texture segmentation by neural network methods. In: Proceedings of the 1991 IEEE International Joint Conference on Neural Networks (IJCNN), pp. 1001-1006. IEEE, Singapore (1991)

  297. Von Ahn L, Dabbish L: Labeling images with a computer game. In: Proceedings of the 2004 ACM Conference on Human Factors in Computing Systems, pp. 319-326. ACM, Vienna, Austria (2004)

  298. Von Ahn L, Liu R, Blum M: Peekaboom: A game for locating objects in images. In: Proceedings of the 2006 ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 55–64. ACM, Montréal, Québec, Canada (2006)

  299. Wagstaff K, Cardie C, Rogers S, Schrödl S: Constrained K-means Clustering with Background Knowledge. In: Proceedings of the 18th International Conference on Machine Learning (ICML), pp. 577-584. ACM, Williamstown, MA, USA (2001)

  300. Wang Q, Chan A B: CNN+CNN: convolutional decoders for image captioning. arXiv preprint arXiv:1805.09019 (2018)

  301. Wang J Z, Li J: Learning-based linguistic indexing of pictures with 2-D MHMMs. In: Proceedings of the 10th ACM International Conference on Multimedia (MM), pp. 436-445. ACM, Juan-les-Pins, France (2002)

  302. Wang J Z, Li J, Wiederhold G: SIMPLIcity: Semantics-sensitive integrated matching for picture libraries. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). IEEE 23(9), 947-963 (2001)

  303. Wang C, Yan S, Zhang L, Zhang H J: Multi-label sparse coding for automatic image annotation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1643-1650. IEEE, Miami, FL, USA (2009)

  304. Wang T, Wu D J, Coates A, Ng A Y: End-to-end text recognition with convolutional neural networks. In: Proceedings of the 21st International Conference on Pattern Recognition (ICPR), pp. 3304-3308. IEEE, Tsukuba, Japan (2012)

  305. Wang XY, Zhang BB, Yang HY (2014) Content-based image retrieval by integrating color and texture features. Multimedia Tools and Applications Springer 68(3):545–569


  306. Wang R, Xie Y, Yang J, Xue L, Hu M, Zhang Q: Large scale automatic image annotation based on convolutional neural network. Journal of Visual Communication and Image Representation (JVCI). Elsevier science 49(C), 213-224 (2017)

  307. Wei Z, Luo X, Zhou F: Ontology based automatic image annotation using multi-class SVM. In: Proceedings of the 7th International Conference on Image and Graphics (ICIG), pp. 434-438. IEEE, Qingdao, China (2013)

  308. Wei Y, Liang X, Chen Y, Jie Z, Xiao Y, Zhao Y, Yan S (2016) Learning to segment with image-level annotations. Pattern Recognition (PR) 59:234–244


  309. Wei C, Huang J, Mansaray LR, Li Z, Liu W, Han J (2017) Estimation and mapping of winter oilseed rape LAI from high spatial resolution satellite data based on a hybrid method. Remote Sensing 9(5):488


  310. Wei-ning W, Ying-lin Y, Sheng-ming J: Image retrieval by emotional semantics: A study of emotional space and feature extraction. In: Proceedings of the 2006 IEEE International Conference on Systems, Man and Cybernetics (SMC), pp. 3534-3539. IEEE, Taipei, Taiwan (2006)

  311. Weston J, Bengio S, Usunier N: Wsabie: Scaling up to large vocabulary image annotation. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI), pp. 2764-2770. ACM, Barcelona, Catalonia, Spain (2011)

  312. Wojnar A, Pinheiro A M: Annotation of medical images using the SURF descriptor. In: Proceedings of the 9th IEEE International Symposium on Biomedical Imaging (ISBI), pp. 130-133. IEEE, Barcelona, Spain (2012)

  313. Wong R C, Leung C H: Automatic semantic annotation of real-world web images. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). IEEE 30(11), 1933-1944 (2008)

  314. Wong ST, Tjandra DA (1999) A digital library for biomedical imaging on the internet. IEEE Commun Mag 37(1):84–91


  315. Wu J, Yu Y, Huang C, Yu K: Deep multiple instance learning for image classification and auto-annotation. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3460-3469. IEEE, Boston, MA, USA (2015)

  316. Xu H, Zhou X, Wang M, Xiang Y, Shi B: Exploring Flickr's related tags for semantic annotation of web images. In: Proceedings of the 2009 ACM International Conference on Image and Video Retrieval (CIVR), p. 46. ACM, Santorini, Fira, Greece (2009)

  317. Xu Z, Luo X, Liu Y, Mei L, Hu C (2014) Measuring semantic relatedness between flickr images: from a social tag based view. Sci World J 2014(758089)

  318. Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Bengio Y: Show, attend and tell: Neural image caption generation with visual attention. In: Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 2048–2057. Lille, France (2015)

  319. Xue J, Li J, Gong Y: Restructuring of deep neural network acoustic models with singular value decomposition. In: Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech), pp. 2365-2369. Lyon, France (2013)

  320. Yang C, Dong M, Fotouhi F: Image content annotation using bayesian framework and complement components analysis. In: Proceedings of the 2005 IEEE International Conference on Image Processing (ICIP), pp. 1190-1193. IEEE, Genova, Italy (2005)

  321. Yang C, Dong M, Hua J: Region-based image annotation using asymmetrical support vector machine-based multiple-instance learning. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2057-2063. IEEE, New York, NY, USA (2006)

  322. Yang M, Kpalma K, Ronsin J: A survey of shape feature extraction techniques. Pattern Recognition. Elsevier science, pp. 43-90 (2008)

  323. Yang Y, Zhang W, Xie Y (2015) Image automatic annotation via multi-view deep representation. Journal of Visual Communication and Image Representation Elsevier science/ACM 33(2015):368–377


  324. Yang L, Tang K, Yang J, Li L J: Dense Captioning with Joint Inference and Visual Context. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1978-1987. IEEE, Honolulu, HI, USA (2017)

  325. Yao T, Pan Y, Li Y, Mei T: Incorporating copying mechanism in image captioning for learning novel objects. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5263–5271. IEEE, Honolulu, HI, USA (2017)

  326. Yavlinsky A, Schofield E, Rüger S: Automated image annotation using global features and robust nonparametric density estimation. In: Proceedings of the 4th International Conference on Image and Video Retrieval (CIVR), pp. 507-517. Springer, Singapore (2005)

  327. You D, Antani S, Demner-Fushman D, Thoma G R: A contour-based shape descriptor for biomedical image classification and retrieval. Document Recognition and Retrieval (DRR). Vol. 9021, p. 90210L (2014)

  328. You Q, Jin H, Wang Z, Fang C, Luo J: Image captioning with semantic attention. In: Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4651-4659. IEEE, Las Vegas, NV, USA (2016)

  329. Yue J, Li Z, Liu L, Fu Z (2011) Content-based image retrieval using color and texture fused features. Mathematical and Computer Modelling Elsevier science 54(3-4):1121–1127


  330. Zahn C T: Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Transactions on Computers (TC). IEEE 20(1), 68–86 (1971)

  331. Zhang H: The Optimality of Naive Bayes. In: Proceedings of the 17th International Conference of Florida AI Research Society (FLAIRS), pp. 17-19. Florida, USA (2004)

  332. Zhang D, Lu G (2004) Review of shape representation and description techniques. Pattern recognition Elsevier science 37(1):1–19


  333. Zhang ML, Zhou ZH (2007) ML-KNN: a lazy learning approach to multi-label learning. Pattern Recognition Elsevier science 40(7):2038–2048


  334. Zhang T, Ramakrishnan R, Livny M (1996) BIRCH: an efficient data clustering method for very large databases. In: ACM Sigmod Record ACM 25(2):103–114


  335. Zhang C, Chai J, Jin R: User term feedback in interactive text-based image retrieval. In: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 51-58. ACM, Salvador, Brazil (2005)

  336. Zhang D, Islam MM, Lu G (2012) A review on automatic image annotation techniques. Pattern Recognition Elsevier science 45(1):346–362


  337. Zhao Y, Zhao Y, Zhu Z (2009) TSVM-HMM: Transductive SVM based hidden Markov model for automatic image annotation. Expert Systems with Applications Elsevier science 36(6):9813–9818


  338. Zheng S, Jayasumana S, Romera-Paredes B, Vineet V, Su Z, Du D, Torr P H: Conditional random fields as recurrent neural networks. In: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1529-1537. IEEE, Santiago, Chile (2015)

  339. Zhou X, Zhuo J, Krahenbuhl P: Bottom-up object detection by grouping extreme and center points. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 850-859. IEEE, California, USA (2019)

  340. Zhu S C, Yuille A: Region Competition: Unifying Snakes, Region Growing, and Bayes/MDL for Multi-band Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). IEEE 18(9), 884-900 (1996)

  341. Zhu C, He Y, Savvides M: Feature selective anchor-free module for single-shot object detection. In: Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 840-849. IEEE, California, USA (2019)

  342. Znaidia A, Le Borgne H, Popescu A: CEA LIST's participation to visual concept detection task of imageCLEF 2011. In: Proceedings of the CLEF (Notebook Papers/Labs/Workshop) (2011)

  343. Zomahoun D E: Collaborative semantic annotation of images: ontology-based model. Signal & Image Processing: An International Journal (SIPIJ). 4(6), 71-81 (2013)

Author information

Corresponding author

Correspondence to Mariam Bouchakwa.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bouchakwa, M., Ayadi, Y. & Amous, I. A review on visual content-based and users’ tags-based image annotation: methods and techniques. Multimed Tools Appl 79, 21679–21741 (2020). https://doi.org/10.1007/s11042-020-08862-1

  • DOI: https://doi.org/10.1007/s11042-020-08862-1
