Adaptive image annotation: refining labels according to contents and relations

Xiao, Fen; Chen, Yuyu; Zhang, Yiming; Gong, Xue; Gao, Xieping

doi:10.1007/s00521-021-06866-y

Original Article
Published: 30 January 2022

Volume 34, pages 7271–7282 (2022)

Neural Computing and Applications

Fen Xiao¹, Yuyu Chen¹, Yiming Zhang¹, Xue Gong¹ & Xieping Gao¹ (ORCID: orcid.org/0000-0002-7764-3616)

Abstract

Image annotation has long been an active research topic in computer vision. Most prior work annotates every image with a fixed number of labels, but it is unreasonable to assign all images the same number of labels without taking their contents into consideration. In this paper, we present an extensive survey of recent work on image annotation with label-to-image semantic relevance and propose a general framework for adaptive image annotation. Compared with previous image annotation methods, the proposed framework is novel in the following aspects: (1) It predicts the number of labels for each image according to its visual features, which is more reasonable and practical for real-world image annotation. (2) It models label-to-image relevance with similar images and related labels, which can generate abundant candidate labels. (3) It progressively refines the image label sets, so that the selected label set is truly representative and contains few redundancies. Experimental results on two benchmark multi-label image annotation datasets demonstrate that the proposed model outperforms prior state-of-the-art approaches.
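
The three aspects listed in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the models used, so the sketch assumes neighbor voting for label-to-image relevance, a label co-occurrence table for related labels, the average label count of the nearest neighbors as the predicted label number, and a greedy redundancy penalty for refinement. All function names, parameters (`n_neighbors`, `alpha`, `beta`) and data structures are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def nearest(query, train_feats, n_neighbors):
    """Return (similarity, index) pairs for the most similar training images."""
    sims = ((cosine(query, f), i) for i, f in enumerate(train_feats))
    return sorted(sims, reverse=True)[:n_neighbors]


def predict_label_count(query, train_feats, train_labels, n_neighbors=2):
    """Assumed stand-in for aspect (1): average label count of the neighbors."""
    nbrs = nearest(query, train_feats, n_neighbors)
    return max(1, round(sum(len(train_labels[i]) for _, i in nbrs) / len(nbrs)))


def candidate_scores(query, train_feats, train_labels, cooccur,
                     n_neighbors=2, alpha=0.5):
    """Assumed stand-in for aspect (2): neighbor votes plus related labels."""
    votes = {}
    for sim, i in nearest(query, train_feats, n_neighbors):
        for lab in train_labels[i]:
            votes[lab] = votes.get(lab, 0.0) + sim
    scores = dict(votes)
    # Spread a fraction of each vote to labels that co-occur with it.
    for lab, v in votes.items():
        for other, w in cooccur.get(lab, {}).items():
            scores[other] = scores.get(other, 0.0) + alpha * w * v
    return scores


def refine(scores, redundancy, k, beta=1.0):
    """Assumed stand-in for aspect (3): greedily keep k labels, penalizing
    candidates that are redundant with labels already selected."""
    chosen, remaining = [], dict(scores)
    while remaining and len(chosen) < k:
        def gain(lab):
            penalty = max((redundancy.get(frozenset((lab, c)), 0.0)
                           for c in chosen), default=0.0)
            return remaining[lab] - beta * penalty
        best = max(sorted(remaining), key=gain)
        chosen.append(best)
        del remaining[best]
    return chosen


# Toy data, invented purely for illustration.
train_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
train_labels = [["dog", "grass"], ["dog", "puppy"], ["car"]]
cooccur = {"dog": {"animal": 1.0}}
redundancy = {frozenset(("dog", "puppy")): 1.0}

query = [1.0, 0.0]
k = predict_label_count(query, train_feats, train_labels)
scores = candidate_scores(query, train_feats, train_labels, cooccur)
print(refine(scores, redundancy, k))
```

In this toy setting the number of labels kept depends on the query's neighbors rather than on a global constant, and a candidate that is highly redundant with an already selected label is pushed down the ranking even when its raw relevance score is high.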

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61771415, 61802328), the Natural Science Foundation of Hunan Province, China (Grant No. 2018JJ2405), and the Scientific Research Fund of Hunan Provincial Education Department (Grant No. 18K034).

Author information

Authors and Affiliations

Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, Hunan, China
Fen Xiao, Yuyu Chen, Yiming Zhang, Xue Gong & Xieping Gao

Corresponding author

Correspondence to Xieping Gao.

Ethics declarations

Conflict of interest

All authors disclosed no relevant relationships. There are no other relationships or activities that could appear to have influenced the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiao, F., Chen, Y., Zhang, Y. et al. Adaptive image annotation: refining labels according to contents and relations. Neural Comput & Applic 34, 7271–7282 (2022). https://doi.org/10.1007/s00521-021-06866-y

Received: 24 February 2021
Accepted: 12 December 2021
Published: 30 January 2022
Issue Date: May 2022
DOI: https://doi.org/10.1007/s00521-021-06866-y
