Adaptive knowledge graph for multi-label image classification

Lin, Zhihong; Tang, Xue-song; Hao, Kuangrong; Zhao, Mingbo; Li, Yubing

doi:10.1007/s10489-024-05845-9

Published: 25 November 2024

Volume 55, article number 20 (2025)

Applied Intelligence

Zhihong Lin^1,2,
Xue-song Tang ORCID: orcid.org/0000-0002-7594-2241^1,2,
Kuangrong Hao^1,2,
Mingbo Zhao^1,2 &
…
Yubing Li^1,2

Abstract

In multi-label image classification, recent studies often use Graph Convolutional Networks (GCNs) to model dependencies between category labels. However, existing GCN-based methods have two major drawbacks. First, an adjacency matrix built only from the label statistics of the dataset does not capture co-occurrence relationships comprehensively, and a fixed adjacency matrix may reduce the generalization of the model. Second, GCNs may suffer from over-smoothing during node updates. To address these problems, we propose a Multi-Label classification model based on an Adaptive Knowledge Graph (ML-AKG). ML-AKG has three parts: (1) an adaptive adjacency matrix, constructed from a knowledge graph, that captures better category label dependencies; (2) a residual connection between the input and output of each GCN layer, which alleviates the over-smoothing and vanishing-gradient problems of the GCN; and (3) a pre-trained multimodal model that replaces the traditional CNN as the image encoder. Extensive experiments on public multi-label image classification benchmarks verify the effectiveness of our method. Our model achieves mAPs of 80.1%, 94.1% and 94.6% on MS-COCO, VOC 2007 and VOC 2012, respectively.
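The residual connection in part (2) can be illustrated with a small sketch. This is not the authors' implementation: the preview does not give the layer equations, so the update H' = ReLU(Â H W) + H, the symmetric normalisation of the adjacency matrix, and the function names `normalize_adjacency` and `residual_gcn_layer` are all assumptions made for illustration. The adaptive, knowledge-graph-based adjacency matrix is likewise not described here, so a fixed toy matrix stands in for it.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Add self-loops and apply the symmetric normalisation D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def residual_gcn_layer(h, adj_norm, weight):
    """Assumed update rule: H' = ReLU(A_norm @ H @ W) + H.

    The weight matrix must be square so that the skip connection can add
    the input node features to the propagated ones element by element.
    """
    propagated = matmul(matmul(adj_norm, h), weight)
    return [
        [max(0.0, p) + x for p, x in zip(p_row, h_row)]
        for p_row, h_row in zip(propagated, h)
    ]


# Toy example: three labels, the first two co-occur, the third is isolated.
toy_adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_h = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]   # hypothetical label embeddings
identity = [[1.0, 0.0], [0.0, 1.0]]            # hypothetical layer weights

updated = residual_gcn_layer(toy_h, normalize_adjacency(toy_adj), identity)
print(updated)
```

With all-zero weights the propagated term vanishes and the layer returns its input unchanged, which is the property that lets stacked layers keep node features distinct instead of averaging them toward each other.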

Data availability and access

The data that support the findings of this study are available from the corresponding author on reasonable request.


Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 62176052) and the Natural Science Foundation of Shanghai (No. 21ZR1401700).

Author information

Authors and Affiliations

College of Information Sciences and Technology, Donghua University, Renmin North Road, Songjiang District, Shanghai, 201620, China
Zhihong Lin, Xue-song Tang, Kuangrong Hao, Mingbo Zhao & Yubing Li
Engineering Research Center of Digitized Textile & Apparel Technology, Ministry of Education, Donghua University, Renmin North Road, Songjiang District, Shanghai, 201620, China
Zhihong Lin, Xue-song Tang, Kuangrong Hao, Mingbo Zhao & Yubing Li

Contributions

Formal analysis and investigation: Xue-song Tang; Writing - original draft preparation: Zhihong Lin; Review and editing: Xue-song Tang, Kuangrong Hao, Mingbo Zhao, Yubing Li. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xue-song Tang.

Ethics declarations

Ethical and informed consent for data used

This manuscript has not been published or presented elsewhere and is not under consideration by another journal. We have read and understood your journal’s policies, and we believe that neither the manuscript nor the study violates any of these, and all authors have checked the manuscript and have agreed to the submission.

Competing Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, Z., Tang, Xs., Hao, K. et al. Adaptive knowledge graph for multi-label image classification. Appl Intell 55, 20 (2025). https://doi.org/10.1007/s10489-024-05845-9

Accepted: 10 November 2024
Published: 25 November 2024
DOI: https://doi.org/10.1007/s10489-024-05845-9
