Multi-label guided graph attention network for education image retrieval

  • Original Paper
Signal, Image and Video Processing

Abstract

In recent years, deep learning has achieved remarkable success thanks to advanced neural network architectures and large-scale, manually labeled datasets. However, labeling large datasets accurately and efficiently is often costly and challenging, particularly in fields that require specialized labeling expertise, such as healthcare. In this context, a model capable of large-scale image retrieval without extensive manual labeling is needed. This study proposes GLGM (Graph network combined with Local and Global features based on Multi-label techniques), a multi-label learning method based on attentive graph convolutions, to address detailed classification with coarsely labeled datasets. Specifically, within a contrastive learning framework, our method generates labels that are interconnected through graph convolutions. Unlike self-supervised contrastive learning methods that link global and local image features to create a graph that represents specific object characteristics, GLGM introduces a common search space that supports image retrieval, both in the educational field and in general, based on advanced sample distance search algorithms. We demonstrate that GLGM encompasses many state-of-the-art approaches as special cases. Experiments show that GLGM achieves significant improvements over existing advanced methods on various datasets, including CIFAR-10 and MLIC-Edu (a dataset we collected ourselves for the educational image retrieval domain).
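To make the idea of retrieval in a common search space concrete, the following is a minimal sketch of nearest-neighbor search over image embeddings. It is not the GLGM implementation: the function names `l2_normalize` and `retrieve_top_k`, the toy three-dimensional vectors, and the choice of brute-force cosine similarity are all assumptions made for illustration. The abstract does not specify which distance search algorithm is used, and a practical system would likely rely on an indexed similarity search library rather than a linear scan.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieve_top_k(query, gallery, k=3):
    """Return (index, cosine similarity) pairs for the k gallery embeddings
    most similar to the query, ordered from most to least similar."""
    q = l2_normalize(query)
    scored = []
    for idx, item in enumerate(gallery):
        g = l2_normalize(item)
        # Dot product of unit vectors equals cosine similarity.
        sim = sum(a * b for a, b in zip(q, g))
        scored.append((idx, sim))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings standing in for features of four gallery images.
gallery = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
# Hypothetical embedding of the query image.
query = [1.0, 0.05, 0.0]

results = retrieve_top_k(query, gallery, k=2)
top_indices = [idx for idx, _ in results]
print(top_indices)
```

In this sketch, both query and gallery images are assumed to have been mapped into the same embedding space beforehand, so retrieval reduces to ranking gallery items by similarity to the query.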

Data availability

The paper uses public data from CIFAR-10 and our own dataset, MLIC-Edu. The use of data in this study follows the guidelines set by the datasets' authors.

Acknowledgements

This research is funded by the Posts and Telecommunications Institute of Technology (PTIT), Vietnam under grant number ‘12-2024-HV-CNTT1’. The authors would like to thank PTIT for the financial support.

Author information

Contributions

All authors contributed to the study’s conception and design. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Huu Quynh Nguyen.

Ethics declarations

Conflict of interest

No conflict of interest exists in the submission of this manuscript, and the manuscript is approved by all authors for publication.

Ethics approval

Consent was obtained from all participants prior to their involvement in the study, and they were informed of their right to withdraw at any time without consequence.

Consent to participate

All authors agreed to participate in the construction and development of this research topic.

Consent to publication

All authors agreed to make this study public.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nguyen, V.T., Nguyen, H.Q., Tran, A.D. et al. Multi-label guided graph attention network for education image retrieval. SIViP 19, 19 (2025). https://doi.org/10.1007/s11760-024-03630-2

  • DOI: https://doi.org/10.1007/s11760-024-03630-2
