
A meaningful learning method for zero-shot semantic segmentation

  • Research Paper
  • Published in Science China Information Sciences

Abstract

Zero-shot semantic segmentation, which aims to segment unseen categories, has attracted increasing attention due to its strong practical value. Previous approaches usually transfer a semantic-visual mapping learned on seen categories directly to unseen categories; they therefore fail to generate meaningful unseen visual representations and struggle to balance seen and unseen concepts in the classifier. To overcome these limitations, we propose a novel meaningful learning method, inspired by educational psychology, that can be embedded into any generation-based zero-shot semantic segmentation model. Meaningful learning refers to the process by which new concepts are learned by relating them to existing, comprehensible concepts and harmoniously incorporating them into the concept schema. Specifically, we introduce a generator with conjugate conceptual correlation (G3C), which generates meaningful unseen visual information by anchoring it to existing concepts. Moreover, simulating the rational thinking mechanism, we introduce a fast-slow concept modulator that alleviates the noisy over-correlation introduced by G3C and further constructs a comprehensive concept schema. Extensive experiments on three benchmarks demonstrate the superior performance of our method, especially in terms of the widely adopted harmonic mean IoU (h-mIoU), e.g., a 4% improvement on the Pascal-VOC dataset.
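The abstract does not detail the internals of G3C or the fast-slow concept modulator, so the sketch below is only a hedged illustration of the two ideas it describes: synthesizing unseen visual features by correlating unseen semantic embeddings with seen concepts, and smoothing the result with a fast/slow update to damp noisy over-correlation. Everything here is an assumption for illustration: the toy dimensions, the softmax anchoring, and the EMA blend factors are not taken from the paper.

```python
# Hypothetical NumPy sketch of concept-anchored feature generation plus a
# fast-slow modulation step. All shapes, temperatures, and blend factors are
# illustrative assumptions, not the paper's actual G3C or modulator.
import numpy as np

rng = np.random.default_rng(0)

def cosine_similarity(a, b):
    """Row-wise cosine similarity between two embedding matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy setup: 15 seen / 5 unseen classes, 300-d semantic vectors, 256-d visual features.
seen_sem   = rng.normal(size=(15, 300))  # seen-class semantic embeddings
unseen_sem = rng.normal(size=(5, 300))   # unseen-class semantic embeddings
seen_proto = rng.normal(size=(15, 256))  # seen-class visual prototypes

# Anchoring step: correlate each unseen concept with the seen concepts and
# synthesize its visual representation as a similarity-weighted mixture of
# seen prototypes, keeping generated features tied to comprehensible concepts.
corr = cosine_similarity(unseen_sem, seen_sem)      # (5, 15) concept correlations
weights = np.exp(corr / 0.1)                        # temperature-scaled scores
weights /= weights.sum(axis=1, keepdims=True)       # softmax over seen classes
unseen_feat = weights @ seen_proto                  # (5, 256) synthesized features

# Fast-slow modulation (assumed form): a fast estimate follows each noisy batch
# of generated features, while a slow exponential moving average damps the
# over-correlation noise that pure anchoring can introduce.
fast = unseen_feat.copy()
slow = np.zeros_like(fast)
for _ in range(100):
    noisy_batch = unseen_feat + 0.1 * rng.normal(size=unseen_feat.shape)
    fast = noisy_batch                              # fast path: track the batch
    slow = 0.95 * slow + 0.05 * fast                # slow path: EMA smoothing
classifier_weights = 0.5 * fast + 0.5 * slow        # blended concept schema
print(classifier_weights.shape)                     # (5, 256)
```

In this reading, anchoring keeps every synthesized unseen feature expressible in terms of seen prototypes, while the slow path filters out spurious correlations that the fast path picks up from individual noisy batches.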



Acknowledgements This work was supported by the National Natural Science Foundation of China (Grant Nos. 62206010, 62022009).

Corresponding author

Correspondence to Yuqing Ma.


Cite this article

Liu, X., Bai, S., An, S. et al. A meaningful learning method for zero-shot semantic segmentation. Sci. China Inf. Sci. 66, 210103 (2023). https://doi.org/10.1007/s11432-022-3748-5

