Abstract
One-shot semantic segmentation methods aim to learn, from seen classes with abundant annotated samples, a meta-learning framework that can be applied to novel classes given only one annotated sample. However, most existing works still suffer reduced generalization on novel classes for two reasons: from the model's perspective, relying only on foreground and background prototypes generated from support samples can introduce semantic bias; from the data's perspective, negative support-query pairs can cause spatial inconsistency. To alleviate the semantic bias problem, we propose a multi-view prototype learning paradigm that reduces the appearance discrepancy between support and query images. In addition to the classical foreground and background prototypes, the multi-view prototypes include support outline, query foreground, seen-class object, and natural background view prototypes, which provide more refined semantic support information. To reduce the impact of negative samples, we propose a novel inference paradigm (n-iteration inference) that produces pseudo labels of novel classes as augmented support samples; these samples are then fed into the proposed multi-view prototype method for one-shot semantic segmentation. Experiments show that our method achieves new state-of-the-art performance on two standard benchmarks, PASCAL-5\( ^{i} \) and COCO-20\( ^{i} \). Furthermore, applying the inference paradigm to other classical methods also enhances their one-shot segmentation performance. Our source code will be available at https://github.com/WHL182/MVPNet.
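The paper's exact architecture is not reproduced here. As a rough illustration of the two mechanisms the abstract names, the following PyTorch sketch shows prototype extraction via masked average pooling, cosine-similarity matching of query features against a set of view prototypes, and an n-iteration inference loop in which the query prediction of one pass is fed back as a pseudo-labeled support sample for the next. All function names (extract_prototype, match_prototypes, n_iteration_inference) and the model interface are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def extract_prototype(feat, mask):
    """Masked average pooling: average the feature map over the masked
    region to obtain one class prototype vector per image.

    feat: (B, C, H, W) backbone features; mask: (B, 1, H, W) in {0, 1}.
    Returns: (B, C) prototype.
    """
    mask = F.interpolate(mask, size=feat.shape[-2:], mode="nearest")
    return (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)

def match_prototypes(query_feat, prototypes):
    """Cosine-similarity matching of query features against a list of
    view prototypes; returns per-prototype similarity maps (B, P, H, W)."""
    q = F.normalize(query_feat, dim=1)                      # (B, C, H, W)
    p = F.normalize(torch.stack(prototypes, dim=1), dim=2)  # (B, P, C)
    return torch.einsum("bchw,bpc->bphw", q, p)

def n_iteration_inference(model, support_img, support_mask, query_img, n=2):
    """Sketch of the n-iteration inference paradigm: the query prediction
    of iteration t becomes an augmented pseudo-labeled support sample for
    iteration t + 1. The `model(supports, query)` signature is hypothetical."""
    supports = [(support_img, support_mask)]
    pred = None
    for _ in range(n):
        pred = model(supports, query_img)             # (B, 1, H, W) logits
        pseudo_mask = (pred.sigmoid() > 0.5).float()  # binarized pseudo label
        supports = [(support_img, support_mask), (query_img, pseudo_mask)]
    return pred
```

In this sketch, each iteration augments the support set with the query's own pseudo label, which is one plausible reading of how the abstract's pseudo labels serve as augmented support samples; the actual criterion for accepting pseudo labels may differ in the paper.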
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grants 61871186 and 61771322.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Wang, H., Cao, G. & Cao, W. A novel inference paradigm based on multi-view prototypes for one-shot semantic segmentation. Appl Intell 53, 25771–25786 (2023). https://doi.org/10.1007/s10489-023-04922-9