Abstract
One-shot semantic segmentation methods aim to learn, from seen classes with abundant annotated samples, a meta-learning framework that can be applied to novel classes given only one annotated sample. However, most existing works still suffer reduced generalization on novel classes for two reasons: from the model's perspective, relying only on foreground and background prototypes generated from support samples can introduce semantic bias; from the data's perspective, negative support-query pairs can cause spatial inconsistency. To alleviate the semantic bias problem, we propose a multi-view prototype learning paradigm that reduces the appearance discrepancy between support and query images. In addition to the classical foreground and background prototypes, the multi-view prototypes include support outline, query foreground, seen-class object, and natural background view prototypes, which provide more refined semantic support information. To reduce the impact of negative samples, we propose a novel inference paradigm (n-iteration inference) that produces pseudo labels of novel classes as augmented support samples; these samples are then fed into the proposed multi-view prototype method for one-shot semantic segmentation. Experiments show that our method achieves new state-of-the-art performance on two standard benchmarks, PASCAL-5\( ^{i} \) and COCO-20\( ^{i} \). Furthermore, applying the inference paradigm to other classical methods also enhances their one-shot segmentation performance. Our source code will be available at https://github.com/WHL182/MVPNet.
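The paper's exact architecture is not reproduced here. As a rough illustration of the two mechanisms the abstract names, the following PyTorch sketch shows prototype extraction via masked average pooling, cosine-similarity matching of query features against a set of view prototypes, and an n-iteration inference loop in which the query prediction of one pass is fed back as a pseudo-labeled support sample for the next. All function names (extract_prototype, match_prototypes, n_iteration_inference) and the model interface are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def extract_prototype(feat, mask):
    """Masked average pooling: average the feature map over the masked
    region to obtain one class prototype vector per image.

    feat: (B, C, H, W) backbone features; mask: (B, 1, H, W) in {0, 1}.
    Returns: (B, C) prototype.
    """
    mask = F.interpolate(mask, size=feat.shape[-2:], mode="nearest")
    return (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)

def match_prototypes(query_feat, prototypes):
    """Cosine-similarity matching of query features against a list of
    view prototypes; returns per-prototype similarity maps (B, P, H, W)."""
    q = F.normalize(query_feat, dim=1)                      # (B, C, H, W)
    p = F.normalize(torch.stack(prototypes, dim=1), dim=2)  # (B, P, C)
    return torch.einsum("bchw,bpc->bphw", q, p)

def n_iteration_inference(model, support_img, support_mask, query_img, n=2):
    """Sketch of the n-iteration inference paradigm: the query prediction
    of iteration t becomes an augmented pseudo-labeled support sample for
    iteration t + 1. The `model(supports, query)` signature is hypothetical."""
    supports = [(support_img, support_mask)]
    pred = None
    for _ in range(n):
        pred = model(supports, query_img)             # (B, 1, H, W) logits
        pseudo_mask = (pred.sigmoid() > 0.5).float()  # binarized pseudo label
        supports = [(support_img, support_mask), (query_img, pseudo_mask)]
    return pred
```

In this sketch, each iteration augments the support set with the query's own pseudo label, which is one plausible reading of how the abstract's pseudo labels serve as augmented support samples; the actual criterion for accepting pseudo labels may differ in the paper.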
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grants 61871186 and 61771322.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Wang, H., Cao, G. & Cao, W. A novel inference paradigm based on multi-view prototypes for one-shot semantic segmentation. Appl Intell 53, 25771–25786 (2023). https://doi.org/10.1007/s10489-023-04922-9