Abstract
Existing Few-Shot Semantic Segmentation (FSS) methods often focus on extracting semantic information from support images to guide the segmentation of query images, while paying far less attention to the query branch itself. However, because the number of support images is limited, there is significant intra-class variance between support and query images. Moreover, relying solely on a single support prototype to guide query segmentation often yields inaccurate segmentation boundaries, which degrades the model's performance. In this study, we consider information extraction from both the support and query branches and propose a Dual-branch Aggregation and Edge Refinement (DAER) network for accurate query image segmentation. Specifically, to better exploit the query branch, we introduce an Initial Mask Generation Module (IMGM) that generates an initial mask for the query image. We further propose a Dual-Branch Aggregation Module (DBAM) that captures information from the support and query branches simultaneously. Finally, an Edge Refinement Module (ERM) integrates additional query-specific positional information into the network. Extensive experiments on standard few-shot semantic segmentation benchmarks, PASCAL-\(5^i\) and COCO-\(20^i\), demonstrate the effectiveness of the proposed method.
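The abstract names the three modules but does not specify their internals. As a rough, non-authoritative sketch of the prototype-based pipeline it outlines, the following PyTorch snippet assumes masked average pooling for the support prototype, a cosine-similarity prior as a stand-in for the IMGM initial mask, and a simple average of support and query prototypes as a stand-in for DBAM; ERM is omitted. All function names and the equal fusion weight are hypothetical and are not taken from the paper.

import torch
import torch.nn.functional as F


def masked_average_pooling(feat, mask):
    # feat: (B, C, H, W) features; mask: (B, 1, h, w) foreground mask.
    # Resize the mask to the feature resolution, then average foreground features.
    mask = F.interpolate(mask, size=feat.shape[-2:], mode="bilinear", align_corners=False)
    return (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)  # (B, C)


def daer_style_forward(supp_feat, supp_mask, query_feat):
    # Support branch: prototype from masked average pooling (standard FSS practice).
    supp_proto = masked_average_pooling(supp_feat, supp_mask)            # (B, C)

    # Query branch (stand-in for IMGM): cosine similarity between the support
    # prototype and query features gives a coarse initial query mask in [0, 1].
    sim = F.cosine_similarity(query_feat, supp_proto[:, :, None, None].expand_as(query_feat), dim=1)
    lo, hi = sim.amin(dim=(1, 2), keepdim=True), sim.amax(dim=(1, 2), keepdim=True)
    init_mask = ((sim - lo) / (hi - lo + 1e-6)).unsqueeze(1)             # (B, 1, H, W)

    # Dual-branch aggregation (stand-in for DBAM): pool a query prototype with the
    # initial mask and average it with the support prototype.
    query_proto = masked_average_pooling(query_feat, init_mask)
    fused_proto = 0.5 * (supp_proto + query_proto)                       # hypothetical equal weighting

    # Dense prediction from the fused prototype; edge refinement (ERM) is omitted here.
    pred = F.cosine_similarity(query_feat, fused_proto[:, :, None, None].expand_as(query_feat), dim=1)
    return init_mask, pred                                               # (B, 1, H, W), (B, H, W)


if __name__ == "__main__":
    B, C, H, W = 2, 256, 60, 60
    supp_feat, query_feat = torch.randn(B, C, H, W), torch.randn(B, C, H, W)
    supp_mask = (torch.rand(B, 1, H, W) > 0.5).float()
    init_mask, pred = daer_style_forward(supp_feat, supp_mask, query_feat)
    print(init_mask.shape, pred.shape)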
Data Availability
Data will be made available on request.
Funding
No funding was obtained for this study.
Author information
Contributions
Qingsong Tang: Methodology, Investigation, Writing - original draft and review. Yalei Ren: Investigation, Code, Writing - review and editing. Zhanghui Han: Investigation, Data curation, Visualization. Chenyan Bao: Data curation, Code. Yang Liu: Investigation, Writing - review and editing.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this study.
Additional information
Communicated by Junyu Gao.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tang, Q., Ren, Y., Shan, Z. et al. Dual-branch aggregation and edge refinement network for few shot semantic segmentation. Multimedia Systems 31, 142 (2025). https://doi.org/10.1007/s00530-025-01718-4
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s00530-025-01718-4