SSPSNet: a single shot panoptic segmentation network for accurate scene parsing

Wang, Qi; Wang, Yuanshuai; Zhou, Yuan; Wang, Jing; Jiang, Wuming; Zhang, Xiangde

doi:10.1007/s00521-021-06350-7

Original Article
Published: 22 August 2021

Volume 34, pages 677–688, (2022)
Neural Computing and Applications

Qi Wang¹,²,
Yuanshuai Wang¹,
Yuan Zhou¹,
Jing Wang¹,
Wuming Jiang³ &
Xiangde Zhang¹ (ORCID: orcid.org/0000-0003-4378-5381)

Abstract

Panoptic segmentation is a challenging task that aims to provide a comprehensive scene parsing result, and researchers have devoted considerable effort to improving its accuracy and efficiency. In this paper, we propose a single shot panoptic segmentation network (SSPSNet) to handle this task more accurately. SSPSNet extends the object detection network FCOS with a mask segmentation branch that predicts instance masks and a semantic segmentation branch that predicts the classes of background pixels. In addition, we design a parameter-free identical mapping connection module that adds shortcuts to the mask segmentation, FCOS classification and FCOS regression branches, respectively, to extract more expressive feature maps for the instance segmentation and object detection subtasks. More importantly, we design a parameter-free category and location aware module that transfers the category and location information of FCOS to the mask and semantic segmentation branches, improving their ability to distinguish instances and background. Experimental results show that the proposed SSPSNet achieves 44.0/45.8 PQ at 11.6/10.0 FPS on COCO-Panoptic 2017 with ResNet-50/101-FPN as the backbone, reaching state-of-the-art performance with fewer parameters and less computation.
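For readers unfamiliar with the PQ figures quoted above, the following is a minimal sketch of the standard panoptic quality metric for a single class. It is not taken from the article: the function names `iou` and `panoptic_quality` and the representation of segments as sets of pixel indices are assumptions made for illustration. A predicted segment is matched to a ground-truth segment when their IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by TP + 0.5 FP + 0.5 FN.

```python
# Illustrative sketch of panoptic quality (PQ) for one class.
# Segments are modelled as Python sets of pixel indices (an assumption
# made here for brevity; real evaluators work on label maps).

def iou(a, b):
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(pred_segments, gt_segments):
    """Return PQ = sum(IoU over matches) / (TP + 0.5*FP + 0.5*FN).

    A prediction matches a ground-truth segment when IoU > 0.5. For
    non-overlapping segments, each ground-truth segment can have at
    most one such match, so a simple scan is sufficient.
    """
    used = set()      # indices of predictions already matched
    iou_sum = 0.0
    tp = 0
    for gt in gt_segments:
        for i, pred in enumerate(pred_segments):
            if i in used:
                continue
            value = iou(gt, pred)
            if value > 0.5:
                used.add(i)
                iou_sum += value
                tp += 1
                break
    fp = len(pred_segments) - tp   # unmatched predictions
    fn = len(gt_segments) - tp     # unmatched ground-truth segments
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0


if __name__ == "__main__":
    gt = [{0, 1, 2, 3}, {4, 5}]
    pred = [{0, 1, 2}, {6, 7}]
    # One match with IoU 0.75, one false positive, one false negative.
    print(panoptic_quality(pred, gt))  # 0.375
```

Benchmark scores such as those reported on COCO-Panoptic average this per-class quantity over all categories and are usually quoted as percentages.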

Funding

This work is supported by the National Natural Science Foundation of China (Grant No. 61703088), the Fundamental Research Funds for the Central Universities (Grant No. N2105009) and the Doctoral Scientific Research Foundation of Liaoning Province (Grant No. 20170520326).

Author information

Authors and Affiliations

Department of Mathematics, College of Sciences, Northeastern University, Shenyang, 110819, China
Qi Wang, Yuanshuai Wang, Yuan Zhou, Jing Wang & Xiangde Zhang
Key Laboratory of Data Analytics and Optimization for Smart Industry (Northeastern University), Ministry of Education, Shenyang, China
Qi Wang
Beijing Eyecool Technology Co., Ltd, Beijing, 100089, China
Wuming Jiang

Corresponding author

Correspondence to Xiangde Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (AUX 11 KB)

Supplementary file2 (BLG 17 KB)

Supplementary file3 (LOG 30 KB)

Supplementary file4 (OUT 1 KB)

Supplementary file5 (FILE 722 KB)

About this article

Cite this article

Wang, Q., Wang, Y., Zhou, Y. et al. SSPSNet: a single shot panoptic segmentation network for accurate scene parsing. Neural Comput & Applic 34, 677–688 (2022). https://doi.org/10.1007/s00521-021-06350-7

Received: 06 February 2021
Accepted: 18 July 2021
Published: 22 August 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s00521-021-06350-7
