Abstract
Panoptic segmentation suffers from competition between instance and semantic segmentation: the two tasks often make contradictory predictions for pixels that belong to instance objects. To mitigate this problem, we develop the deep chain instance segmentation network (ChaInNet). ChaInNet stacks two-branch chain blocks that alternately perform inter-reference learning, which improves feature extraction across network layers. Used for panoptic segmentation, ChaInNet extracts the contours of instance objects more accurately and improves instance segmentation accuracy, thereby reducing the adverse effects of segmentation competition on output quality. ChaInNet is a general instance segmentation architecture that can be applied to a wide range of object recognition tasks. Experimental results on the MS COCO and Cityscapes benchmark datasets show that ChaInNet achieves state-of-the-art segmentation and outperforms Mask R-CNN, which is commonly used to identify instance objects in panoptic segmentation.
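To make the notion of segmentation competition concrete, the following is a minimal toy sketch, not the method of the paper. It assumes a tiny hand-written semantic label map and a single instance mask, and lists the pixels on which the two predictions contradict each other. The class ids `ROAD` and `CAR`, the helper `find_conflicts`, and the data layout are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical class ids for this toy example (not taken from the paper).
ROAD, CAR = 0, 1
THING_CLASSES = {CAR}  # classes that should be covered by instance masks

Pixel = Tuple[int, int]


def find_conflicts(semantic: List[List[int]], instances: List[Dict]) -> List[Pixel]:
    """List pixels where the semantic map and the instance masks disagree."""
    # Map every pixel claimed by an instance mask to that instance's class.
    covered: Dict[Pixel, int] = {}
    for inst in instances:
        for px in inst["pixels"]:
            covered[px] = inst["class_id"]

    conflicts: List[Pixel] = []
    for r, row in enumerate(semantic):
        for c, label in enumerate(row):
            inst_label = covered.get((r, c))
            if inst_label is not None and inst_label != label:
                # The instance mask claims an object, the semantic map disagrees.
                conflicts.append((r, c))
            elif inst_label is None and label in THING_CLASSES:
                # The semantic map claims an object, but no instance covers it.
                conflicts.append((r, c))
    return conflicts


# Toy 3x3 semantic prediction (assumed data).
semantic = [
    [ROAD, ROAD, CAR],
    [ROAD, CAR, CAR],
    [ROAD, CAR, ROAD],
]
# One predicted car instance whose mask is a 2x2 square (assumed data).
instances = [{"class_id": CAR, "pixels": [(1, 1), (1, 2), (2, 1), (2, 2)]}]

print(find_conflicts(semantic, instances))  # [(0, 2), (2, 2)]
```

In this toy case the two predictions agree on three car pixels and contradict each other on two. Any rule for merging them into one panoptic output has to pick a winner on such pixels, so a more accurate instance contour leaves fewer of them to resolve.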
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 61673084) and the Natural Science Foundation of Liaoning Province (Grant No. 20180550866).
Ethics declarations
Declarations of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Mao, L., Ren, F., Yang, D. et al. ChaInNet: Deep Chain Instance Segmentation Network for Panoptic Segmentation. Neural Process Lett 55, 615–630 (2023). https://doi.org/10.1007/s11063-022-10899-2
DOI: https://doi.org/10.1007/s11063-022-10899-2