Deep-recursive residual network for image semantic segmentation

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

A good semantic segmentation method for visual scene understanding should balance accuracy and efficiency. However, existing networks tend to concentrate only on segmentation quality rather than on simplifying the network. The result is a heavy model that is difficult to deploy on hardware with limited memory. To address this problem, in this paper we develop a novel architecture that uses recursive blocks to reduce parameters and improve prediction, since a recursive block can improve performance without introducing new parameters for the additional convolutions. Specifically, to mitigate the difficulty of training the recursive block, we adopt a residual unit that gives the data more paths to flow through, and we use a concatenation layer to combine the output maps of the recursive convolution layers, which share the same resolution but have different fields-of-view. As a result, richer semantic information is included in the feature maps, which helps achieve satisfactory pixel-wise prediction. Building on this strategy, we also extend it to enhance Mask R-CNN for instance segmentation. Extensive experiments on benchmark datasets, including DeepFashion, Cityscapes and PASCAL VOC 2012, show that our method improves segmentation results while reducing the number of parameters.
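The core idea in the abstract, applying the same set of weights recursively with residual shortcuts and concatenating the intermediate outputs, can be illustrated with a minimal sketch. This is not the paper's implementation; it is a simplified 1D analogue in NumPy, where a single shared weight matrix `W` stands in for a shared convolution, and the function name `recursive_residual_block` and the recursion depth are illustrative choices:

```python
import numpy as np

def recursive_residual_block(x, W, num_recursions=3):
    """Apply the same weights W repeatedly (no new parameters per step),
    add the input back as a residual shortcut at each recursion, and
    concatenate all intermediate outputs so that features with different
    effective fields-of-view are fused together."""
    outputs = []
    h = x
    for _ in range(num_recursions):
        h = np.tanh(W @ h) + x   # shared weights + residual connection
        outputs.append(h)
    # Concatenation combines maps of the same resolution but
    # different effective receptive fields.
    return np.concatenate(outputs)

rng = np.random.default_rng(0)
x = rng.standard_normal(8)          # toy "feature map"
W = rng.standard_normal((8, 8)) * 0.1  # one shared parameter set
y = recursive_residual_block(x, W)
print(y.shape)  # three recursions x 8 features, from a single W
```

The point of the sketch is the parameter count: three applications of the transform cost the same number of parameters as one, which is the mechanism the paper relies on to keep the network light.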




Acknowledgements

The work is supported by the National Natural Science Foundation of China (Nos. 61971121, 61601112), the Fundamental Research Funds for the Central Universities and DHU Distinguished Young Professor Program.

Author information

Corresponding author

Correspondence to Mingbo Zhao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, Y., Li, X., Lin, M. et al. Deep-recursive residual network for image semantic segmentation. Neural Comput & Applic 32, 12935–12947 (2020). https://doi.org/10.1007/s00521-020-04738-5

