Image semantic segmentation with an improved fully convolutional network

Tseng, Kuo-Kun; Sun, Haichuan; Liu, Junwu; Li, Jiaqi; Yung, K. L.; Ip, W. H.

doi:10.1007/s00500-019-04537-8

Published: 23 November 2019

Volume 24, pages 8253–8273, (2020)
Soft Computing

Kuo-Kun Tseng¹ (ORCID: orcid.org/0000-0002-3340-8710),
Haichuan Sun¹,
Junwu Liu¹,
Jiaqi Li¹,
K. L. Yung² &
W. H. Ip²,³

Abstract

With the development of deep learning and the emergence of unmanned driving, fully convolutional networks have become a feasible and effective approach to image semantic segmentation. DeepLab is an algorithm based on fully convolutional networks. However, the DeepLab algorithm still has room for improvement, and we design three improvements: (1) a global context structure module, (2) a highly efficient decoder module, and (3) a multi-scale feature fusion module. The experimental results show that the three proposed improvements enable the model to obtain more expressive features and improve the accuracy of the algorithm. We also run experiments on the Cityscapes dataset to further verify the robustness and effectiveness of the improved algorithm. Finally, the improved algorithm is applied to a real-world scene, where it shows some practical value.

Funding

This study was funded by the Shenzhen Government (Grant Nos. KQJSCX20170726104033357, JCYJ20150513151706567 and JCYJ20160531191837793). This work was also partially supported by the Department of Industrial and Systems Engineering of the Hong Kong Polytechnic University (Grant No. H-ZG3K).

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China
Kuo-Kun Tseng, Haichuan Sun, Junwu Liu & Jiaqi Li
Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China
K. L. Yung & W. H. Ip
Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, Canada
W. H. Ip

Corresponding author

Correspondence to Kuo-Kun Tseng.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by Mu-Yen Chen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tseng, KK., Sun, H., Liu, J. et al. Image semantic segmentation with an improved fully convolutional network. Soft Comput 24, 8253–8273 (2020). https://doi.org/10.1007/s00500-019-04537-8

Issue Date: June 2020
