Convolution-deconvolution architecture with the pyramid pooling module for semantic segmentation

Malekijoo, Amirhossein; Fadaeieslam, Mohammad Javad

doi:10.1007/s11042-019-07990-7

Published: 02 August 2019

Volume 78, pages 32379–32392, (2019)

Multimedia Tools and Applications

Abstract

Recognizing the content of an image is a central challenge in machine vision, and semantic segmentation is one of the most important ways to address it. It is used in applications such as autonomous driving, indoor navigation, virtual and augmented reality systems, and recognition tasks. This paper introduces P-DecovNet, a novel and practical deep fully convolutional neural network architecture for pixel-wise semantic segmentation. The proposed architecture combines the Convolution-Deconvolution Neural Network architecture with the Pyramid Pooling Module. High-level features are extracted from the image by the Convolutional Neural Network, and the Pyramid Pooling Module is added to the architecture to reinforce local information. The CamVid road scene dataset was used to evaluate P-DecovNet. Across several criteria (including, but not limited to, accuracy and mIoU), the experimental results show that P-DecovNet performs well among Convolution-Deconvolution networks. It achieves this performance with fewer training images and fewer iterations than existing methods.
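To make the idea of a pyramid pooling module concrete, the following is a minimal sketch in plain Python on a single-channel feature map. It is not the authors' implementation. The bin sizes (1, 2, 3, 6) are an assumption borrowed from the Pyramid Scene Parsing Network of Zhao et al. (2017); the abstract does not state which sizes P-DecovNet uses. Nearest-neighbour upsampling stands in for the interpolation a real network would use, the per-level 1x1 convolutions are omitted, and the function names (`adaptive_avg_pool`, `upsample_nearest`, `pyramid_pool`) are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]


def adaptive_avg_pool(fmap: Grid, bins: int) -> Grid:
    """Average-pool a 2D map into a bins x bins grid.

    Window i covers rows floor(i*h/bins) .. ceil((i+1)*h/bins) - 1,
    and likewise for columns, so every window is non-empty.
    """
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        r0 = (i * h) // bins
        r1 = ((i + 1) * h + bins - 1) // bins  # integer ceiling
        row = []
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            window = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


def upsample_nearest(grid: Grid, h: int, w: int) -> Grid:
    """Resize a small grid back to h x w by nearest-neighbour lookup."""
    gh, gw = len(grid), len(grid[0])
    return [[grid[(r * gh) // h][(c * gw) // w] for c in range(w)]
            for r in range(h)]


def pyramid_pool(fmap: Grid, bin_sizes: Sequence[int] = (1, 2, 3, 6)) -> List[Grid]:
    """Return the input map followed by one pooled-and-upsampled map per level.

    The returned list plays the role of channel-wise concatenation:
    coarse levels carry global context, fine levels carry local detail.
    The default bin sizes are an assumption, not taken from the paper.
    """
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(fmap, bins)
        channels.append(upsample_nearest(pooled, h, w))
    return channels
```

Calling `pyramid_pool` on a 6x6 map returns five 6x6 maps: the original plus one per pyramid level. In a full network, a stack like this would typically be concatenated along the channel axis and passed on to the decoder.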

Notes

https://github.com/malekijoo/P-DecovNet.

Author information

Authors and Affiliations

Electrical and Computer Engineering Department, Semnan University, Semnan, Iran
Amirhossein Malekijoo & Mohammad Javad Fadaeieslam

Corresponding author

Correspondence to Amirhossein Malekijoo.

Ethics declarations

Conflict of interest

The authors declared no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Malekijoo, A., Fadaeieslam, M.J. Convolution-deconvolution architecture with the pyramid pooling module for semantic segmentation. Multimed Tools Appl 78, 32379–32392 (2019). https://doi.org/10.1007/s11042-019-07990-7

Received: 01 July 2018
Revised: 21 May 2019
Accepted: 10 July 2019
Published: 02 August 2019
Issue Date: November 2019
DOI: https://doi.org/10.1007/s11042-019-07990-7
