Online Hard Region Mining for Semantic Segmentation

Yin, Jin; Xia, Pengfei; He, Jingsong

doi:10.1007/s11063-019-10047-3

Published: 14 May 2019

Volume 50, pages 2665–2679 (2019)

Neural Processing Letters

818 Accesses
6 Citations

Abstract

Recent work in semantic segmentation has made significant progress by enlarging receptive fields or capturing contextual information. Semantic segmentation is commonly treated as a per-pixel classification problem. Hard-to-discriminate regions in an image limit segmentation accuracy. In this work, we propose an approach that increases attention to local segmentation performance through region-based hard region mining. To evaluate it, we experiment with two semantic segmentation networks, DeepLab v3 and FCN, on three popular semantic segmentation datasets: PASCAL VOC 2012, PASCAL Context and CamVid. Our experimental results show consistent improvement, demonstrating the efficacy of our approach.
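
The abstract does not spell out the mining procedure, so the following is only an illustrative sketch of what region-based hard mining could look like, not the authors' method. It assumes a per-pixel cross-entropy loss map, non-overlapping square regions, and a top-k selection of the regions with the highest mean loss. The names `pixel_cross_entropy`, `hard_region_loss`, `region` and `keep_ratio` are hypothetical and chosen for this example.

```python
import numpy as np


def pixel_cross_entropy(logits, labels):
    """Per-pixel cross-entropy.

    logits: float array of shape (C, H, W); labels: int array of shape (H, W).
    Returns a loss map of shape (H, W).
    """
    # Subtract the per-pixel maximum for numerical stability.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    rows, cols = np.indices(labels.shape)
    return -log_probs[labels, rows, cols]


def hard_region_loss(logits, labels, region=4, keep_ratio=0.25):
    """Average the loss over only the hardest regions (assumed scheme).

    The image is split into non-overlapping region x region tiles, each tile
    is scored by its mean pixel loss, and the top keep_ratio fraction of
    tiles is kept. Returns (loss, boolean mask of selected tiles).
    """
    loss_map = pixel_cross_entropy(logits, labels)
    height, width = loss_map.shape
    if height % region or width % region:
        raise ValueError("image size must be divisible by the region size")

    # Mean loss of every tile, shape (H // region, W // region).
    tiles = loss_map.reshape(height // region, region, width // region, region)
    region_loss = tiles.mean(axis=(1, 3))

    # Keep at least one tile; pick those with the largest mean loss.
    flat = region_loss.ravel()
    k = max(1, int(round(keep_ratio * flat.size)))
    hard_idx = np.argsort(flat)[-k:]

    mask = np.zeros(flat.shape, dtype=bool)
    mask[hard_idx] = True
    return flat[hard_idx].mean(), mask.reshape(region_loss.shape)
```

In a training loop, the returned scalar would stand in for the plain mean of the loss map, so that gradients come only from the selected regions. Whether the paper uses fixed tiles, a ratio, or a threshold is not stated in the abstract.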

Author information

Authors and Affiliations

University of Science and Technology of China, Hefei, China
Jin Yin, Pengfei Xia & Jingsong He

Corresponding author

Correspondence to Jingsong He.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jin Yin and Pengfei Xia contributed equally to this paper.

About this article

Cite this article

Yin, J., Xia, P. & He, J. Online Hard Region Mining for Semantic Segmentation. Neural Process Lett 50, 2665–2679 (2019). https://doi.org/10.1007/s11063-019-10047-3

Published: 14 May 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11063-019-10047-3
