Abstract
Weakly-supervised semantic segmentation with image-level labels is significant because it enables practical applications that need only lightweight manual annotation. Recent work first infers visual cues, i.e., the discriminative regions corresponding to each object category in an image, using deep convolutional classification networks, and then expands these cues to generate initial segmentation masks. Despite remarkable progress, segmentation performance remains unsatisfactory because the visual cues are incomplete, and using such low-quality cues as a prior limits how far segmentation performance can improve. To overcome this problem, we propose a novel context propagation embedding network (CPENet) to generate high-quality visual cues. CPENet learns the semantic relationship between each region and its surrounding neighbors and selectively propagates discriminative information to non-discriminative, object-related regions. Our method provides reliable initial segmentation masks for training the subsequent segmentation network that produces the final segmentation results. In addition, we refine the convolutional block attention module (CBAM) [30] to hierarchically extract more category-aware features by capturing global context information, which further promotes the propagation process. Experiments on a benchmark dataset demonstrate that the proposed method outperforms state-of-the-art methods.
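The propagation idea described above can be illustrated with a small sketch. It is not the CPENet architecture, which the abstract does not specify. It only assumes that each region has a feature vector and a seed score, and it uses a hand-written rule (cosine similarity to the four adjacent regions, gated by a threshold) in place of the learned semantic relationship. The function names `shift` and `propagate_cues` and the parameters `steps` and `threshold` are hypothetical.

```python
import numpy as np


def shift(a, dy, dx):
    """Return an array where position (y, x) holds a[y + dy, x + dx], zero outside."""
    h, w = a.shape[:2]
    pad = [(1, 1), (1, 1)] + [(0, 0)] * (a.ndim - 2)
    p = np.pad(a, pad)
    return p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]


def propagate_cues(cam, feats, steps=3, threshold=0.5):
    """Spread seed scores to similar neighbouring regions.

    cam:   (H, W) seed scores in [0, 1], e.g. a thresholded activation map.
    feats: (H, W, C) one feature vector per region.
    Assumption: similarity is plain cosine similarity, standing in for
    the learned region-to-neighbour relationship.
    """
    norm = np.linalg.norm(feats, axis=-1, keepdims=True)
    f = feats / np.maximum(norm, 1e-8)  # unit-length features
    out = cam.astype(float).copy()
    for _ in range(steps):
        new = out.copy()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            # Cosine similarity between each region and one neighbour.
            sim = np.sum(f * shift(f, dy, dx), axis=-1)
            # Selective propagation: ignore dissimilar neighbours.
            sim = np.where(sim >= threshold, sim, 0.0)
            new = np.maximum(new, sim * shift(out, dy, dx))
        out = new
    return out


if __name__ == "__main__":
    # One row of four regions: the first three share a feature, the last differs.
    feats = np.array([[[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]])
    cam = np.array([[1.0, 0.0, 0.0, 0.0]])  # only the first region is a seed
    print(propagate_cues(cam, feats))
```

In this toy input the seed score reaches the two regions that share the seed's feature and stops at the dissimilar fourth region, which is the behaviour the abstract attributes to selective propagation toward object-related regions.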






References
Ahn J, Kwak S (2018) Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation. arXiv:1803.10464
Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2018) DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848
Dai J, He K, Sun J (2015) Boxsup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 1635–1643
Deng J, Dong W, Socher R, Li JL, Li K, Li FF (2009) Imagenet: a large-scale hierarchical image database. In: Proceedings of the IEEE conference on computer vision and pattern recognition
Di L, Dai J, Jia J, He K, Jian S (2016) Scribblesup: Scribble-supervised convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition
Durand T, Mordan T, Thome N, Cord M (2017) Wildcat: Weakly supervised learning of deep convnets for image classification, pointwise localization and segmentation. In: IEEE conference on computer vision and pattern recognition (CVPR 2017), vol 2
Jun F, Liu J, Tian H, Fang Z, Lu H (2018) Dual attention network for scene segmentation. arXiv:1809.02983
Hong S, Noh H, Han B (2015) Decoupled deep neural network for semi-supervised semantic segmentation. In: Advances in neural information processing systems, pp 1495–1503
Hong S, Junhyuk O, Lee H, Han B (2016) Learning transferrable knowledge for semantic segmentation with deep convolutional neural network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3204–3212
Hong S, Yeo D, Kwak S, Lee H, Han B (2017) Weakly supervised semantic segmentation using web-crawled videos. arXiv:1701.00352
Huang Z, Wang X, Wang J, Liu W, Wang J (2018) Weakly-supervised semantic segmentation network with deep seeded region growing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 7014–7023
Kolesnikov A, Lampert CH (2016) Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: European conference on computer vision. Springer, pp 695–711
Krähenbühl P, Koltun V (2011) Efficient inference in fully connected CRFs with Gaussian edge potentials. In: Advances in neural information processing systems, pp 109–117
Lei Z, Zi H, Liu X, He X, Sun J, Zhou X (2017) Discrete multi-modal hashing with canonical views for robust mobile landmark search. IEEE Trans Multimed 19(9):2066–2079
Li K, Wu Z, Peng K-C, Ernst J, Fu Y (2018) Tell me where to look: Guided attention inference network. arXiv:1802.10171
Lu J, Xiong C, Parikh D, Socher R Knowing when to look: Adaptive attention via a visual sentinel for image captioning
Pathak D, Krahenbuhl P, Darrell T (2015) Constrained convolutional neural networks for weakly supervised segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 1796–1804
Pinheiro PO, Collobert R (2015) From image-level to pixel-level labeling with convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1713–1721
Qi X, Liu Z, Shi J, Zhao H, Jia J (2016) Augmented feedback in semantic segmentation under image level supervision. In: European conference on computer vision. Springer, pp 90–105
Saleh FS, Aliakbarian MS, Salzmann M, Petersson L, Alvarez JM, Gould S (2018) Incorporating network built-in priors in weakly-supervised semantic segmentation. IEEE Trans Pattern Anal Mach Intell 40(6):1382–1396
Saleh F, Aliakbarian MS, Salzmann M, Petersson L, Gould S, Alvarez JM (2016) Built-in foreground/background prior for weakly-supervised semantic segmentation. In: European conference on computer vision. Springer, pp 413–432
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Song X, Feng F, Han X, Xin Y, Nie L (2018) Neural compatibility modeling with attentive knowledge distillation
Song X, Feng F, Liu J, Li Z, Nie L, Ma J (2017) Neurostylist: Neural compatibility modeling for clothing matching, vol 10
Wang X, You S, Xi L, Ma H (2018) Weakly-supervised semantic segmentation by iteratively mining common object features. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1354–1362
Wei Y, Feng J, Liang X, Cheng M-M, Zhao Y, Yan S (2017) Object region mining with adversarial erasing: a simple classification to semantic segmentation approach. In: IEEE CVPR, vol 1, p 3
Wei Y, Liang X, Chen Y, Jie Z, Xiao Y, Zhao Y, Yan S (2016) Learning to segment with image-level annotations. Pattern Recogn 59:234–244
Wei Y, Liang X, Chen Y, Shen X, Cheng M-M, Feng J, Zhao Y, Yan S (2017) Stc: a simple to complex framework for weakly-supervised semantic segmentation. IEEE Trans Pattern Anal Mach Intell 39(11):2314–2320
Wei Y, Xiao H, Shi H, Jie Z, Feng J, Huang TS (2018) Revisiting dilated convolution: a simple approach for weakly-and semi-supervised semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7268–7277
Woo S, Park J, Lee J-Y, Kweon IS (2018) Cbam: Convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp 3–19
Yang Z, He X, Gao J, Deng L, Smola A Stacked attention networks for image question answering
Yuan Y, Wang J (2018) OCNet: Object context network for scene parsing. arXiv:1809.00916
Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2016) Learning deep features for discriminative localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2921–2929
Zhu L, Huang Z, Li Z, Xie L, Shen HT (2018) Exploring auxiliary context: Discrete semantic transfer hashing for scalable image retrieval. IEEE Trans Neural Netw Learn Syst 29(11):5264–5276
Zhu L, Shen J, Xie L (2017) Unsupervised visual hashing with semantic assistant for content-based image retrieval. IEEE Trans Knowl Data Eng 29(2):472–486
Acknowledgements
Zhendong Mao is the corresponding author. This work is supported by the National Natural Science Foundation of China (grant No. U19A2057).
About this article
Cite this article
Xu, Y., Mao, Z., Chen, Z. et al. Context propagation embedding network for weakly supervised semantic segmentation. Multimed Tools Appl 79, 33925–33942 (2020). https://doi.org/10.1007/s11042-020-08787-9
DOI: https://doi.org/10.1007/s11042-020-08787-9