Context propagation embedding network for weakly supervised semantic segmentation

Published in: Multimedia Tools and Applications

Abstract

Weakly supervised semantic segmentation with image-level labels is of great practical significance, since it enables segmentation applications with only lightweight manual annotation. Recent approaches first infer visual cues, i.e., the discriminative regions corresponding to each object category in an image, using deep convolutional classification networks, and then expand these cues to generate initial segmentation masks. Despite remarkable progress, segmentation performance remains unsatisfactory because the inferred visual cues are incomplete, and using such low-quality cues as priors limits further improvement. To overcome this problem, we propose a novel context propagation embedding network (CPENet) that generates high-quality visual cues: it learns the semantic relationship between each region and its surrounding neighbors and selectively propagates discriminative information to non-discriminative but object-related regions. The resulting masks provide reliable initial supervision for training a subsequent segmentation network that produces the final segmentation results. In addition, we refine the convolutional block attention module (CBAM) [30] to hierarchically extract more category-aware features by capturing global contextual information, which further promotes the propagation process. Experiments on a standard benchmark demonstrate that the proposed method outperforms state-of-the-art approaches.
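
The abstract builds on the convolutional block attention module (CBAM) of Woo et al. [30] as the attention component that is refined in this work. As background, the following is a minimal PyTorch sketch of the original CBAM (channel attention followed by spatial attention); the class and parameter names are illustrative, and the paper's refined, category-aware variant is not reproduced here.

```python
# Minimal sketch of the original CBAM of Woo et al. [30]: channel attention
# followed by spatial attention. Background only; the refined, category-aware
# variant described in the abstract is NOT shown here.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze spatial dims with average and max pooling, pass both results
    through a shared MLP, and gate the channels with a sigmoid."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))                  # (B, C)
        mx = self.mlp(x.amax(dim=(2, 3)))                   # (B, C)
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)


class SpatialAttention(nn.Module):
    """Pool over channels, concatenate the average and max maps, and predict
    a per-pixel gate with a 7x7 convolution."""

    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)                   # (B, 1, H, W)
        mx, _ = x.max(dim=1, keepdim=True)                  # (B, 1, H, W)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class CBAM(nn.Module):
    """Sequential channel-then-spatial attention, as in [30]."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.channel = ChannelAttention(channels, reduction)
        self.spatial = SpatialAttention()

    def forward(self, x):
        return self.spatial(self.channel(x))


if __name__ == "__main__":
    feats = torch.randn(2, 256, 32, 32)                     # dummy feature map
    print(CBAM(256)(feats).shape)                           # torch.Size([2, 256, 32, 32])
```

In CPENet such a module would presumably be applied to the classification backbone's feature maps before context propagation, but the exact placement and the category-aware refinement follow the paper itself, not this sketch.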

References

  1. Ahn J, Kwak S (2018) Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation. arXiv:1803.10464

  2. Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2018) Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848

  3. Dai J, He K, Sun J (2015) Boxsup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 1635–1643

  4. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: Proceedings of the IEEE conference on computer vision and pattern recognition

  5. Lin D, Dai J, Jia J, He K, Sun J (2016) Scribblesup: Scribble-supervised convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition

  6. Durand T, Mordan T, Thome N, Cord M (2017) Wildcat: Weakly supervised learning of deep convnets for image classification, pointwise localization and segmentation. In: IEEE conference on computer vision and pattern recognition (CVPR 2017), vol 2

  7. Fu J, Liu J, Tian H, Fang Z, Lu H (2018) Dual attention network for scene segmentation. arXiv:1809.02983

  8. Hong S, Noh H, Han B (2015) Decoupled deep neural network for semi-supervised semantic segmentation. In: Advances in neural information processing systems, pp 1495–1503

  9. Hong S, Oh J, Lee H, Han B (2016) Learning transferrable knowledge for semantic segmentation with deep convolutional neural network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3204–3212

  10. Hong S, Yeo D, Kwak S, Lee H, Han B (2017) Weakly supervised semantic segmentation using web-crawled videos. arXiv:1701.00352

  11. Huang Z, Wang X, Wang J, Liu W, Wang J (2018) Weakly-supervised semantic segmentation network with deep seeded region growing. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7014–7023

  12. Kolesnikov A, Lampert CH (2016) Seed, expand and constrain: Three principles for weakly-supervised image segmentation. In: European conference on computer vision. Springer, pp 695–711

  13. Krähenbühl P, Koltun V (2011) Efficient inference in fully connected CRFs with Gaussian edge potentials. In: Advances in neural information processing systems, pp 109–117

  14. Zhu L, Huang Z, Liu X, He X, Song J, Zhou X (2017) Discrete multi-modal hashing with canonical views for robust mobile landmark search. IEEE Trans Multimed 19(9):2066–2079

  15. Li K, Wu Z, Peng K-C, Ernst J, Fu Y (2018) Tell me where to look: Guided attention inference network. arXiv:1802.10171

  16. Lu J, Xiong C, Parikh D, Socher R (2017) Knowing when to look: Adaptive attention via a visual sentinel for image captioning. In: Proceedings of the IEEE conference on computer vision and pattern recognition

  17. Pathak D, Krahenbuhl P, Darrell T (2015) Constrained convolutional neural networks for weakly supervised segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 1796–1804

  18. Pinheiro PO, Collobert R (2015) From image-level to pixel-level labeling with convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1713–1721

  19. Qi X, Liu Z, Shi J, Zhao H, Jia J (2016) Augmented feedback in semantic segmentation under image level supervision. In: European conference on computer vision. Springer, pp 90–105

  20. Saleh FS, Aliakbarian MS, Salzmann M, Petersson L, Alvarez JM, Gould S (2018) Incorporating network built-in priors in weakly-supervised semantic segmentation. IEEE Trans Pattern Anal Mach Intell 40(6):1382–1396

  21. Saleh F, Aliakbarian MS, Salzmann M, Petersson L, Gould S, Alvarez JM (2016) Built-in foreground/background prior for weakly-supervised semantic segmentation. In: European conference on computer vision. Springer, pp 413–432

  22. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556

  23. Song X, Feng F, Han X, Xin Y, Nie L (2018) Neural compatibility modeling with attentive knowledge distillation. In: Proceedings of the 41st international ACM SIGIR conference on research and development in information retrieval

  24. Song X, Feng F, Liu J, Li Z, Nie L, Ma J (2017) Neurostylist: Neural compatibility modeling for clothing matching. In: Proceedings of the 25th ACM international conference on multimedia

  25. Wang X, You S, Xi L, Ma H (2018) Weakly-supervised semantic segmentation by iteratively mining common object features. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1354–1362

  26. Wei Y, Feng J, Liang X, Cheng M-M, Zhao Y, Yan S (2017) Object region mining with adversarial erasing: a simple classification to semantic segmentation approach. In: Proceedings of the IEEE conference on computer vision and pattern recognition

  27. Wei Y, Liang X, Chen Y, Jie Z, Xiao Y, Zhao Y, Yan S (2016) Learning to segment with image-level annotations. Pattern Recogn 59:234–244

  28. Wei Y, Liang X, Chen Y, Shen X, Cheng M-M, Feng J, Zhao Y, Yan S (2017) Stc: a simple to complex framework for weakly-supervised semantic segmentation. IEEE Trans Pattern Anal Mach Intell 39(11):2314–2320

  29. Wei Y, Xiao H, Shi H, Jie Z, Feng J, Huang TS (2018) Revisiting dilated convolution: a simple approach for weakly-and semi-supervised semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7268–7277

  30. Woo S, Park J, Lee J-Y, Kweon IS (2018) Cbam: Convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp 3–19

  31. Yang Z, He X, Gao J, Deng L, Smola A (2016) Stacked attention networks for image question answering. In: Proceedings of the IEEE conference on computer vision and pattern recognition

  32. Yuan Y, Wang J (2018) Ocnet: Object context network for scene parsing. arXiv:1809.00916

  33. Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2016) Learning deep features for discriminative localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2921–2929

  34. Zhu L, Huang Z, Li Z, Xie L, Shen HT (2018) Exploring auxiliary context: Discrete semantic transfer hashing for scalable image retrieval. IEEE Trans Neural Netw Learn Syst 29(11):5264–5276

  35. Zhu L, Shen J, Xie L (2017) Unsupervised visual hashing with semantic assistant for content-based image retrieval. IEEE Trans Knowl Data Eng 29(2):472–486

Acknowledgements

Zhendong Mao is the corresponding author. This work is supported by the National Natural Science Foundation of China (grant No. U19A2057).

Author information

Corresponding author

Correspondence to Zhendong Mao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xu, Y., Mao, Z., Chen, Z. et al. Context propagation embedding network for weakly supervised semantic segmentation. Multimed Tools Appl 79, 33925–33942 (2020). https://doi.org/10.1007/s11042-020-08787-9
