Improved sub-category exploration and attention hybrid network for weakly supervised semantic segmentation

  • Original Article
Neural Computing and Applications

Abstract

Because image-level labels are easy to obtain, weakly supervised semantic segmentation has advanced considerably in recent years. Most advanced methods use class activation maps (CAMs) to generate the initial localization map. However, CAMs generated by a convolutional neural network cover only the most discriminative parts of an object, which is not sufficient for segmenting the image. In this paper, we propose an effective two-stage weakly supervised semantic segmentation method called SRANet, which combines a sub-category exploration network (SEN) with a self-correlation module (SCM). It enriches the object information and adjusts the generated CAMs by applying an improved sub-category task and a second-order self-supervision mechanism. Specifically, we perform clustering on features to obtain sub-category pseudo-labels, which are used to generate high-quality CAMs. Then, we design a self-attention module to further improve the quality of the response map. Extensive experiments and comparisons with state-of-the-art methods show that the proposed SRANet achieves 71.5% and 72.8% mIoU on the PASCAL VOC 2012 training set and testing set, respectively. It also obtains 44.6% mIoU on the MS COCO 2014 dataset, the best result among the compared algorithms. These results verify the effectiveness of the proposed model.
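To make the two building blocks named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a standard CAM (classifier weights applied to the last convolutional feature map) and assumes plain k-means as the clustering step, since the abstract only says "clustering on features". The function names, the number of sub-categories `k`, and the array layouts are all hypothetical choices made for illustration.

```python
import numpy as np


def class_activation_map(features, fc_weights, class_idx):
    """Standard CAM for one class (assumed formulation).

    features:   (C, H, W) feature map from the last convolutional layer.
    fc_weights: (num_classes, C) weights of the final linear classifier.
    Returns an (H, W) map scaled to [0, 1].
    """
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    cam = np.maximum(cam, 0.0)  # keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means; returns one cluster index per row of x."""
    x = np.asarray(x, dtype=np.float64)
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels


def subcategory_pseudolabels(image_features, image_labels, k):
    """Cluster the images of each parent class into k sub-categories.

    image_features: (N, D) one pooled feature vector per image.
    image_labels:   (N, num_classes) multi-hot image-level labels.
    Returns (N, num_classes * k) multi-hot sub-category pseudo-labels,
    where columns c*k .. c*k+k-1 belong to parent class c.
    """
    n, num_classes = image_labels.shape
    sub = np.zeros((n, num_classes * k), dtype=np.int64)
    for c in range(num_classes):
        idx = np.flatnonzero(image_labels[:, c])
        if len(idx) < k:
            # Too few images to cluster: assign all to the first sub-category.
            sub[idx, c * k] = 1
            continue
        assign = kmeans(image_features[idx], k)
        sub[idx, c * k + assign] = 1
    return sub
```

In this sketch every image keeps exactly one sub-category per parent class it contains, so the pseudo-labels can supervise an additional classification head whose CAMs are then computed with `class_activation_map`.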

Data availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This study was funded by the Natural Science Foundation of Liaoning Province (No. 2020-MS-080) and the National Key Research and Development Program of China (No. 2017YFF0108800).

Author information

Corresponding author

Correspondence to Hegui Zhu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhu, H., Geng, T., Wang, J. et al. Improved sub-category exploration and attention hybrid network for weakly supervised semantic segmentation. Neural Comput & Applic 35, 10573–10587 (2023). https://doi.org/10.1007/s00521-023-08250-4

  • DOI: https://doi.org/10.1007/s00521-023-08250-4
