Abstract
Semantic segmentation is a fundamental task in computer vision that assigns each pixel of an image to one of a set of predefined categories. Despite significant advances in deep learning, obtaining accurately labeled datasets remains costly and labor-intensive. This research aims to reduce the need for extensive, precise annotations by exploring Weakly Supervised Semantic Segmentation (WSSS), which seeks accurate pixel-level classification with minimal supervision. We introduce WS-GCA, a unified framework that combines a Gaussian Mixture Model (GMM), a Label Cohesion Loss (LC Loss), and a self-attention mechanism to improve segmentation quality. WS-GCA models the distribution of weak labels as a mixture of Gaussians, fuses global and local feature information to improve prediction accuracy, applies LC Loss to improve the spatial consistency of the segmentation, and uses self-attention to make feature extraction more efficient. Experiments on the Pascal and Cityscapes datasets show that WS-GCA produces superior segmentation results from initially weak labels. The proposed framework increases the mean Intersection over Union (mIoU) by 2.2% compared to baseline models, significantly reducing category mispredictions and advancing the state of the art in the segmentation of large-area objects with minimal supervision.
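The abstract reports its gains in mean Intersection over Union (mIoU) but does not spell out how the metric is computed. The following is a minimal sketch of the standard definition, not the authors' evaluation code. The function name `mean_iou`, the flattened label lists, and the `ignore_index=255` default (the usual void label in Pascal VOC annotations) are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class ids, one per pixel.
    Pixels whose target equals ignore_index (assumed void label) are skipped.
    """
    inter = [0] * num_classes  # pixels where pred and target agree on class c
    union = [0] * num_classes  # pixels where pred or target is class c
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            # A mismatched pixel belongs to the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Average only over classes that actually appear, to avoid dividing by zero.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and six pixels (hypothetical data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

For the toy labels, the per-class IoU values are 1/3, 2/3, and 1/2, so `score` is 0.5. Benchmark evaluations typically accumulate the intersection and union counts over the whole validation set before dividing, rather than averaging per-image scores.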
Acknowledgment
This work was supported in part by the Project of Guangxi Science and Technology (No. GuiKeAB23026040), Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, China, Intelligent Processing and the Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (Nos. 20-A-01-01, MIMS21-M01 and MIMS24-02), the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and the Guangxi “Bagui” Teams for Innovation and Research, China.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, Z., Zhang, W., Song, J., Chen, B., Hu, Y., Zhang, S. (2024). WS-GCA: A Synergistic Framework for Precise Semantic Segmentation with Comprehensive Supervision. In: Zhang, W., Tung, A., Zheng, Z., Yang, Z., Wang, X., Guo, H. (eds) Web and Big Data. APWeb-WAIM 2024. Lecture Notes in Computer Science, vol 14961. Springer, Singapore. https://doi.org/10.1007/978-981-97-7232-2_29
DOI: https://doi.org/10.1007/978-981-97-7232-2_29
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-7231-5
Online ISBN: 978-981-97-7232-2
eBook Packages: Computer Science (R0)