Abstract
Though adversarial erasing has prevailed in weakly supervised semantic segmentation as a way to activate integral object regions, existing approaches still suffer from the dilemma of under-activation and over-expansion because it is difficult to determine when to stop erasing. In this paper, we propose a Knowledge Transfer with Simulated Inter-Image Erasing (KTSE) approach for weakly supervised semantic segmentation to alleviate this problem. In contrast to existing erasing-based methods that remove the discriminative part to discover more of the object, we propose a simulated inter-image erasing scenario that weakens the original activation by introducing extra object information. Object knowledge is then transferred from the anchor image to the resulting less-activated localization map to strengthen the network's localization ability. Because the adopted bidirectional alignment would also weaken the anchor image activation if appropriate constraints were missing, we propose a self-supervised regularization module to maintain reliable activation in discriminative regions and to improve inter-class object boundary recognition for complex images containing multiple object categories. In addition, we resort to intra-image erasing and propose a multi-granularity alignment module that gently enlarges the object activation to boost the object knowledge transfer. Extensive experiments and ablation studies on the PASCAL VOC 2012 and COCO datasets demonstrate the superiority of our proposed approach. Code and models are available at https://nust-machine-intelligence-laboratory.github.io/project-KTSE.
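The idea of weakening an activation by adding object information, then aligning the weakened map with the anchor map, can be illustrated with a toy sketch. This is not the KTSE implementation: the abstract does not state how images are combined or which loss is used, so the side-by-side concatenation, the max-normalization, the L1 alignment term, and every function name below are assumptions made purely for illustration. The raw maps are fixed numbers standing in for network responses.

```python
from typing import List

Map = List[List[float]]  # a 2-D localization map stored as nested lists


def max_normalize(raw: Map) -> Map:
    """Scale a non-negative map so that its largest value becomes 1."""
    peak = max(max(row) for row in raw)
    if peak <= 0:
        return [[0.0 for _ in row] for row in raw]
    return [[v / peak for v in row] for row in raw]


def concat_width(left: Map, right: Map) -> Map:
    """Place two maps of equal height side by side (assumed pairing scheme)."""
    if len(left) != len(right):
        raise ValueError("maps must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def crop_left(paired: Map, width: int) -> Map:
    """Recover the anchor part of a side-by-side map."""
    return [row[:width] for row in paired]


def l1_alignment(a: Map, b: Map) -> float:
    """Mean absolute difference between two maps of the same shape."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    count = sum(len(row) for row in a)
    return total / count


# Hypothetical raw responses: the extra image contains a stronger object cue.
anchor_raw: Map = [[0.2, 1.0], [0.1, 0.4]]
extra_raw: Map = [[2.0, 0.5], [0.3, 0.1]]

# Anchor map computed on the anchor image alone.
anchor_map = max_normalize(anchor_raw)

# "Simulated erasing": normalizing over the paired input lowers the anchor's
# activation because the extra object dominates the shared maximum.
paired_map = max_normalize(concat_width(anchor_raw, extra_raw))
weakened_map = crop_left(paired_map, width=2)

# Alignment term that a training loop could minimize to transfer knowledge
# from the anchor map to the weakened map (assumed loss form).
alignment_loss = l1_alignment(anchor_map, weakened_map)
```

In this toy setting the weakened map is never larger than the anchor map at any position, which is the sense in which extra object information plays the role of erasing; in a real network the weakening would come from learned responses rather than from a shared normalization constant.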
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 62202227 and 62102182), the Natural Science Foundation of Jiangsu Province (No. BK20220938 and BK20220934), the China Postdoctoral Science Foundation (No. 2022M711635), the Jiangsu Funding Program for Excellent Postdoctoral Talent (No. 2022ZB267), and the Fundamental Research Funds for the Central Universities (No. 30923010303).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, T., Jiang, X., Pei, G., Sun, Z., Wang, Y., Yao, Y. (2025). Knowledge Transfer with Simulated Inter-image Erasing for Weakly Supervised Semantic Segmentation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15100. Springer, Cham. https://doi.org/10.1007/978-3-031-72946-1_25
DOI: https://doi.org/10.1007/978-3-031-72946-1_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72945-4
Online ISBN: 978-3-031-72946-1
eBook Packages: Computer Science (R0)