Abstract
Annotation-scarce semantic segmentation aims to obtain meaningful pixel-level discrimination with scarce or even no manual annotations; its crux is how to exploit unlabeled data through pseudo-label learning. Typical works focus on ameliorating error-prone pseudo-labeling, e.g., by using only high-confidence pseudo labels and filtering out low-confidence ones. We take a different view and instead extract informative semantics from multiple probably correct candidate labels. This allows our method to learn more accurately even when pseudo labels are unreliable. In this paper, we propose Adaptive Fuzzy Positive Learning (A-FPL), a plug-and-play method for learning correctly from unlabeled data that adaptively encourages fuzzy positive predictions and suppresses highly probable negatives. Specifically, A-FPL comprises two main components: (1) fuzzy positive assignment (FPA), which adaptively assigns fuzzy positive labels to each pixel while ensuring their quality through a T-value adaption algorithm; and (2) fuzzy positive regularization (FPR), which restricts the predictions of fuzzy positive categories to be larger than those of negative categories. Conceptually simple yet practically effective, A-FPL remarkably alleviates interference from wrong pseudo labels and progressively refines semantic discrimination. Theoretical analysis and extensive experiments across various training settings, with consistent performance gains, justify the superiority of our approach. Code is available at A-FPL.
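To make the two components concrete, the following is a minimal per-pixel sketch in plain Python. It is not the authors' implementation: the cumulative-probability rule in `assign_fuzzy_positives` is only an illustrative stand-in for the T-value adaption algorithm, and the pairwise softplus penalty in `fuzzy_positive_regularization` is one plausible way to push fuzzy positive logits above negative ones. All function names and the `mass` and `max_t` parameters are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def assign_fuzzy_positives(probs, mass=0.9, max_t=None):
    """Pick the top-T classes as fuzzy positives for one pixel.

    Assumption: T is the smallest count whose cumulative probability
    reaches `mass`. This is a stand-in for the paper's T-value adaption,
    whose details are not given in the abstract.
    """
    order = sorted(range(len(probs)), key=lambda k: probs[k], reverse=True)
    # Always leave at least one class as a negative.
    limit = len(probs) - 1 if max_t is None else min(max_t, len(probs) - 1)
    chosen, covered = [], 0.0
    for k in order[:limit]:
        chosen.append(k)
        covered += probs[k]
        if covered >= mass:
            break
    return set(chosen)


def fuzzy_positive_regularization(logits, positives):
    """Penalty log(1 + sum over (p, n) of exp(s_n - s_p)).

    The value shrinks toward zero as every fuzzy positive logit s_p
    rises above every negative logit s_n (assumed form, not the paper's).
    """
    pos = [logits[k] for k in positives]
    neg = [logits[k] for k in range(len(logits)) if k not in positives]
    if not pos or not neg:
        return 0.0
    terms = [n - p for p in pos for n in neg]
    m = max(0.0, max(terms))  # shift for a stable log-sum-exp
    return m + math.log(math.exp(-m) + sum(math.exp(t - m) for t in terms))


pixel_logits = [2.0, 1.5, -1.0, -3.0]
fuzzy_set = assign_fuzzy_positives(softmax(pixel_logits), mass=0.9)
penalty = fuzzy_positive_regularization(pixel_logits, fuzzy_set)
```

In this sketch an ambiguous pixel keeps several candidate classes as positives instead of committing to a single pseudo label, and the penalty only asks that the candidates outrank the remaining classes.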
Funding
This work was supported in part by the National Key R&D Program of China (No. 2022ZD0118201), the Natural Science Foundation of China (Nos. 61972217, 32071459, 62176249, 62006133, 62271465), the Shenzhen Medical Research Funds in China (No. B2302037), and the AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China.
Additional information
Communicated by Ming-Hsuan Yang.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Qiao, P., Wang, Y., Liu, C. et al. Adaptive Fuzzy Positive Learning for Annotation-Scarce Semantic Segmentation. Int J Comput Vis 133, 1048–1066 (2025). https://doi.org/10.1007/s11263-024-02217-1
DOI: https://doi.org/10.1007/s11263-024-02217-1