
Crots: Cross-Domain Teacher–Student Learning for Source-Free Domain Adaptive Semantic Segmentation

  • Published in: International Journal of Computer Vision

Abstract

Source-free domain adaptation (SFDA) aims to transfer knowledge from pre-trained source models to the target domain without accessing private source data. Existing SFDA methods typically adopt a self-training strategy, employing the pre-trained source model to generate pseudo-labels for unlabeled target data. However, these methods are subject to two strict limitations: (1) the discrepancy between the source and target domains yields severely noisy, unreliable pseudo-labels, and overfitting to noisy pseudo-labeled target data leads to drastic performance degradation; (2) given class-imbalanced pseudo-labels, the target model is prone to forgetting the minority classes. To address these two limitations, this study proposes a CROss-domain Teacher–Student learning framework (CROTS) for source-free domain adaptive semantic segmentation. Specifically, with pseudo-labels provided by the intra-domain teacher model, CROTS incorporates Spatial-Aware Data Mixing to generate diverse samples by randomly mixing different patches while respecting their spatial semantic layouts. Meanwhile, during inter-domain teacher–student learning, CROTS adopts a Rare-Class Patches Mining strategy to mitigate class imbalance: the inter-domain teacher model helps exploit long-tailed rare classes and promotes their contributions to student learning. Extensive experimental results demonstrate that: (1) CROTS mitigates the overfitting issue and yields stable performance improvements, i.e., +16.0% mIoU and +16.5% mIoU for SFDA on GTA5→Cityscapes and SYNTHIA→Cityscapes, respectively; (2) CROTS improves performance on long-tailed rare classes, alleviating the class imbalance issue; (3) CROTS achieves superior performance compared with other SFDA competitors; (4) CROTS can be applied under the black-box SFDA setting, even outperforming many white-box SFDA methods.
Our code will be made publicly available at https://github.com/luoxin13/CROTS.
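The Spatial-Aware Data Mixing idea described in the abstract — combining patches from two images at the same spatial position, so that each patch keeps its spatial semantic layout (sky patches stay at the top, road patches at the bottom) — can be sketched as follows. This is an illustrative sketch only, assuming NumPy arrays; the function name `spatial_aware_mix`, the grid parameterization, and the 0.5 swap probability are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def spatial_aware_mix(img_a, img_b, lbl_a, lbl_b, patch_grid=(2, 2), rng=None):
    """Mix two images by swapping randomly chosen grid patches at the SAME
    spatial location, so every patch retains its original spatial layout.
    Pseudo-label maps are mixed with the identical patch pattern."""
    rng = np.random.default_rng() if rng is None else rng
    mixed_img, mixed_lbl = img_a.copy(), lbl_a.copy()
    h, w = img_a.shape[:2]
    ph, pw = h // patch_grid[0], w // patch_grid[1]
    for i in range(patch_grid[0]):
        for j in range(patch_grid[1]):
            if rng.random() < 0.5:  # take this patch from image B instead of A
                ys = slice(i * ph, (i + 1) * ph)
                xs = slice(j * pw, (j + 1) * pw)
                mixed_img[ys, xs] = img_b[ys, xs]
                mixed_lbl[ys, xs] = lbl_b[ys, xs]
    return mixed_img, mixed_lbl
```

Because patches are exchanged only between identical grid cells, the mixed sample never places (for example) a road patch in the sky region, which is the layout constraint the abstract's "spatial semantic layouts" phrase refers to.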


Availability of data and materials

The datasets used in this manuscript are publicly available from their official websites.


Acknowledgements

We extend our gratitude to Mengzhu Wang and Xifeng Guo, who have given us valuable suggestions to improve our manuscript.

Funding

This work was supported by the Natural Science Foundation of Hunan Province of China (No. 2022JJ30666), the Independent and Open Subject Fund from State Key Laboratory of High Performance Computing, National University of Defense Technology (No. 202101-10), and the National Key Technologies Research and Development Program of China (No. 2018YFB0204301).

Author information

Authors and Affiliations

Authors

Contributions

XL and WC made substantial contributions to the conception or design of the work. ZL, LY, SW, SW, and CL made contributions to the acquisition, analysis, or interpretation of data. All the authors drafted the work or revised it critically. All the authors approved the version to be published.

Corresponding author

Correspondence to Wei Chen.

Ethics declarations

Conflicts of interest

The authors have no financial or proprietary interests in any material discussed in this article.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

The authors confirm that: (1) the work described has not been published before; (2) the manuscript is not under consideration for publication elsewhere; (3) the publication has been approved by all co-authors; (4) the publication has been approved by the responsible authorities at the institution where the work was carried out.

Code availability

The code used in this manuscript will be made publicly available.

Additional information

Communicated by Ming-Hsuan Yang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Luo, X., Chen, W., Liang, Z. et al. Crots: Cross-Domain Teacher–Student Learning for Source-Free Domain Adaptive Semantic Segmentation. Int J Comput Vis 132, 20–39 (2024). https://doi.org/10.1007/s11263-023-01863-1

