Skip to main content
Log in

Discovering latent target subdomains for domain adaptive semantic segmentation via style clustering

  • Published:
Multimedia Tools and Applications Aims and scope Submit manuscript

Abstract

Despite the great progress in unsupervised domain adaptation for semantic segmentation, most previous methods solely consider reducing the inter-domain gap caused by the distribution discrepancy between the source and target domain while not considering the sizeable intra-domain gap among the target domain itself due to the discrepancy among the target data. General intra-domain adaptation methods separate the target data into two splits based on how easily a sample can be segmented, which may not effectively capture the distributions within the target domain. In this paper, based on the observation that there exist diverse styles in the target samples, we propose a style clustering-based unsupervised domain adaptation method to separate the target data into subdomains iteratively. Since the target subdomain labels are unknown, we exploit multi-channel soft labels for adversarial training to close the intra-domain gap among these subdomains. In comparison with general intra-domain adaptation methods, our method can capture the latent distributions within the target data more sufficiently to close the intra-domain gap more effectively. The experiments of unsupervised domain adaptive segmentation tasks on benchmark datasets are conducted and the experimental results show the effectiveness of our method.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Similar content being viewed by others

Data Availability

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

References

  1. Bakkouri I, Afdel K (2020) Computer-aided diagnosis (cad) system based on multi-layer feature fusion network for skin lesion recognition in dermoscopy images. Multimed Tools Appl 79(29):20483–20518

    Article  Google Scholar 

  2. Bakkouri I, Afdel K, Benois-pineau J et al (2022) bg-3dm2f: Bidirectional gated 3d multi-scale feature fusion for alzheimer’s disease diagnosis. Multimed Tools Appl 81(8):10743–10776

    Article  Google Scholar 

  3. Ben-David S, Blitzer J, Crammer K, Pereira F (2006) Analysis of representations for domain adaptation. Advances in neural information processing systems, p 19

  4. Berahmand K, Mohammadi M, Faroughi A, Mohammadiani RP (2022) A novel method of spectral clustering in attributed networks by constructing parameter-free affinity matrix. Clust Comput 25(2):869–888

    Article  Google Scholar 

  5. Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: Proceedings of COMPSTAT’2010, pp 177–186. Springer

  6. Chen Y-C, Lin Y-Y, Yang M-H, Huang J-B (2019) Crdoco: Pixel-level domain transfer with cross-domain consistency. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 1791–1800

  7. Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2014) Semantic image segmentation with deep convolutional nets and fully connected crfs. arXiv:1412.7062

  8. Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2017) Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848

    Article  Google Scholar 

  9. Chen M, Xue H, Cai D (2019) Domain adaptation for semantic segmentation with maximum squares loss. In: Proceedings of the IEEE/CVF International conference on computer vision, pp 2090–2099

  10. Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 801–818

  11. Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 3213–3223

  12. Dash AK, Mohapatra P (2022) A fine-tuned deep convolutional neural network for chest radiography image classification on covid-19 cases. Multimed Tools Appl 81(1):1055–1075

    Article  Google Scholar 

  13. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE

  14. Du L, Tan J, Yang H, Feng J, Xue X, Zheng Q, Ye X, Zhang X (2019) Ssf-dan: Separated semantic feature based domain adaptation network for semantic segmentation. In: Proceedings of the IEEE/CVF International conference on computer vision, pp 982–991

  15. Everingham M, Eslami SA, Van Gool L, Williams CK, Winn J, Zisserman A (2015) The pascal visual object classes challenge: a retrospective. Int J Computer Vis 111(1):98–136

    Article  Google Scholar 

  16. Gatys LA, Ecker AS, Bethge M (2016) Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 2414–2423

  17. Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? the kitti vision benchmark suite. In: 2012 IEEE Conference on computer vision and pattern recognition, pp 3354–3361. IEEE

  18. Gong R, Li W, Chen Y, Gool LV (2019) Dlow: Domain flow for adaptation and generalization. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 2477–2486

  19. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. Advances in neural information processing systems, p 27

  20. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 770–778

  21. Hoffman J, Tzeng E, Park T, Zhu J-Y, Isola P, Saenko K, Efros A, Darrell T (2018) Cycada: Cycle-consistent adversarial domain adaptation. In: International conference on machine learning, pp 1989–1998. PMLR

  22. Hoffman J, Wang D, Yu F, Darrell T (2016) Fcns in the wild:, Pixel-level adversarial and constraint-based adaptation. arXiv:1612.02649

  23. Huang X, Belongie S (2017) Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of the IEEE International conference on computer vision, pp 1501–1510

  24. Kim M, Byun H (2020) Learning texture invariant representation for domain adaptation of semantic segmentation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 12975–12984

  25. Kim M, Joung S, Kim S, Park J, Kim I-J, Sohn K (2020) Cross-domain grouping and alignment for domain adaptive semantic segmentation. arXiv:2012.08226

  26. Kingma DP, Ba J (2014) Adam:, A method for stochastic optimization. arXiv:1412.6980

  27. Kundu R, Singh PK, Ferrara M, Ahmadian A, Sarkar R (2022) Et-net: an ensemble of transfer learning models for prediction of covid-19 infection through chest ct-scan images. Multimed Tools Appl 81(1):31–50

    Article  Google Scholar 

  28. Lee C-Y, Batra T, Baig MH, Ulbricht D (2019) Sliced wasserstein discrepancy for unsupervised domain adaptation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 10285–10295

  29. Lee S, Hyun J, Seong H, Kim E (2020) Unsupervised domain adaptation for semantic segmentation by content transfer. arXiv:2012.12545

  30. Lee S, Kim J, Oh T-H, Jeong Y, Yoo D, Lin S, Kweon IS (2019) Visuomotor understanding for representation learning of driving scenes. arXiv:1909.06979

  31. Lee D-H et al (2013) Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In: Workshop on challenges in representation learning, ICML, vol 3, p 896

  32. Li G, Kang G, Liu W, Wei Y, Yang Y (2020) Content-consistent matching for domain adaptive semantic segmentation. In: European conference on computer vision, pp 440–456. Springer

  33. Li Y, Wang N, Liu J, Hou X (2017) Demystifying neural style transfer. arXiv:1701.01036

  34. Li Y, Yuan L, Vasconcelos N (2019) Bidirectional learning for domain adaptation of semantic segmentation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 6936–6945

  35. Lian Q, Lv F, Duan L, Gong B (2019) Constructing self-motivated pyramid curriculums for cross-domain semantic segmentation: a non-adversarial approach. In: Proceedings of the IEEE/CVF International conference on computer vision, pp 6758–6767

  36. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 3431–3440

  37. Luc P, Neverova N, Couprie C, Verbeek J, LeCun Y (2017) Predicting deeper into the future of semantic segmentation. In: Proceedings of the IEEE International conference on computer vision, pp 648–657

  38. Luo Y, Liu P, Guan T, Yu J, Yang Y (2019) Significance-aware information bottleneck for domain adaptive semantic segmentation. In: Proceedings of the IEEE/CVF International conference on computer vision, pp 6778–6787

  39. Luo Y, Zheng L, Guan T, Yu J, Yang Y (2019) Taking a closer look at domain shift: Category-level adversaries for semantics consistent domain adaptation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 2507–2516

  40. Maas AL, Hannun AY, Ng AY et al (2013) Rectifier nonlinearities improve neural network acoustic models. In: Proc. Icml, vol 30, p 3. Citeseer

  41. MacQueen J et al (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, vol 1, pp 281–297. Oakland, CA, USA

  42. Mancini M, Porzi L, Bulo SR, Caputo B, Ricci E (2018) Boosting domain adaptation by discovering latent domains. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 3771–3780

  43. Maria Carlucci F, Porzi L, Caputo B, Ricci E, Rota Bulo S (2017) Autodial: Automatic domain alignment layers. In: Proceedings of the IEEE International conference on computer vision, pp 5067–5075

  44. Matsuura T, Harada T (2020) Domain generalization using a mixture of multiple latent domains. In: Proceedings of the AAAI Conference on artificial intelligence, vol 34, pp 11749–11756

  45. Murez Z, Kolouri S, Kriegman D, Ramamoorthi R, Kim K (2018) Image to image translation for domain adaptation. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 4500–4509

  46. Musto L, Zinelli A (2020) Semantically adaptive image-to-image translation for domain adaptation of semantic segmentation. arXiv:2009.01166

  47. Pan F, Shin I, Rameau F, Lee S, Kweon IS (2020) Unsupervised intra-domain adaptation for semantic segmentation through self-supervision. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 3764–3773

  48. Paszke A, Gross S, Chintala S, Chanan G, Yang E, DeVito Z, Lin Z, Desmaison A, Antiga L, Lerer A (2017) Automatic differentiation in pytorch

  49. Richter SR, Vineet V, Roth S, Koltun V (2016) Playing for data: Ground truth from computer games. In: European conference on computer vision, pp 102–118. Springer

  50. Ros G, Sellart L, Materzynska J, Vazquez D, Lopez AM (2016) The synthia dataset: a large collection of synthetic images for semantic segmentation of urban scenes. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 3234–3243

  51. Rostami M, Berahmand K, Nasiri E, Forouzandeh S (2021) Review of swarm intelligence-based feature selection methods. Eng Appl Artif Intell 100:104210

    Article  Google Scholar 

  52. Rostami M, Forouzandeh S, Berahmand K, Soltani M, Shahsavari M, Oussalah M (2022) Gene selection for microarray data classification via multi-objective graph theoretic-based method. Artif Intell Med 123:102228

    Article  Google Scholar 

  53. Saito K, Watanabe K, Ushiku Y, Harada T (2018) Maximum classifier discrepancy for unsupervised domain adaptation. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 3723–3732

  54. Sankaranarayanan S, Balaji Y, Jain A, Lim SN, Chellappa R (2018) Learning from synthetic data: Addressing domain shift for semantic segmentation. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 3752–3761

  55. Tsai Y-H, Hung W-C, Schulter S, Sohn K, Yang M-H, Chandraker M (2018) Learning to adapt structured output space for semantic segmentation. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 7472–7481

  56. Tsai Y-H, Shen X, Lin Z, Sunkavalli K, Lu X, Yang M-H (2017) Deep image harmonization. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 3789–3797

  57. Tsai Y-H, Sohn K, Schulter S, Chandraker M (2019) Domain adaptation for structured output via discriminative patch representations. In: Proceedings of the IEEE/CVF International conference on computer vision, pp 1456–1465

  58. Van der Maaten L, Hinton G (2008) Visualizing data using t-sne. Journal of machine learning research 9(11)

  59. Vu T-H, Jain H, Bucher M, Cord M, Pérez P (2019) Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 2517–2526

  60. Wang H, Shen T, Zhang W, Duan L-Y, Mei T (2020) Classes matter: A fine-grained adversarial approach to cross-domain semantic segmentation. In: European conference on computer vision, pp 642–659. Springer

  61. Wang Z, Yu M, Wei Y, Feris R, Xiong J, Hwu W-M, Huang TS, Shi H (2020) Differential treatment for stuff and things: A simple unsupervised domain adaptation method for semantic segmentation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 12635–12644

  62. Wrenninge M, Unger J (2018) Synscapes:, A photorealistic synthetic dataset for street scene parsing. arXiv:1810.08705

  63. Wu Z, Han X, Lin Y-L, Uzunbas MG, Goldstein T, Lim SN, Davis LS (2018) Dcan: Dual channel-wise alignment networks for unsupervised scene adaptation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 518–534

  64. Yang Y, Soatto S (2020) Fda: Fourier domain adaptation for semantic segmentation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 4085–4095

  65. Yu F, Koltun V (2015) Multi-scale context aggregation by dilated convolutions. arXiv:1511.07122

  66. Zhang Y, David P, Gong B (2017) Curriculum domain adaptation for semantic segmentation of urban scenes. In: Proceedings of the IEEE International conference on computer vision, pp 2020–2030

  67. Zhang Y, Qiu Z, Yao T, Ngo C-W, Liu D, Mei T (2020) Transferring and regularizing prediction for semantic segmentation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 9621–9630

  68. Zhang Q, Zhang J, Liu W, Tao D (2019) Category anchor-guided unsupervised domain adaptation for semantic segmentation. arXiv:1910.13049

  69. Zhao A, Balakrishnan G, Durand F, Guttag JV, Dalca AV (2019) Data augmentation using learned transformations for one-shot medical image segmentation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 8543–8553

  70. Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 2881–2890

  71. Zheng Z, Yang Y (2019) Unsupervised scene adaptation with memory regularization in vivo. arXiv:1912.11164

  72. Zheng Z, Yang Y (2021) Rectifying pseudo label learning via uncertainty estimation for domain adaptive semantic segmentation. Int J Comput Vis 129(4):1106–1120

    Article  Google Scholar 

  73. Zhu J-Y, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE International conference on computer vision, pp 2223–2232

  74. Zou Y, Yu Z, Kumar B, Wang J (2018) Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 289–305

  75. Zou Y, Yu Z, Liu X, Kumar B, Wang J (2019) Confidence regularized self-training. In: Proceedings of the IEEE/CVF International conference on computer vision, pp 5982–5991

Download references

Acknowledgements

This work is supported by the National Key Research and Development Program of China (No. 2020YFA0714103), the Innovation Capacity Construction Project of Jilin Province Development and Reform Commission(2021FGWCXNLJSSZ10), the Science & Technology Development Project of Jilin Province China (20190302117GX) and the Fundamental Research Funds for the Central Universities, JLU.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Shengsheng Wang.

Ethics declarations

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Li, A., Wang, S., Zhao, X. et al. Discovering latent target subdomains for domain adaptive semantic segmentation via style clustering. Multimed Tools Appl 83, 7785–7809 (2024). https://doi.org/10.1007/s11042-023-15620-6

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11042-023-15620-6

Keywords

Navigation