Abstract
Despite great progress in unsupervised domain adaptation for semantic segmentation, most previous methods focus solely on reducing the inter-domain gap caused by the distribution discrepancy between the source and target domains, and overlook the sizeable intra-domain gap that arises from discrepancies among the target data themselves. General intra-domain adaptation methods separate the target data into two splits according to how easily a sample can be segmented, which may not effectively capture the distributions within the target domain. In this paper, based on the observation that target samples exhibit diverse styles, we propose a style clustering-based unsupervised domain adaptation method that iteratively separates the target data into subdomains. Since the target subdomain labels are unknown, we exploit multi-channel soft labels for adversarial training to close the intra-domain gap among these subdomains. Compared with general intra-domain adaptation methods, our method captures the latent distributions within the target data more fully and therefore closes the intra-domain gap more effectively. Experiments on unsupervised domain adaptive segmentation tasks on benchmark datasets demonstrate the effectiveness of our method.
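To make the idea of style clustering with soft subdomain labels more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the style of a target sample can be summarized by the channel-wise mean and standard deviation of a feature map, that subdomains are found with plain k-means (with a deterministic farthest-point initialization chosen here for simplicity), and that a soft label is a softmax over negative distances to the cluster centroids. All function names, the `temperature` parameter, and the synthetic data are hypothetical.

```python
import numpy as np


def style_vector(feat):
    """Summarize a (C, H, W) feature map by its channel-wise mean and std.

    Assumption: these first- and second-order statistics stand in for "style".
    """
    c = feat.shape[0]
    flat = feat.reshape(c, -1)
    return np.concatenate([flat.mean(axis=1), flat.std(axis=1)])


def pairwise_dist(x, centroids):
    """Euclidean distance from every row of x to every centroid: shape (N, K)."""
    return np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)


def kmeans(x, k, iters=20):
    """Plain k-means with farthest-point initialization (an illustrative choice)."""
    centroids = [x[0]]
    for _ in range(1, k):
        # Pick the sample farthest from all centroids chosen so far.
        d = pairwise_dist(x, np.array(centroids)).min(axis=1)
        centroids.append(x[d.argmax()])
    centroids = np.array(centroids, dtype=float)

    for _ in range(iters):
        assign = pairwise_dist(x, centroids).argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def soft_subdomain_labels(x, centroids, temperature=1.0):
    """Softmax over negative centroid distances: one K-channel soft label per sample."""
    logits = -pairwise_dist(x, centroids) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


# Synthetic stand-in for target-domain feature maps with two distinct "styles".
rng = np.random.default_rng(0)
feats = np.concatenate([
    rng.normal(0.0, 1.0, size=(8, 4, 8, 8)),
    rng.normal(3.0, 2.0, size=(8, 4, 8, 8)),
])
styles = np.stack([style_vector(f) for f in feats])
centroids = kmeans(styles, k=2)
labels = soft_subdomain_labels(styles, centroids)
```

In an adversarial setup, each row of `labels` could serve as the target for a discriminator with one output channel per subdomain. This is one reading of "multi-channel soft labels" and is stated here as an assumption.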
Data Availability
The data that support the findings of this study are available from the corresponding author, upon reasonable request.
Acknowledgements
This work is supported by the National Key Research and Development Program of China (No. 2020YFA0714103), the Innovation Capacity Construction Project of Jilin Province Development and Reform Commission (2021FGWCXNLJSSZ10), the Science & Technology Development Project of Jilin Province, China (20190302117GX), and the Fundamental Research Funds for the Central Universities, JLU.
Ethics declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Li, A., Wang, S., Zhao, X. et al. Discovering latent target subdomains for domain adaptive semantic segmentation via style clustering. Multimed Tools Appl 83, 7785–7809 (2024). https://doi.org/10.1007/s11042-023-15620-6
DOI: https://doi.org/10.1007/s11042-023-15620-6