
Unsupervised Domain Adaptation in the Wild via Disentangling Representation Learning

International Journal of Computer Vision

Abstract

Most recently proposed unsupervised domain adaptation algorithms attempt to learn domain-invariant features by confusing a domain classifier through adversarial training. In this paper, we argue that this may not be an optimal solution in the real-world setting (a.k.a. in the wild), as the difference in label information between domains has been largely ignored. Because labeled instances are unavailable in the target domain in unsupervised domain adaptation tasks, it is difficult to explicitly capture the label difference between domains. To address this issue, we propose to learn a disentangled latent representation based on implicit autoencoders. In particular, the latent representation is disentangled into a global code and a local code. The global code captures category information via an encoder with a prior, while the local code captures "style"-related information via an implicit decoder and is transferable across domains. Experimental results on digit recognition, object recognition and semantic segmentation demonstrate the effectiveness of our proposed method.
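The global/local split described above can be sketched schematically. The toy NumPy model below is an illustration only: the dimensions, linear maps, and class names are our assumptions, not the authors' architecture, and the adversarial matching of the global code to its categorical prior is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- chosen for illustration, not from the paper.
IMG_DIM, GLOBAL_DIM, LOCAL_DIM = 784, 10, 16

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class DisentangledAE:
    """Schematic autoencoder: the encoder splits an input into a global
    (category-like) code and a local (style-like) code; the decoder
    reconstructs from their concatenation."""

    def __init__(self):
        self.W_g = rng.normal(0, 0.01, (IMG_DIM, GLOBAL_DIM))
        self.W_l = rng.normal(0, 0.01, (IMG_DIM, LOCAL_DIM))
        self.W_d = rng.normal(0, 0.01, (GLOBAL_DIM + LOCAL_DIM, IMG_DIM))

    def encode(self, x):
        g = softmax(x @ self.W_g)  # global code: categorical posterior
        l = x @ self.W_l           # local code: continuous style factors
        return g, l

    def decode(self, g, l):
        return np.tanh(np.concatenate([g, l], axis=-1) @ self.W_d)

model = DisentangledAE()
x = rng.normal(size=(4, IMG_DIM))          # a batch of 4 flattened images
g, l = model.encode(x)
recon = model.decode(g, l)
```

In the full method, the global code would additionally be pushed toward a categorical prior (e.g. via an adversarial or implicit-autoencoder objective), so that it carries label information while the local code absorbs domain-specific style.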


Notes

  1. We omit other baseline methods under this setting, as they can be categorized into the aforementioned baselines and achieve poorer performance.

  2. We adopt LeNet as the backbone network, which is the standard benchmark architecture for the MNIST and USPS datasets.


Acknowledgements

The research work was done at the Rapid-Rich Object Search (ROSE) Lab, Nanyang Technological University. This research is supported in part by the Wallenberg-NTU Presidential Postdoctoral Fellowship, the NTU-PKU Joint Research Institute, a collaboration between the Nanyang Technological University and Peking University that is sponsored by a donation from the Ng Teng Fong Charitable Foundation, and the Science and Technology Foundation of Guangzhou Huangpu Development District under Grant 201902010028.

Author information


Corresponding author

Correspondence to Haoliang Li.

Additional information

Communicated by Mei Chen, Cha Zhang and Katsushi Ikeuchi.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Li, H., Wan, R., Wang, S. et al. Unsupervised Domain Adaptation in the Wild via Disentangling Representation Learning. Int J Comput Vis 129, 267–283 (2021). https://doi.org/10.1007/s11263-020-01364-5

