TarGAN: Generating target data with class labels for unsupervised domain adaptation
Introduction
In recent years, deep neural networks have achieved great success in diverse machine learning tasks. However, these performance leaps rely heavily on massive amounts of labeled data. Since large-scale data annotation is usually prohibitively expensive, domain adaptation is becoming increasingly attractive: it aims to reduce the labeling cost of a classification task that lacks labeled data by leveraging off-the-shelf annotated data from a different but related source domain [1], [2].
In general, domain adaptation includes supervised adaptation, where a small amount of labeled target data is available for training, and unsupervised adaptation, where no labeled target data exist. In this paper, we focus on the latter, more challenging setting (i.e., unsupervised domain adaptation). Given a sufficiently labeled source domain and an unlabeled target domain, the goal of unsupervised domain adaptation is to learn a classifier that alleviates the domain shift and generalizes well to the target domain. One of the main approaches to domain adaptation is to bridge the source and target domains by learning a feature space that reduces their distribution discrepancy [3], [4], [5], [6]. In particular, deep models have been widely explored for domain adaptation in recent years [7], [8], [9]. By reducing the domain discrepancy in the task-specific layers of deep neural networks, “deep” features that are domain-invariant can be obtained [7], [8]. However, domain invariance does not necessarily induce discriminative representations for the target domain, owing to the lack of labeled target data. Although progress can be made by simultaneously learning “deep” features and inferring the labels of the target data during training [10], [11], issues remain because incorrect inferences degrade the final classification accuracy.
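The discrepancy reduction of [7] (deep domain confusion, DDC) penalizes the maximum mean discrepancy (MMD) between source and target activations at a task-specific layer. A minimal sketch of the linear-kernel form of such a penalty, written on plain Python lists rather than framework tensors for illustration:

```python
def mmd_linear(source_feats, target_feats):
    """Squared Euclidean distance between the mean source and mean
    target feature vectors (a linear-kernel MMD estimate).

    Each argument is a list of equal-length feature vectors, assumed
    to be activations taken from a task-specific network layer.
    """
    dim = len(source_feats[0])
    mean_s = [sum(f[d] for f in source_feats) / len(source_feats) for d in range(dim)]
    mean_t = [sum(f[d] for f in target_feats) / len(target_feats) for d in range(dim)]
    return sum((ms - mt) ** 2 for ms, mt in zip(mean_s, mean_t))
```

Minimizing this term alongside the source classification loss pushes the two feature distributions toward matching first moments; richer kernels match higher-order statistics as well.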
In this work, we propose a new unsupervised domain adaptation method named TarGAN, which can generate Target data with given class labels based on Generative Adversarial Networks (GANs) [12], in order to improve the classifier’s accuracy on the target domain. TarGAN rests on the hypothesis that the source and target domains differ in their low-level details while sharing the same high-level abstraction. Fortunately, this holds for most domain adaptation tasks.
Our model consists of one classifier network shared by both the source and the target domains, and a pair of GANs, one for each domain. Specifically, we adopt deep domain confusion (DDC) [7] as the baseline, which forces the deep model to learn domain-invariant representations while maintaining good predictive ability on the source data. To generate images with given class labels, we supply both generators with categorical latent factors and encourage high mutual information between the categorical code and the generated data by tasking the classifier with reconstructing the categorical code of the generated samples. As demonstrated in [13], this forces the source generator to decode the categorical latent code as class semantics. Therefore, the class labels of the generated source images can be reliably controlled through the categorical code. However, due to the lack of annotations in the target domain, the target generator cannot be guided to decode the code as class semantics. As a result, the categorical code of the target generator might be inherently unordered, or might represent other variation factors instead of the class labels.
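The mutual-information constraint can be implemented, as in [13] (InfoGAN), by asking the classifier (or an auxiliary head) to reconstruct the categorical code from a generated sample and penalizing the cross-entropy. A hedged sketch in plain Python — the uniform prior and one-hot encoding are assumptions about the latent design, not details taken from the paper:

```python
import math
import random

def sample_categorical(num_classes):
    """Draw a one-hot categorical latent code c ~ Uniform{0, ..., K-1}."""
    k = random.randrange(num_classes)
    return [1.0 if i == k else 0.0 for i in range(num_classes)]

def code_reconstruction_loss(code, predicted_probs):
    """Cross-entropy -log q(c | G(z, c)) for a one-hot code.

    Minimizing this over the generator and the reconstruction head
    maximizes a variational lower bound on the mutual information
    I(c; G(z, c)), so the generator is pushed to keep the code
    recoverable from its output.
    """
    return -sum(c * math.log(p) for c, p in zip(code, predicted_probs) if c > 0)
```

A perfectly confident, correct reconstruction drives the loss to zero; spreading probability mass away from the true code raises it.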
To solve the problem indicated above, TarGAN ties the first few layers of the two generators, which are responsible for decoding high-level semantics. This architecture makes the generators of both domains process the high-level representations in an identical fashion. As a result, the target generator shares the decoding ability of the source generator and decodes the categorical latent code as class semantics. Subsequently, the last few layers of each generator, which are responsible for decoding low-level details, process the shared representation to generate samples in their respective domains. Concurrently, the high mutual information constraint between the categorical code and the generated target samples prevents the class semantics encoded in the categorical code from being lost during generation. Through the cooperation of the high mutual information constraint and the weight-sharing mechanism, our model can successfully disentangle the class and the style codes of the target generator in the absence of labeled target data. Note that if the source and target domains indeed share high-level semantics, the synthesized target data are guaranteed to carry correct labels. With the labeled target data generated by our framework, the classifier obtains improved discriminative power over the target domain. Extensive experiments on several standard domain adaptation benchmarks demonstrate the effectiveness of our method.
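The weight-tying described above can be sketched as simple function composition: a shared decoder maps the latent code to a high-level representation, and two domain-specific heads render low-level details. The layer functions here are hypothetical stand-ins for actual network layers:

```python
def make_generators(shared_decoder, source_head, target_head):
    """Build two generators that tie their early (high-level) layers.

    `shared_decoder` stands in for the tied first few layers that
    decode the categorical code into class semantics; each `*_head`
    stands in for the untied last few layers that render
    domain-specific low-level details.
    """
    def g_source(latent):
        return source_head(shared_decoder(latent))

    def g_target(latent):
        return target_head(shared_decoder(latent))

    return g_source, g_target
```

Because both generators pass through the same `shared_decoder`, any class semantics that source supervision forces into those layers are automatically inherited by the target generator.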
Overall, the main contributions of our work are three-fold:
- Our model can successfully disentangle the class and the style codes of the target generator in the absence of labeled target data, by tying the high-level layers of the source and target generators while enforcing high mutual information between the class code and the generated images;
- TarGAN is able to generate target data with given class labels, which effectively boosts the classifier’s accuracy on the target domain;
- We achieve state-of-the-art results on several standard benchmarks.
This paper is organized as follows. Section 2 reviews related work on both domain adaptation and generative adversarial networks. Section 3 presents TarGAN for unsupervised domain adaptation. Section 4 reports our experimental results on several standard benchmarks. Finally, Section 5 concludes the paper.
Section snippets
Unsupervised domain adaptation
A large number of unsupervised domain adaptation methods have been proposed. The central concern in domain adaptation is the distribution discrepancy between domains, which causes a classifier trained on source data to fail on the target domain. Before the “deep” era, most domain adaptation methods attempted to alleviate this by learning a shallow representation in which the domain shift can be explicitly reduced [3], [4], [6].
The recent advances have
Proposed method: TarGAN
In unsupervised domain adaptation, we are provided with a labeled source dataset accompanied by the corresponding labels, as well as an unlabeled target dataset. The source and target domains, characterized by their respective probability distributions, are different but related. The goal of domain adaptation is to learn a classifier with good generalization performance on the target domain.
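Putting the components described earlier together, the overall training signal combines the supervised source classification loss, a domain-confusion penalty, and the categorical-code reconstruction terms for both generators. The weighted-sum form and the scalar weights below are illustrative assumptions, not the paper’s published objective or hyperparameters:

```python
def targan_objective(cls_loss_src, domain_confusion, info_src, info_tgt,
                     lam_confusion=1.0, lam_info=1.0):
    """Illustrative weighted sum of the loss components in the text.

    cls_loss_src:      supervised classification loss on source data
    domain_confusion:  discrepancy penalty between domains (e.g. MMD)
    info_src/info_tgt: categorical-code reconstruction losses for the
                       source and target generators
    lam_*:             trade-off weights (assumed, not from the paper)
    """
    return cls_loss_src + lam_confusion * domain_confusion \
        + lam_info * (info_src + info_tgt)
```

Each component is minimized jointly, so the classifier stays accurate on source data while the generators are steered toward label-controlled target samples.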
Experiments
In our experiments, we evaluate our method on several standard adaptation benchmarks against the existing state-of-the-art unsupervised domain adaptation methods, including CoGAN [35], Tri-training [11], Associative Domain Adaptation (ADA) [21], UNsupervised Image-to-image Translation (UNIT) [36] and Maximum Classifier Discrepancy (MCD) [19].
Conclusion
In this paper, we propose TarGAN for unsupervised domain adaptation. TarGAN consists of one classification network shared by both the source and the target domains, as well as two GANs, one for each domain. Although no labeled target data are available, our model can successfully disentangle the class code and the style code of the target generator and generate target data with given class labels through the cooperation of the high mutual information constraint and the weight-sharing mechanism.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61572019) and the Fundamental Research Funds for the Central Universities, China (Nos. JBK140507 and JBK1806002). The authors would like to thank the anonymous reviewers for their careful reading of this paper and for their helpful and constructive comments.
References (41)
- et al., Local linear Laplacian eigenmaps: A direct extension of LLE, Pattern Recognit. Lett. (2016)
- et al., L1-norm locally linear representation regularization multi-source adaptation learning, Neural Netw. (2015)
- et al., Effective data generation for imbalanced learning using conditional generative adversarial networks, Expert Syst. Appl. (2018)
- et al., A survey on transfer learning, IEEE Trans. Knowl. Data Eng. (2010)
- et al., Visual domain adaptation: A survey of recent advances, IEEE Signal Process. Mag. (2015)
- et al., Domain adaptation via transfer component analysis, IEEE Trans. Neural Netw. (2011)
- et al., LSDT: latent sparse domain transfer learning for visual adaptation, IEEE Trans. Image Process. (2016)
- et al., Optimal transport for domain adaptation, IEEE Trans. Pattern Anal. Mach. Intell. (2017)
- et al., Domain invariant transfer kernel learning, IEEE Trans. Knowl. Data Eng. (2015)
- E. Tzeng, J. Hoffman, N. Zhang, K. Saenko, T. Darrell, Deep domain confusion: Maximizing for domain invariance, arXiv...