Elsevier

Pattern Recognition

Volume 108, December 2020, 107536

Discovering and incorporating latent target-domains for domain adaptation

https://doi.org/10.1016/j.patcog.2020.107536

Highlights

  • We focus on the practical problem in unsupervised domain adaptation that multiple latent domains are observed in the target domain.

  • We propose a domain discovery scheme based on the characteristics of the target domain and the relationship between source and target domains.

  • We propose a method to jointly learn the mapping function based on multiple latent domains.

Abstract

In this paper, we address the unsupervised domain adaptation problem in which the data in the target domain are much more diverse than the data in the source domain. In particular, we formulate this problem as discovering and incorporating the latent domains underlying the target data of interest. More specifically, the discovery of the latent target domains is based on three criteria: maximizing the compactness and the distinctiveness of the data in the individual latent target domains, and minimizing the total divergence from the latent target domains to the source domain. For each pair formed by a latent target domain and the source domain, we learn a feature space in which the discrepancy between the source domain and that latent target domain is reduced. Finally, we treat the projections of the source domain data onto the learned latent feature spaces as different views of the source domain, and propose an extended multiple kernel learning algorithm to train a more robust and precise classifier for predicting the unlabeled target data. The effectiveness of the proposed method is demonstrated on various benchmark datasets for object recognition and human activity recognition. Moreover, we show that the proposed method can serve as an effective complement to deep-learning-based unsupervised domain adaptation.

Introduction

In real-world visual recognition problems, training and testing data commonly differ in various ways. For example, training data may be collected from one domain (the source domain) that differs from the domain of the testing data (the target domain). Because of this domain discrepancy, a model trained on source domain data may fail to perform well on the target domain. Therefore, reducing the discrepancy between the source and target domains and reusing the source domain training data to build a precise classifier for the target domain are the central goals of domain adaptation. Many approaches to domain adaptation have been proposed in the literature, including instance re-weighting (e.g. [1], [2]) and subspace learning for distribution alignment (e.g. [3], [4], [5], [6], [7], [8], [9]).

Most existing domain adaptation methods consider the “balanced” setting in which the source and target domains each come from a single domain. However, in many circumstances the training and testing data are diverse and contain multiple latent domains, so directly applying such methods may not be optimal. It has been observed in the literature that simply treating the labeled data collected from multiple domains as a single domain may lead to poor adaptation performance [10]. This is largely because traditional distribution alignment methods generally assume that the source and target domains are compact and that their supports overlap, which may not hold when a domain is diverse. Partitioning a complex domain into multiple small and compact latent domains reduces the difficulty of distribution alignment, as verified by previous works on discovering latent domains in the source domain [10], [11]. Here, we argue that in real-world applications the testing data can be even more diverse than the training data, implying the existence of multiple “latent target domains”. For example, the images or videos used for testing could be acquired from arbitrary viewpoints, under different illuminations, or with different devices. However, most existing latent domain discovery methods cannot be directly applied to the target domain, because they rely on source domain label information to learn the latent domains, and such label information is not available in the target domain.

In this paper, we address a new and challenging issue in unsupervised domain adaptation: discovering latent target domains to improve adaptation performance. Our intuition is that the main difficulty in domain adaptation for many visual recognition problems originates from the large diversity of the testing data. In other words, the testing data may come from different latent target domains, which makes the underlying distribution extremely complicated. Therefore, we propose to first partition the target domain into multiple compact and distinctive latent domains, so that the distribution of each latent domain becomes simpler and domain adaptation between the source and each latent target domain becomes less challenging. When partitioning the target domain, we also encourage each latent target domain to be as similar to the source domain as possible, which further facilitates knowledge transfer from the source domain. After learning the latent target domains, for each pair of the source domain and a latent target domain, we apply a state-of-the-art subspace-based domain adaptation method, Joint Geometrical and Statistical Alignment (JGSA) [12], to map all the data into a latent feature space in which instances from the two domains are well aligned. Finally, to incorporate information from all the latent target domains, we propose an extended Multiple Kernel Learning (MKL) algorithm to train a robust classifier for making predictions on target data. Experiments are conducted on three benchmark datasets for object recognition and human activity recognition, and the results demonstrate the effectiveness of our approach in exploiting multiple latent target domains to improve domain adaptation performance.
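The partitioning idea described above can be illustrated with a toy scoring function. This is only a minimal sketch under stated assumptions, not the paper's objective: the excerpt does not give the actual compactness, distinctiveness, or divergence measures, so squared Euclidean distances between domain means are used as stand-ins. The names `partition_terms` and `partition_score` and the weights `w_scatter` and `w_gap` are hypothetical.

```python
def mean(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition_terms(target, labels, source):
    """Return (scatter, separation, source_gap) for a hard partition.

    All three terms are assumed stand-ins for the paper's criteria:
      scatter     - average squared distance of target points to their
                    latent-domain mean (lower means more compact)
      separation  - average squared distance between latent-domain means
                    (higher means more distinctive)
      source_gap  - average squared distance from each latent-domain mean
                    to the source mean (lower means closer to the source)
    """
    domains = {}
    for x, k in zip(target, labels):
        domains.setdefault(k, []).append(x)
    centers = {k: mean(pts) for k, pts in domains.items()}

    scatter = sum(sq_dist(x, centers[k]) for x, k in zip(target, labels)) / len(target)

    keys = sorted(centers)
    pairs = [(a, b) for i, a in enumerate(keys) for b in keys[i + 1:]]
    separation = sum(sq_dist(centers[a], centers[b]) for a, b in pairs) / max(len(pairs), 1)

    src = mean(source)
    source_gap = sum(sq_dist(c, src) for c in centers.values()) / len(centers)
    return scatter, separation, source_gap


def partition_score(target, labels, source, w_scatter=1.0, w_gap=1.0):
    """Higher is better: distinctive, compact, and close to the source."""
    scatter, separation, source_gap = partition_terms(target, labels, source)
    return separation - w_scatter * scatter - w_gap * source_gap


# Toy data: two well-separated target blobs and a source lying between them.
target = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 0], [10, 1], [11, 0], [11, 1]]
source = [[5, 0], [5, 1], [6, 0], [6, 1]]
split_by_blob = [0, 0, 0, 0, 1, 1, 1, 1]
split_mixed = [0, 1, 0, 1, 0, 1, 0, 1]

good = partition_score(target, split_by_blob, source)
bad = partition_score(target, split_mixed, source)
```

On this toy data the partition that follows the two blobs scores higher than the one that mixes them, even though the mixed partition has domain means closer to the source. This shows why the three terms have to be balanced against each other rather than optimized one at a time.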

The contributions of this paper are summarized as follows.

  • 1. We focus on the practical problem in unsupervised domain adaptation that multiple latent domains are observed in the target domain. We propose an integrated solution by discovering and incorporating the latent target domains.

  • 2. We propose a latent domain discovery scheme based on the inherent characteristics of the target domain and the external relationship between source and target domains.

  • 3. We propose a method to jointly learn the mapping function based on multiple latent domains, which achieves superior performance on different computer vision tasks.
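As background for the last contribution, the standard multiple kernel learning form combines several base kernels into one. The block below shows only this textbook combination; the extended variant proposed in the paper is not given in this excerpt. The symbols M, d_m and k_m are assumptions introduced for illustration, with k_m read as a base kernel computed in the feature space learned for the m-th latent target domain.

```latex
k(\mathbf{x}, \mathbf{x}') = \sum_{m=1}^{M} d_m \, k_m(\mathbf{x}, \mathbf{x}'),
\qquad d_m \ge 0, \quad \sum_{m=1}^{M} d_m = 1
```

Under this reading, each latent target domain contributes one view of the source data, and the weights d_m determine how much each view influences the final classifier.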

Section snippets

Related work

Traditional balanced domain adaptation approaches focus on either subspace learning or instance re-weighting. For example, Huang et al. [1] proposed an instance-weighting scheme based on source domain data to minimize the distribution discrepancy between the source and target domains. Subspace-learning-based unsupervised domain adaptation assumes that there exists a latent space in which the distribution discrepancy between the source and target domains can be minimized [3], [8], [13], [14], and can be further

Proposed methodology

For consistency of presentation, we use boldface lowercase and uppercase letters to represent vectors and matrices, respectively; e.g., a denotes a vector and A denotes a matrix. The transpose of a vector or matrix is denoted by the superscript ⊤. The symbol ⊙ denotes the element-wise product between two vectors or matrices of the same size.
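As a small worked illustration of the notation above, the element-wise product multiplies entries in matching positions. The 2 × 2 matrices here are arbitrary examples and do not come from the paper.

```latex
(\mathbf{A} \odot \mathbf{B})_{ij} = A_{ij}\, B_{ij},
\qquad
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\odot
\begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}
=
\begin{bmatrix} 5 & 12 \\ 21 & 32 \end{bmatrix}
```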

Object recognition

We first use images collected from the Amazon dataset (A), the DSLR dataset (D), the Webcam dataset (W) and the Caltech-256 dataset (C). Several samples from these four datasets are shown in Fig. 1. The ten categories common to all four datasets are used for evaluation. We use SURF features [36], building a codebook of 800 clusters with K-means, which yields an 800-dimensional feature for each image. Moreover, DeCAF6 features [37] are extracted from a pretrained AlexNet. We then consider Office31
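The codebook step mentioned above (K-means over local descriptors, followed by one histogram per image) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: it uses synthetic 2-D points in place of real SURF descriptors, a codebook of 3 words in place of 800, and a plain Lloyd's K-means written in pure Python. The function names `kmeans` and `encode` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, point):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda k: squared_distance(centers[k], point))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's K-means; returns k cluster centers (the codebook)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the previous center if a cluster is empty
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def encode(descriptors, codebook):
    """L1-normalised histogram of visual-word counts for one image."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(codebook, d)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Synthetic "descriptors": 30 noisy points around each of three locations.
rng = random.Random(1)
blobs = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
descriptors = [
    [cx + rng.gauss(0, 0.1), cy + rng.gauss(0, 0.1)]
    for cx, cy in blobs
    for _ in range(30)
]

codebook = kmeans(descriptors, k=3)          # the paper uses 800 clusters
feature = encode(descriptors[:30], codebook)  # one fixed-length vector per image
```

The length of `feature` always equals the codebook size, which is why a codebook of 800 clusters gives an 800-dimensional representation regardless of how many local descriptors an image contains.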

Conclusion and future work

In this paper, we propose a new method to discover latent target domains for unsupervised domain adaptation. In particular, we propose three criteria for latent domain discovery: minimizing the entropy within each latent domain, maximizing the distinctiveness among different latent domains, and minimizing the distinctiveness between the source domain and each latent target domain. After the latent target domains are learned, we leverage the latent target domain information by learning a common subspace for each

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Wen Li is supported by the Major Project for New Generation of AI under Grant No. 2018AAA0100400 and the National Natural Science Foundation of China under Grant No. 61772118. This research was carried out at the Rapid-Rich Object Search (ROSE) Lab at the Nanyang Technological University, Singapore. The ROSE Lab is supported by the National Research Foundation, Singapore, and the Infocomm Media Development Authority, Singapore. Haoliang Li thanks the Wallenberg-NTU Presidential Postdoc Fellowship grant.

Haoliang Li obtained his B.Eng degree from University of Electronic Science and Technology of China in 2013, and the Ph.D. degree from Nanyang Technological University, Singapore, in 2018. He was a project officer in 2018 and a research fellow from July 2018 to May 2019 in Rapid-Rich Object Search Lab, NTU. He is now a Wallenberg-NTU presidential postdoc fellow in NTU. He received the doctorate innovation award from NTU in 2019.

References (43)

  • M. Long et al., Transfer joint matching for unsupervised domain adaptation, CVPR (2014)
  • R. Gopalan et al., Domain adaptation for object recognition: An unsupervised approach, ICCV (2011)
  • J. Ni et al., Subspace interpolation via dictionary learning for unsupervised domain adaptation, CVPR (2013)
  • B. Gong et al., Reshaping visual datasets for domain adaptation, NIPS (2013)
  • J. Hoffman et al., Discovering latent domains for multisource domain adaptation, ECCV (2012)
  • J. Zhang et al., Joint geometrical and statistical alignment for visual domain adaptation, CVPR (2017)
  • B. Sun et al., Return of frustratingly easy domain adaptation, AAAI (2016)
  • M. Ghifary et al., Scatter component analysis: a unified framework for domain adaptation and domain generalization, IEEE Trans. Pattern Anal. Mach. Intell. (2017)
  • M. Long et al., Learning transferable features with deep adaptation networks, ICML (2015)
  • E. Tzeng, J. Hoffman, K. Saenko, T. Darrell, Adversarial discriminative domain adaptation, CVPR...
  • K. Bousmalis et al., Unsupervised pixel-level domain adaptation with generative adversarial networks, CVPR (2017)

Wen Li received the Ph.D. degree from Nanyang Technological University, Singapore, in 2015. From 2015 to 2019, he was a Post-Doctoral Researcher with the Computer Vision Laboratory, ETH Zürich, Switzerland. He is currently a Professor with the School of Computer Science and Engineering, University of Electronic Science and Technology of China. His main interests include transfer learning, multi-view learning, multiple kernel learning, and their applications in computer vision.

Shiqi Wang received the B.S. degree in computer science from the Harbin Institute of Technology in 2008 and the Ph.D. degree in computer application technology from Peking University in 2014. From 2014 to 2016, he was a Post-Doctoral Fellow with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada. From 2016 to 2017, he was with the Rapid-Rich Object Search Laboratory, Nanyang Technological University, Singapore, as a Research Fellow. He is currently an Assistant Professor with the Department of Computer Science, City University of Hong Kong. He has proposed over 30 technical proposals to ISO/MPEG, ITU-T, and AVS standards. His research interests include video compression, image/video quality assessment, and image/video search and analysis.
