Discovering and incorporating latent target-domains for domain adaptation

doi:10.1016/j.patcog.2020.107536

Pattern Recognition

Volume 108, December 2020, 107536

https://doi.org/10.1016/j.patcog.2020.107536

Highlights

• We focus on the practical problem in unsupervised domain adaptation that multiple latent domains are observed in the target domain.
• We propose a domain discovery scheme based on the characteristics of the target domain and the relationship between source and target domains.
• We propose a method to jointly learn the mapping function based on multiple latent domains.

Abstract

In this paper, we address the unsupervised domain adaptation problem in which the data in the target domain are much more diverse than the data in the source domain. In particular, this problem is formulated as discovering and incorporating latent domains underlying the target data of interest for unsupervised domain adaptation. More specifically, the discovery of the latent target domains is based on three criteria: maximizing the compactness of the data in each latent target-domain, maximizing the distinctiveness among the latent target-domains, and minimizing the total divergence from the latent target-domains to the source domain. For each pair formed by a latent target domain and the source domain, we learn a feature space in which the discrepancy between the source domain and that latent target domain is reduced. Finally, we treat the source domain data projected onto the learned latent feature spaces as different views of the source domain, and propose an extended multiple kernel learning algorithm to train a more robust and precise classifier for predicting the unlabeled target data. The effectiveness of our proposed method is demonstrated on various benchmark datasets for object recognition and human activity recognition. Moreover, we show that our proposed method can serve as an effective complement to deep learning based unsupervised domain adaptation methods.

Introduction

In real-world visual recognition problems, training and testing data commonly differ in various ways. For example, training data may be collected from a domain (a.k.a. the source domain) that is different from that of the testing data (a.k.a. the target domain). Due to this domain discrepancy, a model trained on source domain training data may fail to perform well on the target domain. Reducing the discrepancy between the source and target domains, and reusing the source domain training data to build a precise classifier for the target domain, are therefore vital in domain adaptation. Many approaches to domain adaptation have been proposed in the literature, such as instance re-weighting (e.g. [1], [2]) and subspace learning for distribution alignment (e.g. [3], [4], [5], [6], [7], [8], [9]).

Most existing domain adaptation methods consider the “balanced” setting in which the source and target domains each come from a single domain. However, in many circumstances the training and testing data may be diverse and contain multiple latent domains, so directly applying domain adaptation methods may not be optimal. It has been observed in the literature that simply treating the labeled data collected from multiple domains as a single domain may lead to poor adaptation performance [10]. This is largely because traditional distribution alignment methods generally assume that the source and target domains are compact and that their supports overlap, which may not hold when a domain is diverse. Partitioning a complex domain into multiple small and compact latent domains reduces the difficulty of distribution alignment, as verified by previous work on discovering latent domains in the source domain [10], [11]. Here, we argue that in real-world applications testing data can be even more diverse than training data, implying the existence of multiple “latent target domains”. For example, the images or videos for testing could be acquired from arbitrary viewpoints, under different illuminations, or using different devices. However, most existing latent domain discovery methods cannot be directly applied to the target domain, as they rely on source-domain label information to learn the latent domains, and such label information is not available in the target domain.

In this paper, we address a new and challenging issue in unsupervised domain adaptation: discovering latent target domains to improve domain adaptation performance. Our intuition is that the main difficulty in domain adaptation for many visual recognition problems originates from the large diversity of the testing data. In other words, the testing data may come from different latent target domains, which makes the underlying distribution extremely complicated. Therefore, we propose to first partition the target domain into multiple compact and distinctive latent domains, such that the distribution of each latent domain becomes simpler and domain adaptation between the source and each latent target-domain becomes less challenging. When partitioning the target domain, we also encourage each latent target-domain to be as similar to the source domain as possible, which further facilitates knowledge transfer from the source domain. After learning the latent target domains, for each pair of the source domain and a latent target-domain, we apply a state-of-the-art subspace-based domain adaptation method, Joint Geometrical and Statistical Alignment (JGSA) [12], to map all the data into a latent feature space in which instances from the two domains are well aligned. Finally, to incorporate information from all the latent target domains, we propose an extended Multiple Kernel Learning (MKL) algorithm to train a robust classifier for making predictions on target data. Experiments are conducted on three benchmark datasets for object recognition and human activity recognition, and the results demonstrate the effectiveness of our proposed approach in exploiting multiple latent target domains to improve domain adaptation performance.
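As a rough illustration of the partitioning criteria described above, the following sketch scores a candidate hard partition of the target data. It is not the paper's objective. Compactness is measured here as the within-domain squared distance to the latent-domain mean, distinctiveness as the squared distance between latent-domain means, and divergence to the source as a linear-kernel mean discrepancy. The function name `partition_scores` and all three measures are assumptions chosen for simplicity.

```python
import numpy as np

def partition_scores(Xs, Xt, labels):
    """Score a hard partition of target data into latent domains.

    Xs: (n_s, d) source features; Xt: (n_t, d) target features;
    labels: (n_t,) integer latent-domain assignment for each target sample.
    Returns (compactness_cost, distinctiveness, source_divergence).
    All three measures are illustrative stand-ins, not the paper's terms.
    """
    Xs, Xt = np.asarray(Xs, dtype=float), np.asarray(Xt, dtype=float)
    domains = np.unique(labels)                      # sorted latent-domain ids
    means = np.stack([Xt[labels == k].mean(axis=0) for k in domains])
    idx = np.searchsorted(domains, labels)           # row of `means` per sample

    # Compactness cost: mean squared distance to the own latent-domain mean
    # (lower means more compact).
    compactness_cost = np.mean(np.sum((Xt - means[idx]) ** 2, axis=1))

    # Distinctiveness: average squared distance between distinct domain means
    # (higher means more distinctive).
    K = len(domains)
    pair_sq = np.sum((means[:, None, :] - means[None, :, :]) ** 2, axis=-1)
    distinctiveness = pair_sq.sum() / (K * (K - 1)) if K > 1 else 0.0

    # Source divergence: squared distance between the source mean and each
    # latent-domain mean, summed over latent domains (lower is better).
    source_divergence = np.sum(np.sum((means - Xs.mean(axis=0)) ** 2, axis=1))

    return compactness_cost, distinctiveness, source_divergence
```

A discovery procedure would prefer partitions with a low compactness cost, high distinctiveness, and low source divergence. How the three terms are weighted against each other is left open in this sketch.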

The contributions of this paper are summarized as follows.

1. We focus on the practical problem in unsupervised domain adaptation that multiple latent domains are observed in the target domain. We propose an integrated solution by discovering and incorporating the latent target domains.
2. We propose a latent domain discovery scheme based on the inherent characteristics of the target domain and the external relationship between source and target domains.
3. We propose a method to jointly learn the mapping function based on multiple latent domains, which achieves superior performance on different computer vision tasks.
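
The final step described in the introduction, treating the data projected into each learned feature space as a separate view and combining the views with kernels, can be sketched as follows. This is a simplification, not the paper's extended MKL: the kernel weights are fixed rather than learned, kernel ridge regression on ±1 labels stands in for the classifier, and `rbf_kernel`, `combined_kernel`, and `fit_predict` are hypothetical helper names. Each view is assumed to hold the same samples projected into one latent feature space.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between rows of A (n, d) and rows of B (m, d)."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combined_kernel(views_a, views_b, weights, gamma=0.5):
    """Weighted sum of per-view RBF kernels.

    views_a, views_b: lists with one (n, d_m) array per view.
    weights: one non-negative weight per view (fixed here, an assumption;
    an MKL method would learn them).
    """
    return sum(w * rbf_kernel(a, b, gamma)
               for w, a, b in zip(weights, views_a, views_b))

def fit_predict(train_views, y, test_views, weights, lam=0.1):
    """Kernel ridge regression on +/-1 labels using the combined kernel."""
    y = np.asarray(y, dtype=float)
    K_train = combined_kernel(train_views, train_views, weights)
    alpha = np.linalg.solve(K_train + lam * np.eye(len(y)), y)
    K_test = combined_kernel(test_views, train_views, weights)
    return np.sign(K_test @ alpha)
```

With M latent target domains there would be M views, one per learned feature space, and `weights` would have M entries. Uniform weights such as `[1 / M] * M` are the simplest choice under these assumptions.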

Section snippets

Related work

Traditional balanced domain adaptation approaches focused on either subspace learning or instance re-weighting. For example, in Huang et al. [1], an instance-weighting based on source domain data was proposed to minimize the distribution discrepancy between source and target domains. Subspace learning based unsupervised domain adaptation assumes that there exists a latent space such that the distribution between source and target domain can be minimized [3], [8], [13], [14], and can be further

Proposed methodology

For consistency of presentation, we use lowercase/uppercase letters in boldface to represent vectors/matrices, e.g., a denotes a vector and A denotes a matrix. The transpose of a vector/matrix is denoted by the superscript ⊤. The symbol ⊙ denotes the element-wise product between two vectors/matrices of the same size.

Object recognition

We first use images collected from the Amazon dataset (A), DSLR dataset (D), Webcam dataset (W) and Caltech-256 dataset (C). We provide several samples from these four datasets in Fig. 1. The ten categories common to all these datasets are used for evaluation. We use SURF features [36], with K-means used to build a codebook of 800 clusters, leading to a final 800-dimensional feature for each image. Moreover, the Decaf6 feature [37] is extracted from a pretrained AlexNet. We then consider Office31
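
The codebook step mentioned above (K-means over local descriptors, then one histogram per image) can be sketched as follows. This is a minimal illustration under stated assumptions: the descriptors are taken as given arrays (any stand-in with one row per local descriptor works), K-means is a plain Lloyd iteration written in NumPy, and `build_codebook` and `encode_image` are hypothetical names. The text uses 800 clusters; any smaller `k` works the same way for experimentation.

```python
import numpy as np

def build_codebook(descriptors, k, n_iter=20, seed=0):
    """Plain Lloyd K-means over local descriptors of shape (N, d).

    Returns the (k, d) cluster centers (the visual-word codebook).
    Note: the distance matrix is (N, k, d) in memory, so this sketch
    is only suitable for small inputs.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        for j in range(k):
            members = descriptors[assign == j]
            if len(members):                 # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def encode_image(descriptors, centers):
    """Normalized histogram of nearest visual words, one bin per center."""
    descriptors = np.asarray(descriptors, dtype=float)
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    hist = np.bincount(d2.argmin(axis=1), minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

Each image is then represented by a vector whose length equals the codebook size, which is how a codebook of 800 clusters yields an 800-dimensional feature.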

Conclusion and future work

In this paper, we propose a new method to discover latent target domains for unsupervised domain adaptation. In particular, we propose three criteria for latent domain discovery: minimizing entropy within each latent domain, maximizing distinctiveness among different latent domains, and minimizing distinctiveness between the source domain and each latent target domain. After latent target domains are learned, we leverage the latent target domain information by learning a common subspace for each

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Wen Li is supported by the Major Project for New Generation of AI under Grant No. 2018AAA0100400 and the National Natural Science Foundation of China under Grant No. 61772118. This research was carried out at the Rapid-Rich Object Search (ROSE) Lab at the Nanyang Technological University, Singapore. The ROSE Lab is supported by the National Research Foundation, Singapore, and the Infocomm Media Development Authority, Singapore. Haoliang Li thanks the Wallenberg-NTU Presidential Postdoc Fellowship grant.

Haoliang Li obtained his B.Eng degree from University of Electronic Science and Technology of China in 2013, and the Ph.D. degree from Nanyang Technological University, Singapore, in 2018. He was a project officer in 2018 and a research fellow from July 2018 to May 2019 in Rapid-Rich Object Search Lab, NTU. He is now a Wallenberg-NTU presidential postdoc fellow in NTU. He received the doctorate innovation award from NTU in 2019.

References (43)

L.A. Pereira et al., Semi-supervised transfer subspace for domain adaptation, Pattern Recognit. (2018)
P. Huang et al., Boosting for transfer learning from multiple data sources, Pattern Recognit. Lett. (2012)
R. Wang et al., Review on mining data from multiple data sources, Pattern Recognit. Lett. (2018)
X. Wu et al., Joint learning of multiple latent domains and deep representations for domain adaptation, IEEE Trans. Cybern. (2019)
J. Huang et al., Correcting sample selection bias by unlabeled data, NIPS (2006)
M. Sugiyama et al., Direct importance estimation with model selection and its application to covariate shift adaptation, NIPS (2008)
S.J. Pan et al., Domain adaptation via transfer component analysis, IEEE Trans. Neural Netw. (2011)
B. Gong et al., Geodesic flow kernel for unsupervised domain adaptation, CVPR (2012)
B. Fernando et al., Unsupervised visual domain adaptation using subspace alignment, ICCV (2013)
M. Long et al., Transfer feature learning with joint distribution adaptation, ICCV (2013)
M. Long et al., Transfer joint matching for unsupervised domain adaptation, CVPR (2014)
R. Gopalan et al., Domain adaptation for object recognition: An unsupervised approach, ICCV (2011)
J. Ni et al., Subspace interpolation via dictionary learning for unsupervised domain adaptation, CVPR (2013)
B. Gong et al., Reshaping visual datasets for domain adaptation, NIPS (2013)
J. Hoffman et al., Discovering Latent Domains for Multisource Domain Adaptation, ECCV (2012)
J. Zhang et al., Joint geometrical and statistical alignment for visual domain adaptation, CVPR (2017)
B. Sun et al., Return of frustratingly easy domain adaptation, AAAI (2016)
M. Ghifary et al., Scatter component analysis: a unified framework for domain adaptation and domain generalization, IEEE Trans. Pattern Anal. Mach. Intell. (2017)
M. Long et al., Learning transferable features with deep adaptation networks, ICML (2015)
E. Tzeng, J. Hoffman, K. Saenko, T. Darrell, Adversarial discriminative domain adaptation, CVPR...
K. Bousmalis et al., Unsupervised pixel-level domain adaptation with generative adversarial networks, CVPR (2017)


Wen Li received the Ph.D. degree from Nanyang Technological University, Singapore, in 2015. From 2015 to 2019, he was a Post-Doctoral Researcher with the Computer Vision Laboratory, ETH Zürich, Switzerland. He is currently a Professor with the School of Computer Science and Engineering, University of Electronic Science and Technology of China. His main interests include transfer learning, multi-view learning, multiple kernel learning, and their applications in computer vision.

Shiqi Wang received the B.S. degree in computer science from the Harbin Institute of Technology in 2008 and the Ph.D. degree in computer application technology from Peking University in 2014. From 2014 to 2016, he was a Post-Doctoral Fellow with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada. From 2016 to 2017, he was with the Rapid-Rich Object Search Laboratory, Nanyang Technological University, Singapore, as a Research Fellow. He is currently an Assistant Professor with the Department of Computer Science, City University of Hong Kong. He has proposed over 30 technical proposals to ISO/MPEG, ITU-T, and AVS standards. His research interests include video compression, image/video quality assessment, and image/video search and analysis.
