Elsevier

Neural Networks

Volume 119, November 2019, Pages 313-322

Multiclass heterogeneous domain adaptation via bidirectional ECOC projection

https://doi.org/10.1016/j.neunet.2019.08.005

Abstract

Heterogeneous domain adaptation aims to exploit source domain data to train a prediction model for a target domain with a different input feature space. Current methods either map data points from the different feature spaces into a common latent subspace or use asymmetric projections for learning the classifier. However, these methods separate common space learning from shared classifier training, which may lead to a complex model structure with more parameters to determine. To address this problem, we propose a novel bidirectional ECOC projection method, named HDA-ECOC, for heterogeneous domain adaptation. The proposed method projects the inputs and outputs (labels) of the two domains into a common ECOC coding space, so that common space learning and shared classifier training can be performed simultaneously. Classification of a target test sample then reduces directly to ECOC decoding. Moreover, unlabeled target data is exploited by enforcing consistency between the two domains' projected instances through a maximum mean discrepancy (MMD) criterion. We formulate this method as a dual convex minimization problem and propose an alternating optimization algorithm for solving it. For performance evaluation, experiments are performed on cross-lingual text classification and cross-domain digit image classification with heterogeneous feature spaces. The experimental results demonstrate that the proposed method is effective and efficient in solving heterogeneous domain adaptation problems.

Introduction

Domain adaptation seeks to improve learning performance in a target domain by leveraging knowledge from source domains with different data distributions or feature spaces. It has been extensively explored in cases where training data for the target domain are limited or too expensive to collect. Many methods have been proposed for homogeneous settings (Ishii and Sato, 2017, Mao et al., 2019, Xiao and Guo, 2015a, Zuo et al., 2018), in which the source and target data share a common feature space. In recent years, heterogeneous domain adaptation, where the source and target domains have different feature spaces, has attracted increasing interest and is considered the more challenging task. It has been widely applied in many fields, including image classification in computer vision (Gritsenko et al., 2017, Hsieh et al., 2016, Li, Zhang et al., 2017, Li et al., 2015, Wang and Mahadevan, 2011, Weston et al., 2010, Zhu et al., 2018), drug efficiency prediction in biotechnology (Javanmardi and Tasdizen, 2018, Wang and Mahadevan, 2011), cross-language text classification (Li, Duan, Xu, & Tsang, 2014), and cross-modal retrieval (Bhatia et al., 2015, Wang and Mahadevan, 2011, Zhang et al., 2017). For example, in cross-language document classification, documents in English do not share a feature representation with those in German due to the different vocabularies.

The fundamental challenge of heterogeneous domain adaptation is that the feature spaces differ across domains, so a model trained in the source domain cannot be applied directly to the target domain. To address this problem, most HDA methods aim to learn a common feature representation such that both source and target domain data can be represented by homogeneous features (Hsieh et al., 2016, Javanmardi and Tasdizen, 2018, Li et al., 2014, Li, Zhang et al., 2017, Li et al., 2015, Weston et al., 2010, Xiao and Guo, 2015b). Formally, two feature mapping matrices are learned to transform the source and target domain data into a new latent feature space where the domain difference is reduced (Hsieh et al., 2016, Li et al., 2014, Luo et al., 2017, Xiao and Guo, 2015b, Yan et al., 2017). For example, Li et al. (2014) proposed a heterogeneous feature augmentation (HFA) method, which projects the instances into a common subspace and uses the projected latent features to augment the original features before training a classification model on the feature-augmented data. Hsieh et al. (2016) proposed a generalized joint distribution adaptation (GJDA) method that learns a pair of feature projections to eliminate the difference between projected cross-domain heterogeneous data by matching their marginal and class-conditional distributions. Xiao and Guo (2015b) proposed a semi-supervised subspace co-projection method in which the instances of the two domains are projected into a co-located latent subspace; the prediction model is then trained with labeled instances from both domains, exploiting error-correcting output code schemes to incorporate binary prediction tasks. Alternatively, an asymmetric transformation matrix can be learned to map data from one domain to the other, either minimizing the difference or maximizing the alignment (Tsai et al., 2016, Wang and Yang, 2011, Xiao and Guo, 2015a).
For example, Wang and Mahadevan (2011) presented an HDA method based on domain adaptation using manifold alignment (DAMA). This method handles several domains, and the feature mappings are learned by using the labels shared by all domains to align each pair of manifolds. Tsai et al. (2016) proposed a cross-domain landmark selection (CDLS) method that learns a heterogeneous feature transformation to unify heterogeneous data; representative source- and target-domain data are then selected and jointly exploited to improve adaptation capability. Wang and Yang (2011) proposed a two-step feature mapping method: it first extracts the structural features within each domain and then maps the features into a Reproducing Kernel Hilbert Space (RKHS), such that the "structural dependencies" of features across domains can be estimated from the kernel matrices of the features within each domain.

These methods have produced promising results; however, they still suffer from two major limitations. First, learning common features and then training a shared classifier on the obtained common space leads to a model structure with many redundant parameters and higher training cost. Second, the dimensionality of the common space strongly affects performance, and determining its size is difficult (Bhatia et al., 2015, Li et al., 2014, Wang and Yang, 2011, Zhang et al., 2017). To solve these problems, we propose a bidirectional ECOC projection method, as shown in Fig. 1. First, class labels are mapped into the ECOC space by an Error-Correcting Output Code (ECOC) encoding scheme. Second, a pair of projection matrices maps the source and target domain data into the same ECOC space. Third, the projected source and target data (both labeled and unlabeled) are exploited so that the maximum mean discrepancy (MMD) between domains is minimized, making the projected source and target data follow similar distributions. Finally, a target test sample is mapped into this space and its label is predicted directly by ECOC decoding. To evaluate the proposed method, we conduct cross-lingual text classification on multi-lingual Amazon product reviews and cross-domain digit image classification on the UCI handwritten digit dataset. Experimental results demonstrate the high adaptability of the proposed approach for multiclass heterogeneous domain adaptation. In summary, we make the following contributions: (1) we propose a bidirectional ECOC projection algorithm, named HDA-ECOC, that represents the inputs and outputs simultaneously, so that multi-class classification tasks can be solved directly by ECOC decoding; the method reduces computational complexity and redundant information. (2) To align class distributions across domains, we consider both the marginal and the conditional distribution differences between domains in the ECOC space simultaneously. Moreover, since the number of parameters is reduced, the solution can be obtained very efficiently.
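The four steps above can be sketched end-to-end on toy data. The sketch below is a simplification under stated assumptions: it uses a One-vs-All coding matrix, plain ridge regressions as stand-ins for the paper's jointly optimized projection matrices (the MMD term and alternating optimization are omitted), and nearest-codeword decoding. All function names are ours, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_vs_all_codes(C):
    # One-vs-All ECOC coding matrix M in {-1,+1}^{C x K}, with K = C.
    return 2 * np.eye(C) - 1

def fit_projection(X, Z, lam=1e-2):
    # Ridge regression from features X to ECOC codes Z; an illustrative
    # stand-in for the paper's joint objective, which also includes an
    # MMD term and is solved by alternating optimization.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Z)

def ecoc_decode(scores, M):
    # Assign each projected sample to the class of the nearest codeword.
    return np.argmin(np.linalg.norm(scores[:, None, :] - M[None], axis=2), axis=1)

# Toy heterogeneous domains: different dimensionalities, same 3 classes.
C = 3
M = one_vs_all_codes(C)
Xs = rng.normal(size=(90, 20)); ys = rng.integers(0, C, 90)  # source: 20-dim
Xt = rng.normal(size=(30, 12)); yt = rng.integers(0, C, 30)  # target: 12-dim
# Shift the first C coordinates per class so the toy data is separable.
Xs[:, :C] += 3 * M[ys]
Xt[:, :C] += 3 * M[yt]

Ps = fit_projection(Xs, M[ys])   # source projection into ECOC space
Pt = fit_projection(Xt, M[yt])   # target projection into ECOC space

pred = ecoc_decode(Xt @ Pt, M)
print("target accuracy:", (pred == yt).mean())
```

Because inputs and labels meet in the same coding space, no separate classifier is trained on top of the learned representation: decoding the projection is the classification.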


Error correcting output code (ECOC)

ECOC (Escalera, Pujol, & Radeva, 2010a) consists of two components: encoding and decoding. Given a c-class classification problem, in the encoding phase each class is encoded as a codeword in {−1, +1}^K, where K is the length of the codeword. The codewords of the c classes together form a coding matrix M ∈ {−1, +1}^{c×K}, where each row is the codeword of one class. Fig. 2 shows four ECOC encoding methods:

  • One-vs-All (Rifkin & Klautau, 2004): c binary classifiers are learnt for c classes, each separating one class from the rest.
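As a concrete illustration of the two phases, the sketch below builds a One-vs-All coding matrix for c = 4 classes and decodes a binarized prediction vector by Hamming distance (the function names are ours, for illustration only):

```python
import numpy as np

def one_vs_all_matrix(c):
    # Coding matrix M in {-1,+1}^{c x K}; for One-vs-All, K = c and
    # row i is +1 at position i and -1 everywhere else.
    return 2 * np.eye(c, dtype=int) - 1

def hamming_decode(codeword, M):
    # Decoding: predict the class whose row of M is closest in Hamming distance.
    dists = (M != codeword).sum(axis=1)
    return int(np.argmin(dists))

M = one_vs_all_matrix(4)
print(M)
# A noisy prediction [-1, +1, -1, +1] (one flipped bit plus the true bit)
# still decodes; here it lands on class 1 (ties broken toward lower index).
print(hamming_decode(np.array([-1, 1, -1, 1]), M))
```

The redundancy in the codewords is what gives ECOC its error-correcting behavior: a few flipped bits in the binary predictions need not change the decoded class.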

Problem statement

Consider two domains with different input feature spaces, Xs ∈ R^{ns×ds} and Xt ∈ R^{nt×dt}, where ds and dt are the dimensionalities of the source and target feature spaces and ns and nt are the numbers of instances. The two domains share the same multi-class label space {1, 2, …, C}, where C is the number of classes. In particular, let Xs = [Xsl; Xsu] ∈ R^{ns×ds} denote the instances in the source domain, where Xsl ∈ R^{ls×ds} is the labeled source data with corresponding labels Ysl ∈ {1, 2, …, C}^{ls}, and Xsu is the unlabeled source data.
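In this setting, the unlabeled data from both domains enters the objective only through the MMD criterion on the projected instances. With a linear kernel, the squared empirical MMD reduces to the squared distance between sample means; a minimal sketch (the projection matrices Ps and Pt are hypothetical placeholders for the learned maps into a shared K-dimensional ECOC space):

```python
import numpy as np

def linear_mmd2(Zs, Zt):
    # Squared empirical MMD with a linear kernel: the squared Euclidean
    # distance between the means of the two projected samples.
    return float(np.sum((Zs.mean(axis=0) - Zt.mean(axis=0)) ** 2))

rng = np.random.default_rng(0)
K = 3                                   # shared ECOC-space dimension
Xs = rng.normal(size=(50, 20))          # source instances, 20-dim features
Xt = rng.normal(size=(40, 12))          # target instances, 12-dim features
Ps = rng.normal(size=(20, K))           # hypothetical source projection
Pt = rng.normal(size=(12, K))           # hypothetical target projection

print("MMD^2 in ECOC space:", linear_mmd2(Xs @ Ps, Xt @ Pt))
print("MMD^2 of a sample with itself:", linear_mmd2(Xs @ Ps, Xs @ Ps))
```

Minimizing this quantity over Ps and Pt pulls the projected source and target distributions together, which is what lets the decoder trained on source codes transfer to target samples.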

Experimental results

In this section, we perform experiments on cross-lingual text classification datasets and digit image classification datasets with heterogeneous feature spaces.

Conclusion

In this paper, we proposed a novel bidirectional ECOC projection approach for multi-class heterogeneous domain adaptation problems. The proposed method simultaneously projects the inputs and outputs (labels) of the source and target domains into a common ECOC space. In this common space, reducing both the marginal and the conditional distribution differences between domains and training the classifier are unified and performed in a single step. This method simplifies the model structure,

Acknowledgments

This work is supported by National Natural Science Foundation of China Grant Nos. 61572399, 61721002, 61532015, and 61532004; National Key Research and Development Program of China Grant No. 2016YFB1000903; Shaanxi New Star of Science & Technology Grant, China, No. 2013KJXX-29; the New Star Team of Xi'an University of Posts & Telecommunications, China; and the Provincial Key Disciplines Construction Fund of General Institutions of Higher Education in Shaanxi, China.

References (25)

  • Escalera, S., et al.

    Separability of ternary codes for sparse designs of error-correcting output codes

    Pattern Recognition Letters

    (2009)
  • Mao, W., et al.

    A novel deep output kernel learning method for bearing fault structural diagnosis

    Mechanical Systems and Signal Processing

    (2019)
  • Allwein, E. L., Schapire, R. E., & Singer, Y. (2000). Reducing multiclass to binary: A unifying approach for margin...
  • Bhatia, K., et al.

    Sparse local embeddings for extreme multi-label classification

  • Escalera, S., et al.

    Error-correcting output codes library

    Journal of Machine Learning Research (JMLR)

    (2010)
  • Escalera, S., et al.

    On the decoding process in ternary error-correcting output codes

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    (2010)
  • Gritsenko, A., et al.

    Extreme learning machines for VISualization+R: Mastering visualization with target variables

    Cognitive Computation

    (2017)
  • Hsieh, Y., Tao, S., Tsai, Y. H., Yeh, Y., & Wang, Y. F. (2016). Recognizing heterogeneous cross-domain data via...
  • Ishii, M., & Sato, A. (2017). Joint optimization of feature transform and instance weighting for domain adaptation. In...
  • Javanmardi, M., & Tasdizen, T. (2018). Domain adaptation for biomedical image segmentation using adversarial training....
  • Li, W., et al.

    Learning with augmented features for supervised and semi-supervised heterogeneous domain adaptation

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    (2014)
  • Li, X., et al.

    Iterative reweighting heterogeneous transfer learning framework for supervised remote sensing image classification

    IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

    (2017)