Cluster-based adaptive SVM: A latent subdomains discovery method for domain adaptation problems

https://doi.org/10.1016/j.cviu.2017.06.002

Highlights

  • We propose a new latent subdomain discovery method for domain adaptation.

  • We call this method CA-SVM; it is built on a new approach that performs clustering and adaptation simultaneously.

  • CA-SVM uses a linear SVM model for classification in the source and target domains.

  • CA-SVM does not need source samples to divide the target domain into subdomains.

Abstract

Machine learning algorithms often generalize poorly to test domains, especially when the training (source) and test (target) domains do not have similar distributions. To address this problem, several domain adaptation techniques have been proposed to recover the accuracy that learning algorithms lose under domain shift. In this paper, we focus on non-homogeneously distributed target domains and propose a new latent subdomain discovery model that divides the target domain into subdomains while adapting them. Applying adaptation to subdomains is expected to increase the detection rate compared with treating the target domain as a single domain. The proposed division method treats each subdomain as a cluster that has a definite ratio of positive to negative samples, is linearly discriminable, and has a conditional distribution similar to that of the source domain. The method divides the target domain into subdomains while adapting the trained target classifier for each subdomain using the Adapt-SVM adaptation method. It also offers a simple solution for selecting an appropriate number of subdomains. We call the proposed method Cluster-based Adaptive SVM, or CA-SVM for short. We test CA-SVM on two computer vision problems: pedestrian detection and image classification. The experimental results show that our approach achieves higher accuracy than several baselines.

Introduction

In many real-world applications, the distribution of the training data differs from that of the test data. This distribution shift is mainly caused by differences in factors such as illumination, viewpoint, background, and resolution. The shift between two domains is inevitable and arises from the data collection process. Many studies have demonstrated that a shift between the training and test distributions degrades accuracy (Farhadi and Tabrizi, 2008, Saenko et al., 2010, Torralba and Efros, 2011).

Domain Adaptation (DA) is a branch of machine learning proposed to address domain shift problems (Gopalan et al., 2011, Hoffman et al., 2013, Mirrashed and Rastegari, 2013, Pan et al., 2008, Pan et al., 2011). When the training (source) and test (target) domains have different but related distributions, DA adapts between the source and target domains to increase detection accuracy in the target domain. To meet this challenge, a variety of methods have been proposed based on different assumptions about the source and target domains. One branch of adaptation studies focuses on problems with access to a sufficient number of labeled source samples and a small number of labeled target samples (Hoffman et al., 2013, Aytar and Zisserman, 2011, Jiang et al., 2008, Xu et al., 2014b, Xu et al., 2014a, Duan et al., 2012b). In this case, adaptation is performed by finding a common space in which the source and target samples have similar distributions and can be used to train an appropriate classifier (Saenko et al., 2010, Donahue et al., 2013, Gong et al., 2012, Gopalan et al., 2011, Hoffman et al., 2013, Mirrashed and Rastegari, 2013, Pan et al., 2008, Pan et al., 2011, Yeh et al., 2014). The common space is found from information obtained from the available source and target samples. This approach is more accurate than other approaches and can define adaptation more precisely. However, its time and memory complexity is high, which makes it inapplicable to some problems (especially detection problems with a huge number of source samples).

Alongside this group, some adaptation methods use the trained source model and a few available target samples for adaptation (Aytar and Zisserman, 2011, Jiang et al., 2008, Mozafari and Jamzad, 2014, Xu et al., 2014b, Xu et al., 2014a, Yang et al., 2007b, Yang et al., 2007a). These methods do not have access to the source samples; instead, they use the trained source model as the means of adaptation. The variety of methods in this branch is limited, and most of them apply adaptation in the original space of the source and target domains, because a source model alone carries too little information about the source domain to find a common space. Nevertheless, these model-transferring methods are important because of their low time and memory complexity (Mozafari and Jamzad, 2016), which makes them well suited to real-time applications.
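The model-transferring idea described above can be illustrated with a minimal sketch. It assumes the Adapt-SVM objective as it is commonly stated for Yang et al. (2007a), restricted to a linear, bias-free model: the adapted classifier is f(x) = (w_src + dw) · x, and the perturbation dw minimizes 0.5·||dw||² + C·Σ max(0, 1 − y_i·f(x_i)) over the few labeled target samples. The function names, the hyperparameters `C`, `lr`, and `epochs`, and the use of plain subgradient descent in place of a quadratic programming solver are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np


def adapt_svm_linear(w_src, X, y, C=1.0, lr=0.01, epochs=200):
    """Adapt a linear source model to a few labeled target samples.

    w_src : (d,) weights of the trained source classifier (no source samples needed)
    X     : (n, d) labeled target samples
    y     : (n,) target labels in {-1, +1}
    Returns the adapted weights w_src + dw.
    """
    dw = np.zeros_like(w_src, dtype=float)
    for _ in range(epochs):
        # Margins of the current adapted model on the target samples.
        margins = y * (X @ (w_src + dw))
        violated = margins < 1.0
        # Subgradient of 0.5*||dw||^2 + C * sum(hinge):
        # the first term keeps the adapted model close to the source model,
        # the second pushes margin-violating target samples to the correct side.
        grad = dw - C * (y[violated, None] * X[violated]).sum(axis=0)
        dw -= lr * grad
    return w_src + dw


def accuracy(w, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    return float(np.mean(np.sign(X @ w) == y))
```

Calling `adapt_svm_linear(w_src, X_target, y_target)` with a handful of labeled target samples returns a weight vector that can be evaluated with `accuracy`. Only the source weights enter the procedure, which is why methods of this kind have low memory cost compared with approaches that need the source samples.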

Most DA methods treat the source and target domains as two homogeneously distributed domains and apply adaptation between them. Sometimes, however, the source, the target, or both are composed of subdomains with different distributions. A good example of a non-homogeneous domain is one built from images collected through a search engine: because a search engine returns data from different sources, the retrieved images have different distributions. Fig. 1 shows the top 13 images returned by searching for the word “cat” in the Google search engine. The returned images can be divided into three subdomains according to their differences in viewpoint, background, and illumination.

To perform adaptation for distributionally non-homogeneous domains, the first step is to divide the source and target domains into distributionally homogeneous subdomains and then apply adaptation between each pair of related subdomains. Subdomain division is therefore the main concern in these cases. Several methods (Duan et al., 2012b, Hoffman et al., 2012) have been proposed for dividing the source and target domains into distributionally related subdomains. What they have in common is their dependence on the available source and target samples for dividing the domains into subdomains appropriate for adaptation.

The focus of this paper is a new adaptation method for non-homogeneous target domains that must be adapted using a source model trained as a linear SVM. We adopt the model-transferring approach for adaptation and propose a new clustering method for subdomain division, which we call Cluster-based Adaptive SVM, or CA-SVM for short. CA-SVM uses this method to divide the target domain into homogeneous target subdomains. Within each cluster, the proposed clustering method tries to maintain the balance of labeled samples, the linear discriminability of the samples, and the similarity of the subdomain's distribution to the source domain. CA-SVM divides the target domain and simultaneously applies adaptation to each subdomain using the Adapt-SVM method (Yang et al., 2007a). We apply CA-SVM to two test beds: pedestrian detection and image classification. We highlight the role of separating the subdomains in improving adaptation, comparing our method with several baselines that treat the target domain as a single domain and with baselines that divide the target domain into subdomains.

The rest of the paper is organized as follows. In Section 2, we review related work, focusing mainly on SVM-based adaptation methods. Section 3 gives a short introduction to SVM-based model-transferring methods, followed by the details of the proposed CA-SVM algorithm and its analysis. Experimental results on real and synthetic data are presented in Section 4, and Section 5 concludes the paper and discusses future work.

Section snippets

Related works

Recently, a wide range of methods have been proposed in the field of domain adaptation. Comprehensive overviews are given in (Shao et al., 2014, Jiang, 2008). DA has been studied in the literature mainly from a theoretical perspective (Blitzer et al., 2011, Kuzborskij and Orabona, 2013, Ben David et al., 2007, Kuzborskij and Orabona, 2016) and from an application perspective (Donahue et al., 2013, Gopalan et al., 2011, Hoffman et al., 2013, Mirrashed and Rastegari, 2013, Mozafari and Jamzad, 2014, Xu et al., 2014a, Sun et

Our proposed adaptation method: cluster-based adaptive SVM (CA-SVM)

In this section, we first introduce the notation and definitions used in this paper. We then describe the general framework of CA-SVM and discuss its steps in detail.

Experimental result

In this section, we evaluate CA-SVM against several baselines and on different datasets. We divide this section into three main subsections. In the first, we focus on the clustering behavior of CA-SVM and show its ability to find appropriate subdomains on synthetic data. In the second, we discuss the results of CA-SVM on the pedestrian detection dataset. We investigate the influence of the constraints defined in the objective function of CA-SVM in improving the

Conclusion

In this paper, we introduced a new latent subdomain discovery method, which discovers adaptable clusters inside the target domain and simultaneously adapts them using Adapt-SVM. In contrast to well-known latent subdomain discovery methods such as Reshaping, it does not need the source samples to find appropriate subdomains in the target domain. It uses the trained source classifier to find adaptable clusters inside the target domain. The key idea of CA-SVM is based on a new clustering approach for

References (59)

  • J. Donahue. Semi-supervised domain adaptation with instance constraints.
  • L. Duan. Visual event recognition in videos by learning from web data. IEEE Trans. Pattern Anal. Mach. Intell. (2012)
  • L. Duan, D. Xu, and I. Tsang. Learning with augmented features for heterogeneous domain adaptation. arXiv preprint...
  • L. Duan et al. Domain adaptation from multiple sources: a domain-dependent regularization approach. IEEE Trans. Neural Netw. Learn. Syst. (2012)
  • M. Enzweiler et al. Monocular pedestrian detection: survey and experiments. IEEE Trans. Pattern Anal. Mach. Intell. (2009)
  • A. Farhadi et al. Learning to recognize activities from the wrong view point. Computer Vision–ECCV 2008 (2008)
  • B. Fernando. Unsupervised visual domain adaptation using subspace alignment.
  • B. Fernando et al. Subspace alignment for domain adaptation. arXiv preprint arXiv:1409.5241,...
  • D. Gerónimo. Adaptive image sampling and windows classification for on-board pedestrian detection.
  • R. Girshick. Rich feature hierarchies for accurate object detection and semantic segmentation.
  • B. Gong. Geodesic flow kernel for unsupervised domain adaptation.
  • B. Gong et al. Reshaping visual datasets for domain adaptation. Adv. Neural Inf. Process. Syst. (2013)
  • R. Gopalan et al. Domain adaptation for object recognition: an unsupervised approach.
  • G. Griffin et al. Caltech-256 Object Category Dataset. (2007)
  • D.J. Hand et al. A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach. Learn. (2001)
  • J. Hoffman. Discovering latent domains for multisource domain adaptation. Computer Vision–ECCV 2012 (2012)
  • J. Hoffman et al. One-shot adaptation of supervised deep convolutional models. arXiv preprint arXiv:1312.6204,...
  • J. Hoffman et al. Efficient learning of domain-invariant image representations. arXiv preprint arXiv:1301.3224,...
  • J. Hoffman. LSDA: large scale detection through adaptation. Adv. Neural Inf. Process. Syst. (2014)