Supervised transfer kernel sparse coding for image classification☆
Introduction
Sparse coding has been successfully applied to image classification problems [1], [12], [20], [21]. Sparse coding yields sparse representations in which a sample is expressed as a linear combination of a few dictionary elements. With such representations, images can be readily interpreted and classified. Recently, sparse coding has been combined with kernel methods to handle the nonlinear structure of the data [6], [18]. These approaches extend sparse coding to nonlinear sparse coding via the kernel trick, yielding more discriminative sparse representations. In view of these merits, we likewise map source and target data into a high-dimensional feature space through an implicit mapping function and then learn the sparse representations of the data in the kernel space.
However, labeled images are often scarce, and labeling new images is time consuming. We can therefore use a related dataset, the source domain, to aid image classification tasks in the target domain [1], [13], [15]. When the feature distribution of the source domain differs greatly from that of the target domain, directly applying the classifiers or object models trained on the source domain to the target domain performs poorly. Transfer learning has been widely studied to solve this problem and produces excellent results. A survey on transfer learning is presented in [14].
Many researchers have focused on the use of sparse coding in transfer learning. In [1], [12], source and target images are combined to learn a shared dictionary and sparse representations for all the images. Both methods adopt a nonparametric distance measure to reduce the distribution divergence between source and target data, and the new representations prove robust for classification tasks. However, in [1], the source data are also used as test data, whereas in real applications the source data are fully labeled and the task is to classify the target data. In [12], the limited labeled target data are not exploited by the algorithm, and using only the labeled source data is not enough for the target classification tasks. Shekhar et al. proposed learning a shared discriminative dictionary for source and target data in a low-dimensional space [17].
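A common choice for such a nonparametric distance measure is the maximum mean discrepancy (MMD), which our conclusions also mention. As an illustration only (the kernelized form used in the cited works is not reproduced in this excerpt), a minimal empirical MMD between sample means can be sketched as:

```python
import numpy as np

def mmd_squared(Xs, Xt):
    """Empirical MMD^2 with a linear kernel: squared distance between the
    sample means of source and target data (columns are samples)."""
    d = Xs.mean(axis=1) - Xt.mean(axis=1)
    return d @ d

Xs = np.array([[0.0, 2.0],
               [0.0, 0.0]])   # source samples, mean = [1, 0]
Xt = np.array([[4.0, 6.0],
               [0.0, 0.0]])   # target samples, mean = [5, 0]
gap = mmd_squared(Xs, Xt)     # (1 - 5)^2 + 0^2 = 16
```

Minimizing such a term over the learned representations pulls the two domain distributions together.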
However, these methods seek the sparse representations of all samples in the original feature space or a low-dimensional space, which is inadequate for representing image features. Image features are not linearly separable in the original feature space, where highly nonlinear similarity functions are used to compare them: for example, the pyramid match kernel for spatial pyramid descriptors and the geodesic distance for region covariance descriptors. Sparse coding, by contrast, represents the data as a linear combination of a few dictionary elements, and directly applying this linear model to nonlinear image features leads to poor performance. We overcome this problem by mapping the samples into a high-dimensional feature space via the kernel trick, converting data that are linearly inseparable in the original feature space into linearly separable data, so that sparse coding theory applies; we then obtain more separable sparse representations with a coupled dictionary. Because the computational complexity in the high-dimensional feature space is prohibitive, we use a kernel function to transform the objective into a feasible problem.
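To make the kernel-space formulation concrete: a standard device in kernel sparse coding (assumed here for illustration, not taken verbatim from our derivation) is to express the dictionary atoms as linear combinations of the mapped samples, D = Φ(Y)A, so that the reconstruction error expands entirely in terms of the kernel matrix K = Φ(Y)′Φ(Y). A minimal NumPy sketch, where `gaussian_kernel_matrix` and the coefficient matrix `A` are illustrative assumptions:

```python
import numpy as np

def gaussian_kernel_matrix(A, B, gamma=0.5):
    """K[i, j] = exp(-gamma * ||a_i - b_j||^2) for columns a_i of A, b_j of B."""
    sq = (np.sum(A ** 2, axis=0)[:, None]
          + np.sum(B ** 2, axis=0)[None, :]
          - 2.0 * A.T @ B)
    return np.exp(-gamma * sq)

def kernel_recon_error(K, i, A, s):
    """||phi(y_i) - Phi(Y) A s||^2 expanded via the kernel matrix K:
    K[i, i] - 2 * K[i, :] (A s) + (A s)' K (A s) -- no explicit phi needed."""
    v = A @ s
    return K[i, i] - 2.0 * K[i, :] @ v + v @ K @ v

rng = np.random.default_rng(1)
Y = rng.standard_normal((5, 8))          # 8 samples in R^5, one per column
K = gaussian_kernel_matrix(Y, Y)
A = np.eye(8)                            # trivial atoms: the mapped samples
s = np.zeros(8)
s[2] = 1.0                               # code that selects atom 2 only
err = kernel_recon_error(K, 2, A, s)     # phi(y_2) reconstructed by itself
```

The expansion shows why only kernel evaluations, never the explicit high-dimensional map, are needed at training time.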
Another novelty of our method is incorporating label information into the objective via the sparse codes of the source and target data. Specifically, we maximize the correlation between the sparse codes of source and target samples of the same class. This supervision yields more discriminative sparse representations that benefit the classification tasks. Several experiments demonstrate that our method achieves very good performance.
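As a hypothetical illustration of this supervision term (the exact objective appears in Section 3), the same-class correlation between source and target sparse codes can be scored with an indicator matrix:

```python
import numpy as np

def class_correlation(Xs, Xt, ys, yt):
    """Sum of inner products between source and target sparse codes that
    share a class label: sum_{ij} [ys_i == yt_j] * <x_i^s, x_j^t>."""
    W = (ys[:, None] == yt[None, :]).astype(float)  # same-class indicator
    return np.trace(Xs @ W @ Xt.T)

ys = np.array([0, 1])
yt = np.array([0, 1])
Xs = np.array([[1.0, 0.0],
               [0.0, 1.0]])          # source codes, one per column
Xt_aligned = np.array([[1.0, 0.0],
                       [0.0, 1.0]])  # target codes match same-class source codes
Xt_swapped = np.array([[0.0, 1.0],
                       [1.0, 0.0]])  # target codes correlate with the wrong class
hi = class_correlation(Xs, Xt_aligned, ys, yt)
lo = class_correlation(Xs, Xt_swapped, ys, yt)
```

Maximizing such a score pushes same-class samples from both domains toward similar sparse codes.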
This paper is organized as follows. Section 2 provides a brief review of related work. In Section 3, we introduce our novel Supervised Transfer Kernel Sparse Coding (STKSC) algorithm. In Section 4, the optimization of STKSC is presented. Extensive experiments are conducted in Section 5. The conclusions and future work are presented in Section 6.
Section snippets
Related work
Sparse coding has received growing attention because of its advantages and promising performance in many computer vision applications [1], [12], [22]. Researchers have developed many algorithms to obtain sparse and efficient representations of images [18], [23]. Given input samples $Y=[y_1,\dots,y_n]$, the sparse coding matrix $X$ and the dictionary $D$ can be trained by solving
$$\min_{D,X}\;\|Y-DX\|_F^2+\lambda\sum_{i=1}^{n}\|x_i\|_1,$$
where $x_i$ is a column of $X$. The Frobenius norm is $\|Q\|_F=\big(\sum_{i,j}Q_{ij}^2\big)^{1/2}$.
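For a fixed dictionary, the objective above can be minimized with the iterative shrinkage-thresholding algorithm (ISTA); the following self-contained sketch (a standard solver, not necessarily the one used in the cited works) recovers a synthetic sparse code:

```python
import numpy as np

def sparse_encode_ista(Y, D, lam=0.1, n_iter=200):
    """Minimize ||Y - D X||_F^2 + lam * sum_i ||x_i||_1 over X with ISTA."""
    L = 2.0 * np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the gradient
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ X - Y)       # gradient of the quadratic term
        Z = X - grad / L                      # gradient step
        X = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)  # soft threshold
    return X

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)               # unit-norm dictionary atoms
x_true = np.zeros((50, 1))
x_true[[3, 17, 41]] = [[1.0], [-2.0], [0.5]] # a 3-sparse ground-truth code
Y = D @ x_true
X = sparse_encode_ista(Y, D, lam=0.01, n_iter=500)
err = np.linalg.norm(Y - D @ X)
```

In full dictionary learning, this encoding step alternates with an update of $D$ under unit-norm constraints on the atoms.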
Supervised transfer kernel sparse coding
Define the source data $Y_s=[y_1^s,\dots,y_{n_s}^s]$ with labels $L_s$, where $n_s$ is the number of source samples. We define the target labeled samples $Y_{tl}$ with labels $L_{tl}$ and the target unlabeled samples $Y_{tu}$, respectively. All the target samples are $Y_t=[Y_{tl},Y_{tu}]$, where $n_t$ is the number of target samples. $Q'$ denotes the transpose of a matrix $Q$ throughout this paper.
Suppose source and
Optimization of STKSC
Given the dictionary D, the objective function can be rewritten in terms of the sparse codes. Since x_i is independent of the other items in X, we update x_i while keeping the other vectors fixed.
We can use any kernel function that satisfies Mercer's condition [2], such as the Gaussian kernel or the polynomial kernel, to compute ϕ(y)′ϕ(y). Putting these kernels
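Mercer's condition requires the kernel Gram matrix to be symmetric positive semidefinite; a quick numerical check for the Gaussian and polynomial kernels (with illustrative parameter values) can be sketched as:

```python
import numpy as np

def is_mercer_psd(K, tol=1e-8):
    """A Mercer kernel's Gram matrix must be symmetric positive semidefinite;
    check symmetry and the smallest eigenvalue (up to numerical tolerance)."""
    return bool(np.allclose(K, K.T) and np.linalg.eigvalsh(K).min() >= -tol)

rng = np.random.default_rng(2)
Y = rng.standard_normal((3, 6))                          # 6 samples in R^3
sq = ((Y[:, :, None] - Y[:, None, :]) ** 2).sum(axis=0)  # pairwise squared distances
K_gauss = np.exp(-0.5 * sq)                              # Gaussian kernel, gamma = 0.5
K_poly = (Y.T @ Y + 1.0) ** 2                            # polynomial kernel, degree 2
```

Any kernel passing this check induces a valid implicit feature map ϕ, so the derivation applies unchanged.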
Experiments and analysis
In this section, we perform experiments on publicly available benchmark datasets for cross-domain image classification tasks (see Table 2). It can be observed that our method outperforms the state-of-the-art methods in most cases.
Conclusions and future work
Learning discriminative sparse representations is of great importance for cross-domain image classification problems. In this paper, we use source and target data to learn a shared dictionary and sparse representations of all the data by minimizing the reconstruction error of kernel sparse coding. Moreover, the supervision information is exploited by maximizing the correlation between the sparse representations of source and target data of the same class. We also incorporate the MMD and graph
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant Nos. 61472305, 61070143, 61303034), the Science and Technology Project of Shaanxi Province, China (Grant No. 2015GY027), and the Fundamental Research Funds for the Central Universities (Grant No. SMC1405).
References (23)
- et al., Multi-source transfer learning based on label shared subspace, Pattern Recognit. Lett. (2015)
- et al., Principal component analysis, Chemom. Intell. Lab. Syst. (1987)
- et al., Semi-supervised action recognition in video via labeled kernel sparse coding and sparse ℓ1 graph, Pattern Recognit. Lett. (2012)
- et al., Supervised transfer sparse coding, Twenty-Eighth AAAI Conference on Artificial Intelligence (2014)
- et al., Support-vector networks, Mach. Learn. (1995)
- et al., The PASCAL Visual Object Classes (VOC) challenge, Int. J. Comput. Vis. (2010)
- et al., LIBLINEAR: a library for large linear classification, J. Mach. Learn. Res. (2008)
- et al., Sparse representation with kernels, IEEE Trans. Image Process. (2013)
- G. Griffin, A. Holub, P. Perona, Caltech-256 object category dataset, Technical Report 7694, California Institute of...
- A database for handwritten text recognition research, IEEE Trans. Pattern Anal. Mach. Intell. (1994)
- Beyond bags of features: spatial pyramid matching for recognizing natural scene categories, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition
☆ This paper has been recommended for acceptance by Dr. S. Wang.