
1 Introduction

Image categorization has been a hot topic in the computer vision and pattern recognition community. Researchers have made significant progress in this domain by deploying semi-supervised learning (SSL) paradigms [2, 7, 8, 15].

Unlike unsupervised and supervised learning, SSL exploits both labeled and unlabeled data samples when estimating models. However, SSL may face difficulties, especially in cases where labels are very scarce. Therefore, one interesting approach is to increase the size of the labeled data by invoking active learning paradigms (e.g., [1, 11, 18]). One main goal of active learning is to generate more labeled samples by predicting the labels of unlabeled samples and exploiting them to build new models and classifiers. The main problems that these paradigms solve are: (i) identifying the most relevant unlabeled samples whose labels the system should predict first, and (ii) retaining only confident predictions. Usually, the proposed solutions rely on the concept of confidence in prediction and classification. For instance, if the confidence of a label prediction is insufficient for a specific data sample (i.e., the predicted label has high uncertainty), then the corresponding sample will not be exploited by the final model; at most, it will be used as an unlabeled sample since its estimated label is uncertain. In addition to the uncertainty and confidence concepts, some methods proposed other criteria. In order to avoid having many labeled samples in the same cluster, Nguyen et al. [21] exploit the diversity concept by deploying a pre-clustering method. In [6], the authors proposed an active cluster-based sampling method; however, since this approach employs a hierarchical clustering of unlabeled samples, the final performance can be affected by the performance of the clustering process itself. In [17], the authors introduced an active probabilistic variant of the K-NN classifier that can be used for multi-class problems. In [14], the authors proposed an approach based on the informativeness and representativeness of unlabeled samples.

Besides active learning paradigms, sparse representation has brought significant advances to the pattern recognition field [5], owing to its capacity to acquire, represent, and compress knowledge of the domain, and thus to reconstruct the data with minimal loss [26]. The Sparse Representation based Classifier (SRC) [27] can be thought of as a generalization of the Nearest Neighbor (NN) classifier and the Nearest Feature Subspace (NFS) [16]. Unlike the NN and NFS classifiers, SRC can be more robust in the presence of deviations and occlusions [25]. Although SRC performs well, it has a high computational cost since it relies on \(\ell _1\) minimization in the coding process; SRC is therefore not practical for real-world problems requiring fast decision and classification. Thus, many researchers exploited data locality [9]. For instance, the work of [19] limited the sparse coding dictionary to the nearest neighbors only. Xu et al. [28] proposed a Two Phase Test Sample Sparse Representation (TPTSSR) approach in which the regularization is given by the \(\ell _2\) norm. This method has two phases. In the first phase, the testing sample is represented as a linear combination of all training samples; the first M samples that provide its best representation are then chosen to be the atoms of a new, compact dictionary. In the second phase, the testing sample is coded using the new dictionary of M samples, and the label of the test sample is decided based on this representation. The Collaborative Representation Classifier (CRC) is the classifier that uses only the first phase of TPTSSR.

This paper is organized as follows: Sect. 2 presents our Active Two Phase Collaborative Representation classifier (ATPCRC). The experimental results and method comparisons are presented in Sect. 3. Section 4 concludes the paper. In the paper, capital bold letters denote matrices and bold letters denote vectors.

2 Active Two Phase Collaborative Representation Classifier

Proposed Method. In this section, we propose the Active Two Phase Collaborative Representation Classifier (ATPCRC). While our proposed ATPCRC makes the TPTSSR classifier active, it is also able to make any collaborative representation-based classifier (e.g., CRC and SRC) active. Our proposed ATPCRC aims to construct a classifier that exploits both labeled and unlabeled samples. Let \(\mathbf{x}_1, \mathbf{x}_2, \ldots , \mathbf{x}_L\) denote the labeled data samples and \(\mathbf{x}_{L+1}, \mathbf{x}_{L+2}, \ldots , \mathbf{x}_N\) denote the unlabeled data samples. The matrix of labeled samples is denoted by \(\mathbf{X}_l = [\mathbf{x}_1, \mathbf{x}_2, \ldots , \mathbf{x}_L] \in \mathbb {R}^{D \times L}\) and the matrix of unlabeled samples is denoted by \(\mathbf{X}_u = [\mathbf{x}_{L+1}, \mathbf{x}_{L+2}, \ldots , \mathbf{x}_N] \in \mathbb {R}^{D \times U}\) where L and \(U = N - L\) are the numbers of labeled and unlabeled samples, respectively. The training data are defined by the matrix \(\mathbf{X}= [\mathbf{x}_1, \mathbf{x}_2, \ldots , \mathbf{x}_N] \in \mathbb {R}^{D \times N}\).

Using active learning strategies, we first estimate the labels of all unlabeled samples, \(\mathbf{X}_u\). We then use both the originally labeled data and the samples with predicted labels, i.e., \(\mathbf{X}\), to build a new classifier. We recall that TPTSSR is a lazy classifier in the sense that all of its computation stages run at testing time. Any classifier can be invoked to predict the labels of the unlabeled samples; in our work, we employ the TPTSSR classifier with the original set of labeled samples. Once this stage is achieved, every sample in the training data matrix \(\mathbf{X}\) has either an original label or a predicted one. To classify a testing sample with the proposed ATPCRC, we proceed as follows. Two coding schemes are carried out independently, each with two phases of coding as in TPTSSR. The first coding scheme uses the labeled data \(\mathbf{X}_l\); the second uses the whole training data matrix \(\mathbf{X}\). To infer the class of a testing sample, a fusion of the class-wise reconstruction errors is exploited. Let \(M_l\) and M denote the parameters of the two coding processes.

First Phase. In the first phase, the testing sample \(\mathbf{y}\in \mathbb {R}^{D}\) will have two representations or codes: the first code vector is computed from a linear combination of the labeled samples \(\mathbf{X}_l\), and the second code results from a linear combination of the whole training data \(\mathbf{X}\). These two codes of \(\mathbf{y}\) are given by:

$$\begin{aligned} \mathbf{y}= a^l_1 \, \mathbf{x}_1 \, + \, a^l_2 \, \mathbf{x}_2 \, + \ldots + \, a^l_L \, \mathbf{x}_L \end{aligned}$$
(1)
$$\begin{aligned} \mathbf{y}= a_1 \, \mathbf{x}_1 \, + \, a_2 \, \mathbf{x}_2 \, + \ldots + \, a_L \, \mathbf{x}_L + \ldots + \, a_N \, \mathbf{x}_N \end{aligned}$$
(2)

Equations (1) and (2) can be rewritten in matrix form as follows:

$$ \mathbf{y}= \mathbf{X}_l \, \mathbf{a}^l \qquad \text{ and } \qquad \mathbf{y}= \mathbf{X}\, \mathbf{a} $$

where \(\mathbf{a}^l = [ a^l_1, a^l_2, \ldots , a^l_L]^T\) and \(\mathbf{a}= [ a_1, a_2, \ldots , a_N]^T\). The unknown code vectors \(\mathbf{a}^l\) and \(\mathbf{a}\) are recovered using \(\ell _2\) regularization. These two vectors are solutions to the following optimization problems, respectively:

$$\begin{aligned} \mathbf{a}^{l\star } &= \arg \min _{\mathbf{a}^l} \Vert \mathbf{y}- \mathbf{X}_l \, \mathbf{a}^l \Vert ^2 + \lambda _l \, \Vert \mathbf{a}^l \Vert ^2 \\ \mathbf{a}^{\star } &= \arg \min _{\mathbf{a}} \Vert \mathbf{y}- \mathbf{X}\, \mathbf{a}\Vert ^2 + \lambda \, \Vert \mathbf{a}\Vert ^2 \end{aligned}$$

where \(\lambda _l\) and \(\lambda \) are two regularization parameters. The solutions for \(\mathbf{a}^l\) and \(\mathbf{a}\) are provided by:

$$\begin{aligned} \mathbf{a}^{l\star } &= (\mathbf{X}_l^{T} \,\mathbf{X}_l +\lambda _l \, \mathbf{I}_l)^{-1} \, \mathbf{X}_l^{T} \, \mathbf{y} \\ \mathbf{a}^{\star } &= (\mathbf{X}^{T} \,\mathbf{X}+\lambda \, \mathbf{I})^{-1} \, \mathbf{X}^{T} \, \mathbf{y} \end{aligned}$$
(3)

where \(\mathbf{I}\) and \(\mathbf{I}_l\) are identity matrices of appropriate size. From Eqs. (1) and (2), one can see that each data sample \(\mathbf{x}_i\) has its own contribution to the reconstruction of the test sample \(\mathbf{y}\): from Eq. (1) the contribution of \(\mathbf{x}_i\) is \(a^l_i \mathbf{x}_i\), and from Eq. (2) it is \(a_i \mathbf{x}_i\). Therefore, \(\mathbf{x}_i\) has a large contribution in Eq. (1) if \(\Vert \mathbf{y}- a^l_i \, \mathbf{x}_i \Vert ^2\) is small, and a large contribution in Eq. (2) if \(\Vert \mathbf{y}- a_i \, \mathbf{x}_i \Vert ^2\) is small. Thus, the \(M_l\) samples (\(1 \le M_l \le L\)) with the largest contributions when approximating \(\mathbf{y}\) in Eq. (1) and the M samples (\(1 \le M \le N\)) with the largest contributions when approximating \(\mathbf{y}\) in Eq. (2) are handed over to the second phase of coding. The two subsets of selected samples are denoted by \(\{ \widetilde{\mathbf{x}}^l_1, \widetilde{\mathbf{x}}^l_2, \ldots , \widetilde{\mathbf{x}}^l_{M_l} \}\) and \( \{ \widetilde{\mathbf{x}}_1, \widetilde{\mathbf{x}}_2, \ldots , \widetilde{\mathbf{x}}_{M} \}\). In matrix form, these two dictionaries are given by \(\widetilde{\mathbf{X}_l} = [ \widetilde{\mathbf{x}}^l_1, \widetilde{\mathbf{x}}^l_2, \ldots , \widetilde{\mathbf{x}}^l_{M_l}]\) and \(\widetilde{\mathbf{X}} = [ \widetilde{\mathbf{x}}_1, \widetilde{\mathbf{x}}_2, \ldots , \widetilde{\mathbf{x}}_{M}]\).
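To make the first phase concrete, here is a minimal NumPy sketch of the ridge coding of Eq. (3) and the contribution-based selection. The helper names ridge_code and select_top_contributors are ours, not from the paper, and the columns of X are assumed to hold the samples.

```python
import numpy as np

def ridge_code(X, y, lam):
    """Closed-form ridge code a* = (X^T X + lam*I)^{-1} X^T y, as in Eq. (3)."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def select_top_contributors(X, y, a, M):
    """Keep the M columns whose individual approximations a_i * x_i leave
    the smallest residuals ||y - a_i x_i||^2, i.e. the largest contributions."""
    residuals = np.array([np.linalg.norm(y - a[i] * X[:, i]) ** 2
                          for i in range(X.shape[1])])
    idx = np.argsort(residuals)[:M]   # indices of the M smallest residuals
    return X[:, idx], idx
```

Returning idx as well lets the caller recover the (original or predicted) labels of the selected samples, which the second phase needs.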

Second Phase. In the second phase, the testing sample \(\mathbf{y}\) is represented by two code vectors: the first one is a linear combination of the selected \(M_l\) labeled samples and the second one is a linear combination of the selected M training samples. This can be written as:

$$ \mathbf{y}= \widetilde{\mathbf{X}_l} \, \mathbf{b}^l \qquad \text{ and } \qquad \mathbf{y}= \widetilde{\mathbf{X}} \, \mathbf{b} $$

where \(\mathbf{b}^l\) and \(\mathbf{b}\) denote the second-phase code vectors. Similarly to the first phase (cf. Eq. (3)), the unknown vectors \(\mathbf{b}^l\) and \(\mathbf{b}\) are provided by:

$$\begin{aligned} \mathbf{b}^{l\star } &= (\widetilde{\mathbf{X}_l}^{T} \, \widetilde{\mathbf{X}_l} + \gamma _l\, \mathbf{I}_l)^{-1} \, \widetilde{\mathbf{X}_l}^{T} \, \mathbf{y} \\ \mathbf{b}^{\star } &= (\widetilde{\mathbf{X}}^{T} \, \widetilde{\mathbf{X}} + \gamma \, \mathbf{I})^{-1} \, \widetilde{\mathbf{X}}^{T} \, \mathbf{y} \end{aligned}$$
(4)

where \(\gamma \) and \(\gamma _l\) are two regularization parameters.
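Assuming the ridge_code helper sketched earlier is in scope, the second-phase codes are obtained by the same closed form applied to the two compact dictionaries:

```python
# Second-phase codes of Eq. (4) on the compact dictionaries;
# Xl_tilde (D x M_l) and X_tilde (D x M) come from the first phase,
# gamma_l and gamma are the regularization parameters of Eq. (4).
b_l = ridge_code(Xl_tilde, y, gamma_l)
b   = ridge_code(X_tilde,  y, gamma)
```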

Suppose that, among the \(M_l\) selected labeled samples, \(t_l\) samples belong to the \(c^{th}\) class: \((\widetilde{\mathbf{x}}^l_1)^c, (\widetilde{\mathbf{x}}^l_2)^c, \ldots , (\widetilde{\mathbf{x}}^l_{t_l})^c\), with corresponding coefficients \((b^l_1)^{c}, (b^l_2)^{c}, \ldots , (b^l_{t_l})^{c}\). Suppose that, among the M selected training samples, t samples belong to the \(c^{th}\) class (or are predicted to be in this class): \((\widetilde{\mathbf{x}}_1)^c, (\widetilde{\mathbf{x}}_2)^c, \ldots , (\widetilde{\mathbf{x}}_{t})^c\), with corresponding coefficients \((b_1)^{c}, (b_2)^{c}, \ldots , (b_{t})^{c}\). We can define the reconstruction error associated with class c, Dev(c), by:

$$\begin{aligned} Dev(c) = \eta \, \bigg \Vert \mathbf{y}- \sum _{j=1}^{t_l} \, (\widetilde{\mathbf{x}}^l_j)^c \, (b^l_j)^c \bigg \Vert ^2 + (1-\eta ) \, \bigg \Vert \mathbf{y}- \sum _{j=1}^{t} \, (\widetilde{\mathbf{x}}_j)^c \, (b_j)^{c} \bigg \Vert ^2 \end{aligned}$$
(5)

where \(\eta \) is a balance parameter (\(0 \le \eta \le 1\)). The proposed residual fuses the collaborative contributions of the selected samples of the \(c^{th}\) class in representing the testing sample \(\mathbf{y}\) by both \(\mathbf{X}_l\) and \(\mathbf{X}\). A large contribution corresponds to a small residual. Therefore, the label of \(\mathbf{y}\) is estimated by:

$$ l(\mathbf{y}) = \arg \min _{c} \, Dev (c), \qquad 1 \le c \le C $$

where C is the number of classes and \(l(\mathbf{y})\) is the predicted class label of the testing sample \(\mathbf{y}\). This is the output of the ATPCRC. By using this merging rule, we are able to down-weight the residuals associated with the samples in \(\mathbf{X}\), since their labels are not all correct. The introduced class-wise reconstruction errors avoid the use of an ad-hoc sample-based confidence measure. From Eq. (5), we can observe that if \(\eta \) is set to 1, we recover the classic TPTSSR, and if \(\eta \) is set to zero, we get a trivial active variant of TPTSSR. In the sequel, we will show that the proposed ATPCRC can outperform both the classic TPTSSR and the trivial active variant of TPTSSR.
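A sketch of the fused residual of Eq. (5) and the decision rule follows; labels_l and labels (our names, not from the paper) hold the class labels of the columns of the two compact dictionaries, the latter mixing ground-truth and predicted labels.

```python
import numpy as np

def atpcrc_decision(y, Xl_tilde, b_l, labels_l, X_tilde, b, labels,
                    n_classes, eta=0.8):
    """Return arg min_c Dev(c), with Dev(c) the fused class-wise residual."""
    dev = np.empty(n_classes)
    for c in range(n_classes):
        sel_l = (labels_l == c)                 # class-c atoms, labeled dict.
        r_l = np.linalg.norm(y - Xl_tilde[:, sel_l] @ b_l[sel_l]) ** 2
        sel = (labels == c)                     # class-c atoms, full dict.
        r = np.linalg.norm(y - X_tilde[:, sel] @ b[sel]) ** 2
        dev[c] = eta * r_l + (1.0 - eta) * r    # Eq. (5)
    return int(np.argmin(dev))
```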

The Algorithm. The introduced ATPCRC has the following inputs: the labeled data matrix \(\mathbf{X}_l = [\mathbf{x}_1, \mathbf{x}_2, \ldots , \mathbf{x}_L] \in \mathbb {R}^{D \times L}\), the training data matrix \(\mathbf{X}= [\mathbf{x}_1, \mathbf{x}_2, \ldots , \mathbf{x}_N] \in \mathbb {R}^{D \times N}\) (containing both labeled and unlabeled samples), the testing sample \(\mathbf{y}\in \mathbb {R}^D\), and the parameters M and \(M_l\).

  1. Estimate the labels of the samples \(\mathbf{x}_{L+1}, \mathbf{x}_{L+2}, \ldots , \mathbf{x}_N\) using the TPTSSR classifier and the labeled data \(\mathbf{X}_l\); M is the TPTSSR parameter.

  2. Calculate the code vectors \(\mathbf{a}^{\star }\) and \(\mathbf{a}^{l\star }\) using Eq. (3).

  3. Compute the vector \(\mathbf{e}=(e_1,e_2, \ldots , e_N)^T\) where \(e_i = \Vert \mathbf{y}- a_i \, \mathbf{x}_i \Vert ^2\). Sort \(\mathbf{e}\) and choose the samples corresponding to its smallest M elements. These selected samples are denoted \(\widetilde{\mathbf{x}}_1, \widetilde{\mathbf{x}}_2, \ldots , \widetilde{\mathbf{x}}_{M}\). Finally, form the matrix \(\widetilde{\mathbf{X}} = [\widetilde{\mathbf{x}}_1, \widetilde{\mathbf{x}}_2, \ldots , \widetilde{\mathbf{x}}_{M}]\).

  4. Similarly, form the matrix \(\widetilde{\mathbf{X}_l} = [\widetilde{\mathbf{x}}^l_1, \widetilde{\mathbf{x}}^l_2, \ldots , \widetilde{\mathbf{x}}^l_{M_l}]\) using \(e^l_i = \Vert \mathbf{y}- a^l_i \, \mathbf{x}_i \Vert ^2\) instead of \(e_i\) and \(M_l\) instead of M.

  5. Compute the code vectors \(\mathbf{b}^{\star }\) and \(\mathbf{b}^{l\star }\) using Eq. (4).

  6. For every class c (\(1 \le c \le C\)), calculate the global residual Dev(c) defined in Eq. (5).

  7. Assign to \(\mathbf{y}\) the label of the class with the smallest residual.
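Putting the seven steps together, here is a compact end-to-end sketch of the test-time procedure, reusing the ridge_code, select_top_contributors, and atpcrc_decision helpers sketched above; estimate_labels_tptssr is an assumed placeholder for step 1, not an implementation from the paper.

```python
import numpy as np

def atpcrc_classify(y, X_l, labels_l, X_u, M_l, M, n_classes,
                    lam_l=0.01, lam=0.01, gam_l=0.01, gam=0.01, eta=0.8):
    """One test sample y through the seven ATPCRC steps (a sketch)."""
    # Step 1: predict labels of the unlabeled pool with TPTSSR trained on
    # (X_l, labels_l); estimate_labels_tptssr is an assumed placeholder.
    labels_u = estimate_labels_tptssr(X_u, X_l, labels_l, M)
    X = np.hstack([X_l, X_u])
    labels_all = np.concatenate([labels_l, labels_u])
    # Step 2: first-phase ridge codes (Eq. (3))
    a_l = ridge_code(X_l, y, lam_l)
    a = ridge_code(X, y, lam)
    # Steps 3-4: select the compact dictionaries by contribution
    X_tilde, idx = select_top_contributors(X, y, a, M)
    Xl_tilde, idx_l = select_top_contributors(X_l, y, a_l, M_l)
    # Step 5: second-phase ridge codes (Eq. (4))
    b = ridge_code(X_tilde, y, gam)
    b_l = ridge_code(Xl_tilde, y, gam_l)
    # Steps 6-7: fused class-wise residuals (Eq. (5)) and arg-min decision
    return atpcrc_decision(y, Xl_tilde, b_l, labels_l[idx_l],
                           X_tilde, b, labels_all[idx], n_classes, eta)
```

In practice, step 1 depends only on the unlabeled pool, so it should be executed once before classifying any test samples rather than per test sample as written here.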

3 Performance Study

In this section, we compare the performance of the proposed ATPCRC with that of twelve methods: the Nearest Neighbor classifier (NN), Support Vector Machines (SVM) with a polynomial kernel, the Sparse Representation based Classifier (SRC) [27], the Two Phase Test Sample Sparse Representation classifier (TPTSSR) [28], Semi-supervised Discriminant Embedding (SDE) [13], Semi-supervised Discriminant Analysis (SDA) [4], Transductive Component Analysis (TCA) [20], Sparsity Preserving Discriminant Analysis (SPDA) [23], Laplacian Regularized Least Squares (LapRLS) [3], Flexible Manifold Embedding (FME) [22], Kernel Flexible Manifold Embedding (KFME) [12], and Semi-supervised Exponential Discriminant Embedding (ESDE) [10]. The SVM, NN, SRC, and TPTSSR classifiers are supervised methods, while the other competing approaches exploit both labeled and unlabeled samples.

Experimental Setup. The experiments are run on four public image datasets belonging to several categories: one object database (COIL20), one handwritten digit database (USPS), and two face datasets (Extended Yale and Honda).

COIL20: The Columbia Object Image Library (COIL20) contains 1440 images: 20 objects, each represented by 72 images taken at pose intervals of five degrees. In our experiments, we use a subset of 18 images per object (one image for every \(20^{\circ }\) of rotation).

Extended Yale: There are 1774 images depicting 28 persons; each person has 59–64 frontal images.

Honda: We use 1138 face images retrieved from the public Honda Video DataBase (HVDB). These images correspond to 22 persons.

USPS Handwritten Digits: This dataset consists of 11000 images of handwritten digits from “0” to “9” (1100 images per digit). We use one tenth of this database.

Each dataset is randomly split into labeled, unlabeled, and testing samples. In the conducted experiments, we adopt three different partitions of the data, illustrated in Table 1. The labeled and unlabeled parts are used by the methods that exploit both labeled and unlabeled data. The test part is used to evaluate performance.

For each partition, the splitting process is repeated ten times. As a preprocessing step, PCA is applied to every dataset to reduce the dimensionality; we use a PCA that preserves 98% of the variability.
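For illustration, this dimensionality reduction step can be reproduced with scikit-learn, which can select the number of components reaching a target explained-variance ratio; the paper does not specify the PCA implementation it used, so this is only a sketch, and X_raw is our name for the raw data matrix.

```python
from sklearn.decomposition import PCA

# Keep the smallest number of principal components explaining 98% of the
# variance; X_raw is an (n_samples x n_features) data matrix.
pca = PCA(n_components=0.98, svd_solver='full')
X_reduced = pca.fit_transform(X_raw)
print(pca.n_components_)  # number of dimensions actually retained
```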

Table 1. Data partitions for the four image datasets.

Method Comparison. Table 2 depicts the recognition performance of the proposed ATPCRC and that of the 12 competing methods. In this table, we report the average recognition rate as well as its standard deviation over the ten random splits.

For the FME, KFME, SDE, SDA, SPDA, LapRLS, and TCA methods, all parameters are tuned over the set \(\{10^{-9}, 10^{-6}, 10^{-3}, 1, 10^{+3}, 10^{+6}, 10^{+9}\}\). For the ATPCRC method, the parameters M and \(M_l\) are chosen from \(\{30, 60, 90, \ldots , N\}\). The regularization parameters of the proposed ATPCRC method (i.e., \(\lambda \), \(\lambda _l\), \(\gamma \) and \(\gamma _l\)) are set to 0.01, and \(\eta \) is set to 0.8; this value of \(\eta \) was empirically found to be a good choice for all datasets.

For the projection methods (SDE, SDA, SPDA, TCA, and ESDE), the classification was performed using the nearest neighbor (NN) classifier. The reported results correspond to the best parameter configuration over the ten splits. Bold numbers correspond to the best recognition rates. Several observations can be made from Table 2. The main ones are as follows. (1) The performance of the introduced active classifier (ATPCRC) can be better than that of many of the competing methods. (2) The outperformance of the proposed ATPCRC method is significant for the Honda and Extended Yale datasets, which contain face images with high variability.

Table 3 compares our proposed ATPCRC with the trivial active TPTSSR, obtained by setting the \(\eta \) parameter of ATPCRC to zero. For the trivial active TPTSSR, the entire set of data samples \(\mathbf{X}\) is used: those with ground-truth labels and those with predicted ones. From this table, we can see that ATPCRC is superior to the trivial active TPTSSR in most cases. Thus, the use of weighted class-wise reconstruction residuals (i.e., Eq. (5)) is crucial for reaching good performance.

Table 2. Average and standard deviation of the correct classification rate (%) over ten random splits for several methods.
Table 3. Average recognition rate and standard deviation (%) of the trivial active TPTSSR and the proposed ATPCRC classifier.

Statistical Significance. In this section, we conduct a statistical analysis of the results. To this end, we use the well-known paired-sample t-test [24]. We adopt a confidence level of 95% (i.e., the statistical significance threshold p is set to 0.05). Table 2 shows the outcome of all paired-sample t-tests. For a given competing approach, an underlined rate indicates that there is no statistically significant difference between the proposed ATPCRC and this competing approach. Among the 144 paired tests, the proposed ATPCRC was significantly better in 134 configurations, representing 93.08% of the tested pairs.
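The test can be reproduced with SciPy on the ten per-split recognition rates of any two methods; the accuracy arrays below are hypothetical and serve only to illustrate the procedure.

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical recognition rates (%) over the same ten random splits.
acc_atpcrc   = np.array([91.2, 90.5, 92.1, 89.8, 91.7,
                         90.9, 92.4, 91.0, 90.3, 91.5])
acc_baseline = np.array([88.1, 87.4, 89.0, 86.9, 88.6,
                         87.8, 89.3, 88.0, 87.2, 88.4])

t_stat, p_value = ttest_rel(acc_atpcrc, acc_baseline)
significant = p_value < 0.05  # 95% confidence level, as in the paper
```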

Computational Time. We measure the computational time needed by the TPTSSR, SRC, and proposed ATPCRC methods. We fix the number of labeled images to 50% of the whole data and use the remaining images as test images. Table 4 depicts the CPU time in seconds for classifying all test images. The experiments were run using MATLAB on a computer with an Intel Core i7-6900K CPU (8 cores, 3.6 GHz) and 128 GB of RAM. As can be seen, the proposed ATPCRC approach is much faster than the SRC method.

Table 4. CPU time (in seconds) of the SRC, TPTSSR and ATPCRC classifiers when 50% of the dataset are labeled images and the remaining 50% are test images.

4 Conclusion

In this paper, we introduced an Active Two Phase Collaborative Representation Classifier. Indeed, transforming the original TPTSSR (or any collaborative representation classifier) into an active classifier is a challenging task. The proposed fused class-wise reconstruction residual avoids adopting an ad-hoc sample-based confidence measure. Experiments conducted on four public image datasets show the superiority of the proposed method over 12 classification methods. These experiments demonstrate that active learning can lead to performance that is significantly better than that provided by the passive classifiers TPTSSR and SRC.