1 Introduction

Face recognition has traditionally been posed as the problem of identifying a face from a single image. Good performance usually relies on well-designed classifiers. A number of classifiers have been proposed, such as the nearest neighbor (NN) classifier [4], the local support vector machine approach [12], the sparse representation-based classifier [16] and linear regression classification (LRC) [11]. These classifiers use a single test sample for classification and assume that images are taken in controlled environments. Their classification performance generally depends on the representation of individual test samples. However, facial appearance changes dramatically under variations in pose, illumination, expression, etc., and images captured under controlled conditions may not suffice for reliable recognition under the more varied conditions that occur in real surveillance and video retrieval applications. Recently there has been growing interest in face recognition from image sets. Rather than a single query image, the system is given a set of images of the same unknown individual, and the rich information provided by the image set is expected to improve the recognition rate.

Image set classification algorithms include parametric methods [1, 8, 14] and non-parametric methods [2, 3, 5, 6, 7, 9, 13, 17]. Parametric methods first represent each image set by a probability density function, then measure the similarity between image sets (probability distributions) using a divergence function as the distance, and finally assign the test image set to the category of the closest training image set. Parametric methods face various difficulties, and their recognition performance is usually unsatisfactory. In recent years, researchers have focused on non-parametric methods, which are model-independent and make no assumptions about the distribution of the image sets. A typical example of such methods is the subspace algorithm.

This paper briefly reviews dual linear regression classification (DLRC) and then proposes bilinear regression classification (BLRC) for image set retrieval. For the BLRC algorithm, we first give the concept of the unrelated subspace. Then, we introduce two strategies to constitute the unrelated subspace. Next, we calculate the related distance metric and the unrelated distance metric. Last, we introduce a combination metric, which yields two new classifiers based on the two constitution strategies of the unrelated subspace. Experimental results show that BLRC performs better than DLRC and several state-of-the-art classifiers on some benchmarks.

2 Dual Linear Regression Classification

Let \(a\) and \(b\) be the height and width of an image. Let two sets of (down-scaled) face images be represented by

$$\begin{aligned} X = [x_1,x_2, \cdots , x_m], \end{aligned}$$
(1)
$$\begin{aligned} Y = [y_1,y_2, \cdots , y_n], \end{aligned}$$
(2)

where \(x_i\ (i=1,2,\cdots , m)\) and \(y_j\ (j=1,2,\cdots , n)\) are column vectors of size \(ab\).

The column vectors of the image sets \(X\) and \(Y\) each determine a subspace, and we assume that a “virtual” face image \(V\) lies at the intersection of the two subspaces. That is, \(V\) should be a linear combination of the column vectors of each of the two image sets. To calculate the distance between two image sets, our task is to find the “virtual” face \(V\) and the coefficient vectors \(\alpha = (\alpha _1,\alpha _2, \cdots , \alpha _m)^T\), \(\beta =( \beta _1,\beta _2, \cdots , \beta _n)^T\) such that

$$\begin{aligned} V = X\alpha = Y \beta . \end{aligned}$$
(3)

Considering that we have all down-scaled images standardized into unit vectors, we further require that

$$\begin{aligned} \sum _{i=1}^m \alpha _i=\sum _{j=1}^n \beta _j=1. \end{aligned}$$
(4)

Let \(\hat{x}_i=x_i-x_m\ (i=1,2, \cdots , m-1)\) and \(\hat{y}_j=y_j-y_n\ (j=1,2, \cdots , n-1)\). Then we have

$$\begin{aligned} V =[\hat{x}_1, \hat{x}_2, \cdots , \hat{x}_{m-1}]\hat{\alpha }+x_m=[\hat{y}_1, \hat{y}_2, \cdots , \hat{y}_{n-1}]\hat{\beta }+y_n, \end{aligned}$$
(5)

where \(\hat{\alpha }=(\alpha _1,\alpha _2, \cdots , \alpha _{m-1})^T\), \(\hat{\beta }=(\beta _1,\beta _2, \cdots , \beta _{n-1})^T\). Assume that there is an approximate solution \(\gamma =(\alpha _1,\alpha _2, \cdots , \alpha _{m-1},\beta _1,\beta _2, \cdots , \beta _{n-1})^T\in \mathrm{I\!R}^{(m+n-2)\times 1}\) for the equation

$$\begin{aligned} y_n-x_m=\hat{XY}\gamma , \end{aligned}$$
(6)

where \(\hat{XY}=[\hat{x}_1,\hat{x}_2, \cdots , \hat{x}_{m-1},-{\hat{y}_1},-{\hat{y}_2}, \cdots , -{\hat{y}_{n-1}}]\).
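
As a short check of how (5) and (6) follow from (3) and (4): the constraint (4) gives \(\alpha _m=1-\sum _{i=1}^{m-1}\alpha _i\), so

$$\begin{aligned} X\alpha =\sum _{i=1}^{m-1}\alpha _i x_i+\Big (1-\sum _{i=1}^{m-1}\alpha _i\Big )x_m=\sum _{i=1}^{m-1}\alpha _i\hat{x}_i+x_m, \end{aligned}$$

and the same argument applied to \(Y\beta \) gives the right-hand side of (5). Equating the two expressions, moving the \(\hat{y}_j\) terms to the left and \(x_m\) to the right then yields

$$\begin{aligned} \sum _{i=1}^{m-1}\alpha _i\hat{x}_i-\sum _{j=1}^{n-1}\beta _j\hat{y}_j=y_n-x_m, \end{aligned}$$

which is (6) written column by column.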

After obtaining the estimated value of the regression coefficient \(\gamma \), the “virtual” face image may be represented by the image set \(X\) and the image set \(Y\) respectively. Specifically, the “virtual” face image \(V_X\) reconstructed from the image set \(X\) is

$$\begin{aligned} V_X=[\hat{x}_1,\hat{x}_2, \cdots , \hat{x}_{m-1}][\hat{\gamma }_1,\hat{\gamma }_2, \cdots , \hat{\gamma }_{m-1}]^T+ x_m, \end{aligned}$$
(7)

while the “virtual” face image \(V_Y\) reconstructed from the image set \(Y\) is

$$\begin{aligned} V_Y=[\hat{y}_1,\hat{y}_2, \cdots , \hat{y}_{n-1}][\hat{\gamma }_m,\hat{\gamma }_{m+1}, \cdots , \hat{\gamma }_{m+n-2}]^T + y_n. \end{aligned}$$
(8)

Clearly, the difference between the two reconstructed “virtual” face images is the residual of the linear regression equation (6). Since the difference between the image sets \(X\) and \(Y\) can be expressed by the difference between the two reconstructed “virtual” face images, we can use this residual to estimate the similarity of the subspaces of the two image sets \(X\) and \(Y\), namely

$$\begin{aligned} D(X,Y)=\Vert V_Y-V_X\Vert =\Vert (y_n-x_m)-\hat{XY}\hat{\gamma } \Vert . \end{aligned}$$
(9)

The smaller the value of \(D(X,Y)\), the closer the two image sets are to each other.
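
The DLRC distance of Eqs. (5)-(9) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the text asks for an approximate solution \(\gamma \) without naming a solver, so an ordinary least-squares fit is used here, and the function name `dlrc_distance` is hypothetical.

```python
import numpy as np

def dlrc_distance(X, Y):
    """Residual-based distance between image sets X (q x m) and Y (q x n), cf. Eq. (9)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_m, y_n = X[:, -1], Y[:, -1]
    # Difference vectors x_i - x_m and y_j - y_n used in Eq. (5).
    X_hat = X[:, :-1] - x_m[:, None]
    Y_hat = Y[:, :-1] - y_n[:, None]
    # Joint matrix [X_hat, -Y_hat] and target y_n - x_m of Eq. (6).
    XY_hat = np.hstack([X_hat, -Y_hat])
    target = y_n - x_m
    # Assumption: gamma is estimated by ordinary least squares.
    gamma, *_ = np.linalg.lstsq(XY_hat, target, rcond=None)
    return float(np.linalg.norm(target - XY_hat @ gamma))
```

For two sets whose affine hulls are parallel lines one unit apart, the function returns a distance of 1; for identical sets it returns 0.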

3 Bilinear Regression Classification

Inspired by DLRC, this section proposes bilinear regression classification. A simple flowchart is shown in Fig. 1. The section is organized as follows. First, the concept of the unrelated subspace is presented in Subsect. 3.1. Second, two strategies for constituting the unrelated subspace are described in Subsect. 3.2. Then, the related and unrelated metrics are computed in Subsect. 3.3. Last, the final distance metric for classification, called the combination metric, is described in Subsect. 3.4.

Fig. 1. The flowchart of the proposed BLRC

3.1 Definition of Unrelated Image Set Subspace

Definition 1

Suppose that the training set contains image sets of \(C\) classes and the test set contains a total of \(M\) test image sets. For each test image set, we need to calculate the distance between it and the \(c^{th}\) training image set, where \(c=1,2, \cdots , C\), and the \(c^{th}\) training image set has \(N_c\) image samples. If a set \(U\) also contains \(N_c\) samples, all drawn from the other \(C-1\) classes (that is, excluding the \(c^{th}\) class), then \(U\) is called the unrelated image set subspace of the test image set.

According to Definition 1, we need to select \(N_c\) image samples from the remaining \(C-1\) classes, excluding the \(c^{th}\) class, to construct the unrelated image set subspace. The next subsection describes how to construct it.

3.2 Constructions of the Unrelated Subspace

The \(c^{th}\) image set \(X^c\) in the training image set is represented as follows:

$$\begin{aligned} X^c=[x_1^c,x_2^c, \cdots , x_{N_c}^c]\in \mathrm{I\!R}^{q\times N_c}. \end{aligned}$$
(10)

That means that the \(c^{th}\) image set in the training set defines a subspace, which can be represented by \(X^c\).

The subspace \(X\) determined by all images in the training set is as follows:

$$\begin{aligned} X=[X^1,X^2,\cdots , X^C ]\in \mathrm{I\!R}^{q\times l}, \end{aligned}$$
(11)

in which \(l=\sum _{c=1}^C{N_c}\).

The overall mean of training image set \(X\) is

$$\begin{aligned} X_{mean}=\frac{1}{l}\sum _{c=1}^C \sum _{i=1}^{N_c}{x_i^c}. \end{aligned}$$
(12)

The mean of the \(c^{th}\) image set in the training set is \(X_{mean}^c=\frac{1}{N_c}\sum _{i=1}^{N_c}{x_i^c}\). Images in class \(c\) are centralized as \(\hat{x}_i^c=x_i^c-X_{mean}^c\ (c=1,2, \cdots , C;\ i=1,2, \cdots , N_c)\), and the centralized training image set \(\hat{X}\) is formulated as follows:

$$\begin{aligned} \hat{X}=[\hat{x}_1^1,\hat{x}_2^1, \cdots , \hat{x}_{N_1}^1, \cdots , \hat{x}_{N_C}^C] \in \mathrm{I\!R}^{q\times l}. \end{aligned}$$
(13)

Similarly, the image subspace determined by the test image set \(Y\) is represented by

$$\begin{aligned} Y=[y_1,y_2, \cdots , y_n]\in \mathrm{I\!R}^{q\times n}. \end{aligned}$$
(14)

For the image set \(Y\), the mean is \(y_{mean}=\frac{1}{n}\sum _{i=1}^n {y_i}\), the images are centralized as \(\hat{y}_i=y_i-y_{mean}\ (i=1,2, \cdots , n)\), and the centralized test image set \(\hat{Y}\) is formulated as follows:

$$\begin{aligned} \hat{Y}=[\hat{y}_1,\hat{y}_2, \cdots , \hat{y}_n]. \end{aligned}$$
(15)

Strategy 1. When calculating the distance between the test image set and the \(c^{th}\) image set, the Manhattan distance between \(y_{mean}\) and the \(i^{th}\) training sample \(X_i\) (the \(i^{th}\) column of \(X\)) is computed as:

$$\begin{aligned} d_i=|X_i-y_{mean}| (i=1,2, \cdots , l). \end{aligned}$$
(16)

The distance metric set \(D\) of the training image set \(X\) and \(y_{mean}\) is as follows:

$$\begin{aligned} D=[d_1,d_2, \cdots , d_l]\in \mathrm{I\!R}^{1\times l}. \end{aligned}$$
(17)

First, we remove the elements corresponding to the \(c^{th}\) class from \(D\) to obtain \(\hat{D}\in \mathrm{I\!R}^{1\times (l-N_c)}\). Then we sort the elements of \(\hat{D}\) in ascending order and select from \(X\) the \(N_c\) samples \(x_i^p\ (p\ne c)\) that correspond to the \(N_c\) smallest distances in \(\hat{D}\) to constitute the unrelated subspace \(U_c\):

$$\begin{aligned} U_c =[u_1^c,u_2^c, \cdots , u_{N_c}^c]\in \mathrm{I\!R}^{q\times N_c}. \end{aligned}$$
(18)

The classifier based on strategy 1 will be called bilinear regression classification-I (BLRC-I).
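
The selection step of Strategy 1 can be sketched as follows. The sketch assumes that the bars in Eq. (16) denote the L1 (Manhattan) norm, as the text states, and that class membership is supplied as a `labels` array with one entry per column of \(X\); that array and the function name `unrelated_subspace_l1` are illustrative and not part of the original description.

```python
import numpy as np

def unrelated_subspace_l1(X, labels, y_mean, c):
    """Pick the N_c training samples outside class c that are closest (L1) to y_mean."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    y_mean = np.asarray(y_mean, dtype=float)
    n_c = int(np.sum(labels == c))                 # N_c, size of class c
    d = np.abs(X - y_mean[:, None]).sum(axis=0)    # d_i of Eqs. (16)-(17)
    candidates = np.flatnonzero(labels != c)       # remove class c, giving D-hat
    # Sort the remaining distances in ascending order and keep the N_c smallest.
    order = candidates[np.argsort(d[candidates], kind="stable")]
    return X[:, order[:n_c]]                       # U_c of Eq. (18)
```

The returned matrix has the same number of columns as class \(c\), which is what Definition 1 requires of the unrelated subspace.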

Strategy 2. When calculating the distance between the test image set and the \(c^{th}\) training image set, we assume that the training image set \(X\) and the test image set \(Y\) determine a “virtual” face image space. Unlike Strategy 1, Strategy 2 does not directly calculate the distance between each image in the training image set \(X\) and the center \(y_{mean}\) of the test image set. Instead, it calculates the distance between the projection of each training image on the “virtual” face space and the center \(y_{mean}\) of the test image set.

In order to obtain the joint coefficient vector of the two image sets \(\hat{X}\) and \(\hat{Y}\), the joint image set \(E\) and the test vector \(e\) can be constituted as:

$$\begin{aligned} E=[\hat{X}, -\hat{Y}]\in \mathrm{I\!R}^{q \times (l+n)}, \end{aligned}$$
(19)
$$\begin{aligned} e=y_{mean}-X_{mean}. \end{aligned}$$
(20)

Suppose that \(\theta \in \mathrm{I\!R}^{(l+n)\times 1}\) is the joint coefficient vector of \(\hat{X}\) and \(\hat{Y}\), which can be calculated by solving the optimization problem

$$\begin{aligned} \hat{\theta }=\arg \underset{\theta }{\min }\,\,{\Vert e-E\theta \Vert }^2+\lambda _1 {\Vert \theta \Vert }_2^2+\lambda _2 {\Vert \theta \Vert }_1, \end{aligned}$$
(21)

where \( \lambda _1>0, \lambda _2>0\) and \(\lambda _1+\lambda _2=1\).

After solving for the regression coefficient \(\hat{\theta }\), the Mahalanobis distance between the projection of each image in the training image set \(X\) on the “virtual” face space and the center of the test image set can be expressed by the following equation:

$$\begin{aligned} d_i=|\hat{X}_i\hat{\theta }_i-y_{mean}| (i=1,2, \cdots , l). \end{aligned}$$
(22)

The distance metric set \(D\) is formulated by

$$\begin{aligned} D=[d_1,d_2, \cdots , d_l]\in \mathrm{I\!R}^{1\times l}. \end{aligned}$$
(23)

First, we remove the elements corresponding to the \(c^{th}\) class from \(D\) to obtain \(\hat{D}\in \mathrm{I\!R}^{1\times (l-N_c)}\). Then we sort the elements of \(\hat{D}\) in ascending order and select from \(X\) the \(N_c\) samples \(x_i^p\ (p\ne c)\) that correspond to the \(N_c\) smallest distances in \(\hat{D}\) to constitute the unrelated subspace \(U_c\),

$$\begin{aligned} U_c =[u_1^c,u_2^c, \cdots , u_{N_c}^c]\in \mathrm{I\!R}^{q\times N_c}. \end{aligned}$$
(24)

The classifier based on strategy 2 will be called bilinear regression classification-II (BLRC-II).
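
The text does not say how the elastic-net problem (21) is solved. One possible sketch uses proximal gradient descent (ISTA), which is a technique swapped in here for illustration and not necessarily what the authors used. The text calls the distance in Eq. (22) a Mahalanobis distance; the sketch reads the bars as an L1 norm, as in Strategy 1, which is also an assumption. The function names and the iteration count are hypothetical.

```python
import numpy as np

def elastic_net_ista(E, e, lam1, lam2, n_iter=500):
    """Minimise ||e - E t||^2 + lam1 * ||t||_2^2 + lam2 * ||t||_1 by proximal gradient."""
    E = np.asarray(E, dtype=float)
    e = np.asarray(e, dtype=float)
    theta = np.zeros(E.shape[1])
    # Lipschitz constant of the gradient of the smooth (quadratic) part.
    L = 2.0 * (np.linalg.norm(E, 2) ** 2 + lam1)
    for _ in range(n_iter):
        grad = 2.0 * (E.T @ (E @ theta - e)) + 2.0 * lam1 * theta
        z = theta - grad / L
        # Soft-thresholding is the proximal operator of the L1 term.
        theta = np.sign(z) * np.maximum(np.abs(z) - lam2 / L, 0.0)
    return theta

def projected_l1_distances(X_hat, theta, y_mean):
    """d_i of Eq. (22): column i of X_hat scaled by theta_i, compared with y_mean."""
    X_hat = np.asarray(X_hat, dtype=float)
    y_mean = np.asarray(y_mean, dtype=float)
    scaled = X_hat * np.asarray(theta, dtype=float)[: X_hat.shape[1]]
    return np.abs(scaled - y_mean[:, None]).sum(axis=0)
```

With \(E=[\hat{X}, -\hat{Y}]\) and \(e\) from Eqs. (19) and (20), the first \(l\) entries of the returned vector are the coefficients that pair with the columns of \(\hat{X}\) in Eq. (22).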

3.3 Related and Unrelated Distance Metric

Related Distance Metric. In Subsect. 3.2, we have obtained the class mean \(X_{mean}^c\) for each class in the training set. After centralized processing, the training image set of class c can be converted to

$$\begin{aligned} \hat{X}_c=[\hat{x}_1^c,\hat{x}_2^c, \cdots , \hat{x}_{N_c}^c ]\in \mathrm{I\!R}^{q\times N_c}. \end{aligned}$$
(25)

Now we need to calculate the distance between the test image set \(\hat{Y}\) and the \(c^{th}\) image set \(\hat{X}_c\) in the training set. To obtain the joint regression coefficients of the two image sets, the joint image set \(S_r^c\) and test vector \(s_r^c\) can be constituted as:

$$\begin{aligned} S_r^c=[\hat{X}_c, -\hat{Y}]\in \mathrm{I\!R}^{q\times (N_c+n)}, \end{aligned}$$
(26)

and

$$\begin{aligned} s_r^c=y_{mean}-X_{mean}^c. \end{aligned}$$
(27)

Assume that \(\gamma ^c\in \mathrm{I\!R}^{(N_c+n)\times 1}\) is the joint regression coefficient vector of \(\hat{X}_c\) and \(\hat{Y}\). For the regression equation \(s_r^c=S_r^c \gamma ^c\), the regularized least-squares estimate of \(\gamma ^c\), with regularization parameter \(\lambda \) and identity matrix \(I\), is

$$\begin{aligned} \hat{\gamma }^c={({S_r^c}^T S_r^c+\lambda I)}^{-1} {S_r^c}^T s_r^c. \end{aligned}$$
(28)

Then, the reconstructed “virtual” face image \(r_1\) obtained from the \(c^{th}\) training image set \(\hat{X}_c\) is

$$\begin{aligned} r_1=\hat{X}_c {[\gamma _1^c,\gamma _2^c, \cdots , \gamma _{N_c}^c]}^T+ X_{mean}^c. \end{aligned}$$
(29)

The reconstructed “virtual” face image \(r_2\) obtained from the test image set \(Y\) is

$$\begin{aligned} r_2=\hat{Y}{[\gamma _{N_c+1}^c,\gamma _{N_c+2}^c, \cdots , \gamma _{N_c+n}^c]}^T+ y_{mean}. \end{aligned}$$
(30)

Finally, the distance between \(r_1\) and \(r_2\) can be used to represent the distance between the test image set and the \(c^{th}\) image set in the training set, which is expressed by

$$\begin{aligned} d_r^c=\Vert r_1-r_2\Vert =\Vert s_r^c-S_r^c \gamma ^c\Vert . \end{aligned}$$
(31)

That is, the residual of the linear regression equation \(s_r^c=S_r^c\gamma ^c\) can be used to represent the distance between the test image set and the \(c^{th}\) image set in the training set.
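
Eqs. (26)-(31) can be sketched as a single function. The same computation applies to the unrelated distance if the unrelated subspace is passed in place of the class set. The default value of `lam` and the function name `regression_distance` are assumptions made for illustration; the text does not give a value for \(\lambda \).

```python
import numpy as np

def regression_distance(A, Y, lam=1e-3):
    """Residual distance between a training set A (q x N) and a test set Y (q x n)."""
    A = np.asarray(A, dtype=float)
    Y = np.asarray(Y, dtype=float)
    a_mean = A.mean(axis=1)
    y_mean = Y.mean(axis=1)
    # Joint matrix of the centralized sets, cf. Eq. (26), and target vector, cf. Eq. (27).
    S = np.hstack([A - a_mean[:, None], -(Y - y_mean[:, None])])
    s = y_mean - a_mean
    # Ridge solution of Eq. (28): (S^T S + lam I)^{-1} S^T s.
    coef = np.linalg.solve(S.T @ S + lam * np.eye(S.shape[1]), S.T @ s)
    # Residual norm of Eq. (31).
    return float(np.linalg.norm(s - S @ coef))
```

When the two sets differ only by a shift orthogonal to their common span, the residual equals the length of that shift, and identical sets give a distance of 0.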

Unrelated Distance Metric. The unrelated image set subspace \(U_c\) of the test image set has been obtained in Subsect. 3.2. The mean vector of \(U_c\) is

$$\begin{aligned} u_{mean}^c=\frac{1}{N_c}\sum _{i=1}^{N_c}{u_i^c}. \end{aligned}$$
(32)

After centralization, \(\hat{u}_i^c=u_i^c-u_{mean}^c\ (i=1,2, \cdots , N_c)\), the unrelated image set subspace \(U_c\) can be converted to

$$\begin{aligned} \hat{U}_c=[\hat{u}_1^c, \hat{u}_2^c, \cdots , \hat{u}_{N_c}^c]\in \mathrm{I\!R}^{q\times N_c}. \end{aligned}$$
(33)

Now we need to calculate the distance between the test image set \(\hat{Y}\) and the unrelated image set subspace \(U_c\). To obtain the joint regression coefficients of two image sets, the joint image set \(S_u^c\) and test vector \(s_u^c\) can be constituted as

$$\begin{aligned} S_u^c=[\hat{U}_c, -\hat{Y}]\in \mathrm{I\!R}^{q\times (N_c+n)}, \end{aligned}$$
(34)

and

$$\begin{aligned} s_u^c=y_{mean}-u_{mean}^c. \end{aligned}$$
(35)

Assume that \(\delta ^c\in \mathrm{I\!R}^{(N_c+n)\times 1}\) is the joint regression coefficient vector of \(\hat{U}_c\) and \(\hat{Y}\). For the regression equation \(s_u^c=S_u^c \delta ^c\), the regularized least-squares estimate of \(\delta ^c\) is

$$\begin{aligned} \hat{\delta }^c={({S_u^c}^T S_u^c+\lambda I)}^{-1} {S_u^c}^T s_u^c. \end{aligned}$$
(36)

Then, the reconstructed “virtual” face image \(r_1\) obtained from the unrelated image set subspace \(\hat{U}_c\) is

$$\begin{aligned} r_1=\hat{U}_c {[\delta _1^c,\delta _2^c, \cdots , \delta _{N_c}^c]}^T+ u_{mean}^c. \end{aligned}$$
(37)

The reconstructed “virtual” face image \(r_2\) obtained from the test image set \(Y\) is

$$\begin{aligned} r_2=\hat{Y}{[\delta _{N_c+1}^c,\delta _{N_c+2}^c, \cdots , \delta _{N_c+n}^c]}^T+ y_{mean}. \end{aligned}$$
(38)

Finally, the distance between \(r_1\) and \(r_2\) can be used to represent the distance between the test image set and the unrelated image set subspace, which is expressed by

$$\begin{aligned} d_u^c=\Vert r_1-r_2\Vert =\Vert s_u^c-S_u^c \delta ^c\Vert . \end{aligned}$$
(39)

That is, the residual of the linear regression equation \(s_u^c=S_u^c\delta ^c\) can be used to represent the distance between the test image set and the unrelated image set subspace.

3.4 Combined Distance Metric

After obtaining the related distance metric \(d_r^c\) and the unrelated distance metric \(d_u^c\), we can construct a discriminative criterion by combining the two metrics in a suitable manner. If the test image set belongs to category \(c\), we want the test image set \(\hat{Y}\) to be close to the \(c^{th}\) image set \(\hat{X}_c\), that is, \(d_r^c\) should be as small as possible. On the other hand, we want the test image set \(\hat{Y}\) to be far from the unrelated image set \(\hat{U}_c\), that is, \(d_u^c\) should be as large as possible. We therefore propose a new metric \(d_p^c\) as

$$\begin{aligned} d_p^c=\frac{d_r^c}{d_u^c}. \end{aligned}$$
(40)

The smaller the value of \(d_p^c\), the greater the similarity between the test image set and the \(c^{th}\) image set. In other words, our face image set recognition criterion selects the category \(c\) for which \(d_p^c\) takes the minimum value, i.e.

$$\begin{aligned} c^*=\arg \underset{c}{\min }\,\, \{d_p^c \mid c=1,2, \cdots , C\}. \end{aligned}$$
(41)
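
The decision rule of Eqs. (40) and (41) can be sketched as below, given the per-class distances. The small `eps` guard against division by zero and the function name `classify_by_ratio` are additions for illustration and do not appear in the text; the returned class index is 0-based.

```python
import numpy as np

def classify_by_ratio(d_r, d_u, eps=1e-12):
    """Return the index c minimising d_p^c = d_r^c / d_u^c, together with all d_p^c."""
    d_r = np.asarray(d_r, dtype=float)
    d_u = np.asarray(d_u, dtype=float)
    # eps is an assumed safeguard for d_u^c == 0; it is not part of Eq. (40).
    d_p = d_r / np.maximum(d_u, eps)
    return int(np.argmin(d_p)), d_p

# Toy distances: class 1 has the smallest related distance, but class 2 wins
# because it is also far from its unrelated subspace.
best, d_p = classify_by_ratio([0.5, 0.2, 0.4], [1.0, 0.1, 2.0])
```

The toy values show why the ratio can rank classes differently from the related distance alone.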

4 Experimental Results

This section provides extensive experimental results to evaluate the performance of the two proposed classifiers, BLRC-I and BLRC-II. The experiments are conducted on several benchmark datasets: image-based face recognition on the LFW face database [18] and the AR face database [10], and video-based face recognition on the Honda/UCSD face database [8].

4.1 Experiments on LFW

The images in the LFW face database were captured in unconstrained environments, so they exhibit large variations in pose, age, race, facial expression, lighting, occlusion, background, etc. We use the aligned version of the LFW database, LFW-a, to evaluate the recognition performance.

LFW-a contains more than 5,000 subjects, each with images of the same individual in different poses. All the images in LFW-a are of size \(250\times 250\). We manually crop the images to a size of \(90\times 78\) (by removing 88 pixels from the top, 72 from the bottom, and 86 from both the left and right sides). A subset of LFW containing 62 persons, each of whom has more than 20 face images, is used for evaluating the algorithms. Our experimental setting is identical to that in [3]. The first 10 images of each subject are selected to form the training set, while the last 10 images are used as the probe images.
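
The stated margins can be checked against the stated crop size with a short NumPy sketch. The helper name `crop_margins` is hypothetical, and a blank array stands in for an actual LFW-a image.

```python
import numpy as np

def crop_margins(image, top, bottom, left, right):
    """Remove the given pixel margins from a 2-D image array (rows x columns)."""
    rows, cols = image.shape[:2]
    return image[top:rows - bottom, left:cols - right]

# Placeholder for a 250 x 250 LFW-a image.
image = np.zeros((250, 250))
cropped = crop_margins(image, top=88, bottom=72, left=86, right=86)
print(cropped.shape)  # (90, 78)
```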

The proposed classifiers are compared with sparse approximated nearest points (SANP) [5, 6], affine hull based image set distance (AHISD) [2], convex hull based image set distance (CHISD) [2], manifold discriminant analysis (MDA) [13], dual linear regression classification (DLRC) [3] and pairwise linear regression classification (PLRC) [19]. All methods use down-scaled images of size \(10\times 10\) and \(15\times 10\), as in [3]. The classification results of all methods are given in Table 1. For images of size \(10\times 10\), the proposed BLRC-I matches the MDA and PLRC-I methods with a recognition rate of 93.55\(\%\), which exceeds that of the remaining compared methods, while BLRC-II obtains the best recognition rate, 98.39\(\%\). For images of size \(15\times 10\), BLRC-I reaches a recognition rate of 96.77\(\%\) and BLRC-II reaches 98.39\(\%\). As shown in Table 1, BLRC-II outperforms the other classifiers.

Table 1. The recognition rates (RR) on LFW database.

4.2 Experiments on AR

In this section, we study the performance of the proposed classifiers on the well-known AR database. The database contains over 4000 face images of 126 subjects (70 men and 56 women). The face images of each individual cover different expressions, lighting conditions, and occlusions (sunglasses and scarf). We use the cropped AR database, which includes 2600 face images of 100 individuals. First, we manually crop the images to a size of \(90\times 70\) (by removing 38 pixels from the top, 39 from the bottom, 24 from the left and 25 from the right). Then we down-scale the cropped images to a resolution of \(40\times 40\). In the experiments, the first 13 images of each subject form the training image set, and the remaining 13 images form the test image set.

For this database, the proposed classifiers are compared with the following state-of-the-art approaches: SANP [5, 6], AHISD [2], CHISD [2], DLRC [3] and PLRC [19]. The recognition rates of the different classifiers are presented in Table 2. The experimental results show that the recognition accuracy of both BLRC-I and BLRC-II reaches 97.98\(\%\), an obvious improvement in classification performance over the other algorithms.

Table 2. The recognition rates (RR) on AR database.

4.3 Honda/UCSD Face Database

The Honda/UCSD dataset contains 59 video clips of 20 subjects [8]; all but one subject have at least 2 videos. Twenty videos are designated as training videos and the remaining 39 as test videos. The lengths of the videos vary from 291 to 1168 frames. To keep the experimental results comparable, we use face images consistent with previous work [6].

This dataset has been used extensively for image-based face recognition, and the accuracy has reached or approached 100\(\%\). Therefore, researchers have turned to settings that use only a small number of frames. We carry out the experiment using the first 50 frames of each video. The database shared by [5] is used. For video clips that contain fewer than 50 frames, all frames are used in the experiment. The following methods are chosen for comparison: DCC [7], MMD [15], MDA [13], AHISD [2], CHISD [2], MSM [17], SANP [5, 6], DLRC [3] and PLRC [19]. Table 3 lists the recognition rates of these classifiers on this database. We find that the recognition rates of BLRC-I, AHISD, RNP, DLRC and PLRC-I all equal 87.18\(\%\), which is much better than those of the DCC and MMD methods. The BLRC-II classifier obtains the highest accuracy on this database, 92.31\(\%\), which is clearly superior to the results of the other recognition algorithms.

Table 3. The recognition rates (RR) on Honda/UCSD database.

5 Conclusion

In this paper, a bilinear regression classification (BLRC) method is proposed for face image set recognition. Compared to DLRC, BLRC additionally uses the unrelated subspace for classification. Based on two different methods of constituting the unrelated subspace, two classifiers are proposed. To validate their performance, experiments are conducted on three databases for face image set classification tasks. All experimental results confirm the effectiveness of the two proposed classification algorithms.