Abstract
This paper presents a regularized patch-based representation for single sample per person face recognition. We represent each image as a collection of patches and simultaneously seek their sparse representations under the gallery image patch dictionaries and the intra-class variance dictionaries. By imposing a group sparsity constraint on the reconstruction coefficients corresponding to the gallery image patches, and a sparsity constraint on the reconstruction coefficients corresponding to the intra-class variance dictionaries, our formulation harvests the advantages of both patch-based and global image representation: it overcomes the side effect of patches that are severely corrupted by facial variances, while enforcing the less discriminative patches to be reconstructed from the gallery patches of the correct person. Moreover, instead of using manually designed intra-class variance dictionaries, we propose to learn them, which not only greatly accelerates the prediction of probe images but also improves recognition accuracy in the single sample per person scenario. Experimental results on the AR, Extended Yale B, CMU-PIE, and LFW datasets show that our method outperforms sparse-coding-based face recognition methods as well as other face representation methods specially designed for the single sample per person setting, achieving the best performance. These encouraging results demonstrate the effectiveness of regularized patch-based face representation for single sample per person face recognition.
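The formulation described above can be sketched in code. The following is a minimal illustrative implementation, not the authors' code: it assumes a single shared gallery patch dictionary `G` and intra-class variance dictionary `D`, and solves a simplified joint objective (squared reconstruction error, plus a group-lasso penalty over each subject's gallery coefficients across all patches of a probe image, plus an l1 penalty on the variance coefficients) by proximal gradient descent (ISTA). All names and parameters here are hypothetical.

```python
import numpy as np

def group_soft_threshold(M, t):
    # Block soft-thresholding: shrink the whole coefficient block toward
    # zero by its Frobenius norm (the proximal operator of the group lasso).
    n = np.linalg.norm(M)
    return M * max(0.0, 1.0 - t / n) if n > 0 else M

def soft_threshold(M, t):
    # Elementwise soft-thresholding (the proximal operator of the l1 norm).
    return np.sign(M) * np.maximum(np.abs(M) - t, 0.0)

def patch_group_sparse_coding(Y, G, D, groups, lam1=0.1, lam2=0.1, n_iter=200):
    """Jointly code all patches of one probe image.

    Y: (d, p) matrix whose columns are the p patches of the probe image.
    G: (d, n_g) gallery patch dictionary; D: (d, n_v) variance dictionary.
    groups: list of index arrays into G's atoms, one per gallery subject.
    Minimizes 0.5*||Y - G A - D B||_F^2 + lam1 * sum_c ||A_c||_F + lam2 * ||B||_1.
    """
    p = Y.shape[1]
    A = np.zeros((G.shape[1], p))
    B = np.zeros((D.shape[1], p))
    H = np.hstack([G, D])
    L = np.linalg.norm(H, 2) ** 2  # Lipschitz constant of the smooth gradient
    for _ in range(n_iter):
        R = G @ A + D @ B - Y            # shared residual for both blocks
        A_new = A - (G.T @ R) / L        # gradient step on gallery coefficients
        B_new = B - (D.T @ R) / L        # gradient step on variance coefficients
        for g in groups:
            # Group sparsity ties all patches of the image to the same few subjects.
            A_new[g] = group_soft_threshold(A_new[g], lam1 / L)
        B = soft_threshold(B_new, lam2 / L)  # plain sparsity on variance terms
        A = A_new
    return A, B
```

In the paper's actual model each patch position has its own dictionary built from gallery patches at the same location; the sketch merges them into one dictionary for brevity. Classification would then assign the probe to the subject whose group of gallery coefficients yields the smallest reconstruction residual over all patches.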










Notes
To learn the variation dictionary with SVDL, all subjects in the generic set should have images for a given type of variation. For LFW, the number of variation types is unknown, and it is also impossible to find images with the same type of variation; therefore the dictionary cannot be learnt with SVDL on this dataset. It is worth noting that, to make SVDL applicable to the LFW dataset, Yang et al. (2013) used data from the CMU-MultiPIE dataset as the generic set to learn the dictionary, but such a setting is different from ours; for fair comparison, the performance of SVDL under that setting is not included in our paper. For the Extended Yale B and CMU-PIE datasets, some persons in the generic set do not have images for some types of variation. If we removed these persons, it would be unfair to compare SVDL with our method and the other baseline methods that use the generic set. Therefore we only include the results of SVDL on the AR dataset.
(PC)\(^2\)A, E(PC)\(^2\)A, FLDA-Block, FLDA-SVD, CRC, PCRC, PSRC, and SRC do not use the generic dataset, so their performance under S2 and S3 is the same.
The LFW dataset is usually used for the face verification problem.
In order to obtain better face verification and recognition systems for such datasets, typically one needs to use more sophisticated alignment methods for more complicated face shape models.
We run the Matlab implementation of these methods on a Windows Server (64bit) with a 2.13GHz CPU and 16GB RAM.
For the CMU-PIE dataset, the manually designed dictionaries are very large for the other poses (C27, C05, C29), so prediction is extremely expensive, and we could not finish it on our machine within one day. Therefore we do not report the performance based on the manually designed dictionaries under those poses. This fact further demonstrates the importance and necessity of learning the intra-class variance dictionaries.
Please also note that all the results are based on the same parameters, which are designed for intensity features. Fine-tuned parameters for other features may further improve their performance.
The computational cost of updating the sparse coefficients with the feature-sign search algorithm is lower than that of updating the intra-class variance dictionary.
References
Ahonen, T., Hadid, A., & Pietikainen, M. (2006). Face description with local binary patterns: Application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(12), 2037–2041.
Belhumeur, P. N., Hespanha, J. P., & Kriegman, D. J. (1997). Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 711–720.
Cai, J. F., Candes, E. J., & Shen, Z. (2008). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 1956–1982.
Chen, S., Liu, J., & Zhou, Z. H. (2004a). Making FLDA applicable to face recognition with one sample per person. Pattern Recognition, 37, 1553–1555.
Chen, S., Zhang, D., & Zhou, Z. H. (2004b). Enhanced (PC)\(^2\)A for face recognition with one training image per person. Pattern Recognition Letters, 25, 1173–1181.
Deng, W., Hu, J., Guo, J., Cai, W., & Feng, D. (2010). Robust, accurate and efficient face recognition from a single training image: A uniform pursuit approach. Pattern Recognition, 43(5), 1748–1762.
Deng, W., Hu, J., & Guo, J. (2012). Extended SRC: Undersampled face recognition via intraclass variant dictionary. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34, 1864–1870.
Deng, W., Hu, J., & Guo, J. (2013). In defense of sparsity based face recognition. In IEEE Conference on Computer Vision and Pattern Recognition.
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32(2), 407–499.
Gao, S., Tsang, I. W., & Chia, L. T. (2013). Sparse representation with kernels. IEEE Transactions on Image Processing, 22(2), 423–434.
Gao, Q., Zhang, L., & Zhang, D. (2008). Face recognition using FLDA with single training image per person. Applied Mathematics and Computation, 205, 726–734.
Georghiades, A., Belhumeur, P., & Kriegman, D. (2001). From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643–660.
Gottumukkal, R., & Asari, V. K. (2004). An improved face recognition technique based on modular PCA approach. Pattern Recognition Letters, 25(4), 429–436.
Huang, G. B., Ramesh, M., Berg, T., & Learned-Miller, E. (2007). Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report. Amherst, MA: University of Massachusetts.
Kim, K. I., Jung, K., & Kim, H. J. (2002). Face recognition using kernel principal component analysis. IEEE Signal Processing Letters, 9, 40–42.
Kim, T. K., & Kittler, J. (2005). Locally linear discriminant analysis for multimodally distributed classes for face recognition with a single model image. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 318–327.
Kong, S., & Wang, D. (2012). A dictionary learning approach for classification: Separating the particularity and the commonality. In Proceedings of the European Conference on Computer Vision.
Kumar, R., Banerjee, A., Vemuri, B. C., & Pfister, H. (2011). Maximizing all margins: Pushing face recognition with kernel plurality. In Proceedings of the International Conference on Computer Vision.
Lee, H., Battle, A., Raina, R., & Ng, A. Y. (2006). Efficient sparse coding algorithms. In Proceedings of the Conference on Neural Information Processing Systems.
Liu, C., & Wechsler, H. (2002). Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition. IEEE Transactions on Image Processing, 11(4), 467–476.
Lu, J., Tan, Y. P., & Wang, G. (2011). Discriminative multi-manifold analysis for face recognition from a single training sample per person. In Proceedings of the International Conference on Computer Vision (pp. 1943–1950).
Martinez, A. M. (2002). Recognizing imprecisely localized, partially occluded, and expression variant faces from a single sample per class. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(6), 748–763.
Martinez, A., & Benavente, R. (1998). The AR face database. CVC Technical Report 24.
Shan, S., Cao, B., Gao, W., & Zhao, D. (2002). Extended fisherface for face recognition from a single example image per person. In IEEE International Symposium on Circuits and Systems.
Sim, T., Baker, S., & Bsat, M. (2002). The CMU pose, illumination, and expression (PIE) database. In International Conference on Automatic Face and Gesture Recognition.
Su, Y., Shan, S., Chen, X., & Gao, W. (2010). Adaptive generic learning for face recognition from a single sample per person. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Tan, X., Chen, S., Zhou, Z. H., & Zhang, F. (2005). Recognizing partially occluded, expression variant faces from single training image per person with SOM and soft k-NN ensemble. IEEE Transactions on Neural Networks, 16, 875–886.
Turk, M., & Pentland, A. (1991). Eigenfaces for recognition. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
Wolf, L., Hassner, T., & Taigman, Y. (2009). Similarity scores based on background samples. In Proceedings of the Asian Conference on Computer Vision.
Wright, J., Yang, A. Y., Ganesh, A., Sastry, S. S., & Ma, Y. (2009). Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2), 210–227.
Wu, J., & Zhou, Z. H. (2002). Face recognition with one training image per person. Pattern Recognition Letters, 23, 1711–1719.
Yang, M., Van Gool, L., & Zhang, L. (2013). Sparse variation dictionary learning for face recognition with a single training sample per person. In International Conference on Computer Vision.
Yang, J., Yin, W., Zhang, Y., & Wang, Y. (2009). A fast algorithm for edgepreserving variational multichannel image restoration. SIAM Journal on Imaging Sciences, 2(2), 569–592.
Yang, J., Zhang, D., Frangi, A. F., & Yang, J. Y. (2004). Two-dimensional PCA: A new approach to appearance-based face representation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(1), 131–137.
Yuan, X. T., Liu, X., & Yan, S. (2012). Visual classification with multitask joint sparse representation. IEEE Transactions on Image Processing, 21(10), 4349–4360.
Zhang, L., & Feng, X. (2011). Sparse representation or collaborative representation: Which helps face recognition? In Proceedings of the International Conference on Computer Vision.
Zhu, P., Zhang, L., Hu, Q., & Shiu, S. C. (2012). Multi-scale patch based collaborative representation for face recognition with margin distribution optimization. In Proceedings of the European Conference on Computer Vision.
Communicated by Yi Ma.
Cite this article
Gao, S., Jia, K., Zhuang, L. et al. Neither Global Nor Local: Regularized Patch-Based Representation for Single Sample Per Person Face Recognition. Int J Comput Vis 111, 365–383 (2015). https://doi.org/10.1007/s11263-014-0750-4