Empowering Simple Binary Classifiers for Image Set Based Face Recognition

International Journal of Computer Vision

Abstract

Face recognition from image sets has numerous real-life applications, including recognition from security and surveillance systems, multi-view camera networks and personal albums. An image set is an unordered collection of images (e.g., video frames, images acquired over long-term observations and personal albums) which exhibits a wide range of appearance variations. The main focus of previously developed methods has therefore been to find a suitable representation that optimally models these variations. This paper argues that such a representation cannot necessarily encode all of the information contained in the set. The paper therefore suggests a different approach, which does not resort to a single representation of an image set. Instead, the images of the set are retained in their original form, and an efficient classification strategy is developed which extends well-known simple binary classifiers to the task of multi-class image set classification. Unlike existing binary-to-multi-class extension strategies, which require multiple binary classifiers to be trained over a large number of images, the proposed approach is efficient since it trains only a few binary classifiers on very few images. Extensive experiments and comparisons with existing methods show that the proposed approach achieves state-of-the-art performance for image-set-based face and object recognition on a number of challenging datasets.
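
The abstract does not spell out the classification strategy, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that a single binary classifier is trained with the query set's images as positives and a few sampled gallery images per class as negatives, and that the held-out gallery images then vote for the class. The ridge classifier, the toy Gaussian data and all function and parameter names (`fit_ridge_classifier`, `classify_set`, `make_toy_data`, `per_class`) are hypothetical choices made for illustration.

```python
import numpy as np

def fit_ridge_classifier(X, y, reg=1e-2):
    """Fit a linear binary classifier by ridge regression onto labels in {-1, +1}."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    A = Xb.T @ Xb + reg * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict_sign(w, X):
    """Return +1 or -1 for every row of X."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.where(Xb @ w >= 0, 1, -1)

def classify_set(query, gallery_X, gallery_y, per_class, rng):
    """Label a query image set with one binary classifier (assumed scheme)."""
    classes = np.unique(gallery_y)
    # D1: a few images sampled from every gallery class; D2: all the rest.
    d1_idx = np.concatenate([
        rng.choice(np.flatnonzero(gallery_y == c), size=per_class, replace=False)
        for c in classes
    ])
    d2_mask = np.ones(len(gallery_y), dtype=bool)
    d2_mask[d1_idx] = False
    # Train: query images are positives, sampled gallery images are negatives.
    X_train = np.vstack([query, gallery_X[d1_idx]])
    y_train = np.concatenate([np.ones(len(query)), -np.ones(len(d1_idx))])
    w = fit_ridge_classifier(X_train, y_train)
    # Vote: score each class by the fraction of its held-out images
    # that the classifier assigns to the query side.
    positive = predict_sign(w, gallery_X[d2_mask]) == 1
    d2_labels = gallery_y[d2_mask]
    scores = np.array([positive[d2_labels == c].mean() for c in classes])
    return classes[np.argmax(scores)], scores

def make_toy_data(rng, true_class, n_classes=3, dim=5, per_gallery=40, n_query=20):
    """Synthetic stand-in for image features: one Gaussian cluster per class."""
    means = np.zeros((n_classes, dim))
    means[np.arange(n_classes), np.arange(n_classes)] = 5.0
    gallery_X = np.vstack([
        means[c] + 0.5 * rng.standard_normal((per_gallery, dim))
        for c in range(n_classes)
    ])
    gallery_y = np.repeat(np.arange(n_classes), per_gallery)
    query = means[true_class] + 0.5 * rng.standard_normal((n_query, dim))
    return gallery_X, gallery_y, query

rng = np.random.default_rng(0)
gallery_X, gallery_y, query = make_toy_data(rng, true_class=1)
label, scores = classify_set(query, gallery_X, gallery_y, per_class=5, rng=rng)
print("predicted class:", label, "per-class vote fractions:", scores)
```

In this sketch the training set contains only the query images plus `per_class` images from each gallery class, which is the kind of saving the abstract contrasts with one-vs-rest schemes that train one classifier per class on the full gallery. The vote relies on the query set being larger than `per_class`, so that images from the query's own class fall mostly on the positive side.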

Author information

Corresponding author

Correspondence to Munawar Hayat.

Additional information

Communicated by K. Kise.

About this article

Cite this article

Hayat, M., Khan, S.H. & Bennamoun, M. Empowering Simple Binary Classifiers for Image Set Based Face Recognition. Int J Comput Vis 123, 479–498 (2017). https://doi.org/10.1007/s11263-017-1000-3

  • DOI: https://doi.org/10.1007/s11263-017-1000-3
