Incremental learning patch-based bag of facial words representation for face recognition in videos

Wang, Chao; Wang, Yunhong; Zhang, Zhaoxiang; Wang, Yiding

doi:10.1007/s11042-013-1562-1

Published: 27 June 2013

Volume 72, pages 2439–2467, (2014)

Multimedia Tools and Applications

Chao Wang¹, Yunhong Wang¹, Zhaoxiang Zhang¹ & Yiding Wang²

Abstract

Video-based face recognition is a fundamental topic in image processing and video analysis, and presents various challenges and opportunities. In this paper, we introduce an incremental learning approach to video-based face recognition that efficiently exploits the spatiotemporal information in videos. Face image sequences are incrementally clustered based on their descriptors, and representative face images are selected from each cluster. An incremental algorithm for creating facial visual words then constructs a codebook from the descriptors of the representative face images. Next, each descriptor extracted from a patch is quantized against the facial visual words and converted into a code, and the codes from each region are pooled into a histogram. The representation of a face image is generated by concatenating the histograms from all regions and is used for categorization. During online recognition, a similarity score matrix and a voting algorithm determine the identity of a face video. Recognition is performed online as the face video sequence streams in, and the proposed method gives nearly real-time feedback. The proposed method achieves a 100% verification rate on the Honda/UCSD database and 82% on the YouTube database. Experimental results demonstrate the effectiveness and flexibility of the proposed method.
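
The pipeline outlined in the abstract (incremental codebook, quantization, per-region histogram pooling, concatenation, voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the descriptor, the clustering rule, the distance measure, or the voting scheme, so the sketch assumes Euclidean distance, a simple leader-style online clustering step as a stand-in for the incremental visual-word algorithm, L1-normalised histograms, and a plain majority vote. All function names and the `radius` parameter are hypothetical.

```python
import numpy as np


def update_codebook(codebook, counts, descriptor, radius):
    """Leader-style online update (an assumed stand-in, not the paper's algorithm).

    codebook: list of 1-D float arrays (visual words), modified in place.
    counts:   list of ints, number of descriptors absorbed by each word.
    Returns the index of the word that received the descriptor.
    """
    if len(codebook) > 0:
        dists = np.linalg.norm(np.asarray(codebook) - descriptor, axis=1)
        j = int(dists.argmin())
        if dists[j] <= radius:
            # Running-mean update of the nearest word.
            counts[j] += 1
            codebook[j] = codebook[j] + (descriptor - codebook[j]) / counts[j]
            return j
    # No word is close enough: start a new visual word.
    codebook.append(np.asarray(descriptor, dtype=float).copy())
    counts.append(1)
    return len(codebook) - 1


def quantize(descriptors, codebook):
    """Map each descriptor (row) to the index of its nearest visual word."""
    diff = descriptors[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)


def encode_face(region_descriptors, codebook):
    """Pool the codes of each region into a histogram and concatenate them."""
    k = codebook.shape[0]
    hists = []
    for desc in region_descriptors:
        h = np.bincount(quantize(desc, codebook), minlength=k).astype(float)
        hists.append(h / max(h.sum(), 1.0))  # L1 normalisation (assumed)
    return np.concatenate(hists)


def vote_identity(score_matrix):
    """Majority vote over a (frames x identities) similarity score matrix."""
    winners = score_matrix.argmax(axis=1)
    return int(np.bincount(winners, minlength=score_matrix.shape[1]).argmax())
```

In this sketch a face with R regions and a codebook of K words yields a vector of length R × K, and each incoming frame contributes one vote, which is one plausible way to give feedback while the video is still streaming.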

Acknowledgements

This work is funded by the National Basic Research Program of China (No. 2010CB327902), the National Natural Science Foundation of China (No. 61005016, No. 61061130560), the National High-tech R&D Program of China (2011AA010502), the Open Projects Program of National Laboratory of Pattern Recognition, and the Fundamental Research Funds for the Central Universities.

Author information

Authors and Affiliations

Laboratory of Intelligent Recognition and Image Processing, Beijing Key Laboratory of Digital Media, School of Computer Science and Engineering, Beihang University, Beijing, China
Chao Wang, Yunhong Wang & Zhaoxiang Zhang
School of Information Engineering, North China University of Technology, Beijing, China
Yiding Wang

Corresponding author

Correspondence to Zhaoxiang Zhang.

About this article

Cite this article

Wang, C., Wang, Y., Zhang, Z. et al. Incremental learning patch-based bag of facial words representation for face recognition in videos. Multimed Tools Appl 72, 2439–2467 (2014). https://doi.org/10.1007/s11042-013-1562-1

Published: 27 June 2013
Issue Date: October 2014
DOI: https://doi.org/10.1007/s11042-013-1562-1
