Real-time detection tracking and recognition algorithm based on multi-target faces

Multimedia Tools and Applications

Abstract

At present, face recognition algorithms suffer from poor face tracking and low real-time performance in multi-target recognition scenarios. This paper presents a real-time multi-target face detection, tracking, and recognition algorithm consisting of three stages: fast tracking, fast detection, and fast recognition. The first stage uses a new network based on GOTURN to achieve fast face tracking: prior information from the previous frame is used to predict the positions of the face boxes in the current frame. The second stage performs face detection based on MTCNN, using prior information about the current frame to avoid generating a large number of invalid candidate boxes and thereby detect faces rapidly. Finally, fast face recognition is realized by a reduced MobileFaceNet. By avoiding repeated exposure and repeated identification of the same target, the algorithm effectively transforms a multi-target scene into a single-target scene. On the OTB2015 and 300_VW test sets, the tracker achieved face tracking accuracies of 92.2% and 99.6%, respectively. On the Xiph test set, multi-target face detection and tracking reached 102 fps on a CPU. The streamlined network achieves an accuracy of 99.1% on LFW; compared with the original MobileFaceNet, its feature extraction is 25% faster and its model size is 45% smaller. Experimental results show that the algorithm has high recognition accuracy and real-time performance in multi-target recognition scenes.
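The control flow described in the abstract (track first, detect only what is new, recognize each face once) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `FacePipeline`, the callables `track_fn`, `detect_fn`, and `recognize_fn`, and the IoU threshold of 0.5 are all hypothetical names and values chosen for illustration. They stand in for the GOTURN-based tracker, the MTCNN-based detector, and the reduced MobileFaceNet, none of which are reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    box: Box
    identity: Optional[str] = None  # assigned once, then reused


class FacePipeline:
    """Hypothetical track -> detect -> recognize loop (illustrative only)."""

    def __init__(
        self,
        track_fn: Callable[[object, Box], Box],
        detect_fn: Callable[[object, List[Box]], List[Box]],
        recognize_fn: Callable[[object, Box], str],
        iou_threshold: float = 0.5,  # assumed value, not from the paper
    ) -> None:
        self.track_fn = track_fn
        self.detect_fn = detect_fn
        self.recognize_fn = recognize_fn
        self.iou_threshold = iou_threshold
        self.tracks: List[Track] = []
        self.recognize_calls = 0

    def step(self, frame: object) -> List[Track]:
        # 1. Tracking: update each known box from its previous-frame position.
        for track in self.tracks:
            track.box = self.track_fn(frame, track.box)

        # 2. Detection: tracked boxes act as priors; a detection that
        #    overlaps a prior is treated as an already-known face.
        priors = [track.box for track in self.tracks]
        for box in self.detect_fn(frame, priors):
            if all(iou(box, prior) < self.iou_threshold for prior in priors):
                self.tracks.append(Track(box))

        # 3. Recognition: run only for tracks that have no identity yet.
        for track in self.tracks:
            if track.identity is None:
                track.identity = self.recognize_fn(frame, track.box)
                self.recognize_calls += 1
        return self.tracks
```

Because an identity is cached on its track, the recognizer in this sketch runs once per face rather than once per face per frame, which is one way to read the abstract's remark about reducing a multi-target scene to a single-target one. The sketch omits track termination and any handling of tracking failure.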

Acknowledgements

This research was supported by the Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Grant No. znxx2018QN06), the Major Project for New Generation of AI (Grant No. 2018AAA0100400), and the National Natural Science Foundation of Hunan (Grant No. 2018JJ2098).

Author information

Corresponding author

Correspondence to Jun Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, J., Wang, Y., Fang, G. et al. Real-time detection tracking and recognition algorithm based on multi-target faces. Multimed Tools Appl 80, 17223–17238 (2021). https://doi.org/10.1007/s11042-020-09601-2

  • DOI: https://doi.org/10.1007/s11042-020-09601-2
