
A real time target face tracking algorithm based on saliency detection and Camshift

Multimedia Tools and Applications

Abstract

Camshift is fast, but its robustness in target tracking is poor. To keep its speed while improving its robustness, this paper proposes a real-time target face tracking algorithm based on saliency detection and Camshift. Because the target to be tracked is more salient than the background in a frame, the saliency detection algorithm MBplus is first used to remove as much of the background around the target as possible, which reduces the interference of the background with the Camshift tracking results. Camshift is then used to search for and localize the target in the processed video frames. At the same time, to compensate for Camshift's weak tracking ability on some characteristic targets, a Kalman filter is used to predict the position of the target in the current frame. Finally, the Kalman-predicted target position and the target position obtained by Camshift are each compared with the target tracked in the previous frame, and the position with the higher similarity is taken as the tracking result. Experimental results show that the proposed algorithm achieves an average tracking precision of 94.0% on the Birchfield database and an average tracking success rate of 100% on the NRC-IIT Facial Video Database; even for the target faces with few attributes in the ytcelebrity database, both its tracking precision and its tracking success rate are 100%. These results are superior to those of some state-of-the-art tracking algorithms.
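The final selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how similarity is measured, so the Bhattacharyya coefficient between intensity histograms is assumed here, a constant-velocity extrapolation stands in for the Kalman prediction, and a simple threshold on a given saliency map stands in for MBplus. All function names and the `thresh` and `bins` parameters are hypothetical.

```python
import math

def mask_background(gray, saliency, thresh=0.5):
    """Zero out pixels whose saliency is below thresh (assumed threshold)."""
    return [[p if s >= thresh else 0 for p, s in zip(prow, srow)]
            for prow, srow in zip(gray, saliency)]

def histogram(gray, box, bins=8):
    """Normalised intensity histogram inside box = (x, y, w, h); pixels are 0..255."""
    x, y, w, h = box
    x, y = max(x, 0), max(y, 0)  # keep the box inside the image
    hist = [0.0] * bins
    for row in gray[y:y + h]:
        for p in row[x:x + w]:
            hist[min(p * bins // 256, bins - 1)] += 1.0
    total = sum(hist) or 1.0  # avoid division by zero for an empty box
    return [v / total for v in hist]

def bhattacharyya(h1, h2):
    """Bhattacharyya coefficient: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(math.sqrt(a * b) for a, b in zip(h1, h2))

def predict_constant_velocity(prev_box, prev_prev_box):
    """Stand-in for the Kalman prediction: extrapolate the last displacement."""
    (x1, y1, w, h), (x0, y0, _, _) = prev_box, prev_prev_box
    return (2 * x1 - x0, 2 * y1 - y0, w, h)

def select_result(frame, target_hist, camshift_box, predicted_box):
    """Return the candidate box that is more similar to the previous target."""
    scores = {box: bhattacharyya(histogram(frame, box), target_hist)
              for box in (camshift_box, predicted_box)}
    return max(scores, key=scores.get)
```

In use, each frame would first pass through `mask_background`, Camshift would run on the masked frame to produce `camshift_box`, and `select_result` would then choose between that box and the predicted one using the histogram of the target tracked in the previous frame.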

Data Availability

The Birchfield, FDTV20, NRC-IIT Facial Video, and ytcelebrity datasets used in this paper are publicly available and easy to obtain. The sequences we recorded ourselves and our manual annotations of the target faces in all datasets are not being released for now, because they are needed for our follow-up research.

References

  1. Isard M, Blake A (1998) Condensation—conditional density propagation for visual tracking. Int J Comput Vis 29:5–28

  2. Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr PHS (2016) Fully-convolutional siamese networks for object tracking. In: European conference on computer vision, pp 850–865

  3. Bertinetto L, Valmadre J, Golodetz S, Miksik O, Torr PHS (2016) Staple: complementary learners for real-time tracking. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 1401–1409

  4. Birchfield S (1998) Elliptical head tracking using intensity gradients and color histograms. In: Proceedings 1998 IEEE computer society conference on computer vision and pattern recognition (cat. no.98CB36231), pp 232–237

  5. Bradski GR (1998) Computer vision face tracking for use in a perceptual user interface. In: Fourth IEEE workshop on applications of computer vision 2(2), pp 12–21

  6. Cai B, Xu X, Xing X, Jia K, Miao J, Tao D (2016) BIT: biologically inspired tracker. IEEE Trans Image Process 25(3):1327–1339


  7. Chen S, Tan X, Wang B, Lu H, Hu X, Fu Y (2020) Reverse attention-based residual network for salient object detection. IEEE Trans Image Process 29:3763–3776


  8. Cheng Y (1995) Mean shift, mode seeking, and clustering. IEEE Trans Pattern Anal Mach Intell 17(8):790–799


  9. Choi J, Chang HJ, Jeong J, Demiris Y, Choi JY (2016) Visual tracking using attention-modulated disintegration and integration. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 4321–4330

  10. Henriques J, Caseiro R, Martins P, Batista J (2012) Exploiting the circulant structure of tracking-by-detection with kernels. In: European conference on computer vision, pp 702–715

  11. Galoogahi HK, Fagg A, Lucey S (2017) Learning background-aware correlation filters for visual tracking. In: IEEE International conference on computer vision (ICCV), pp 1144–1152

  12. Gorodnichy DO (2005) Video-based framework for face recognition in video. In: The 2nd Canadian conference on computer and robot vision (CRV’05), pp 330–338

  13. Guan M, Wen C (2021) Adaptive multi-feature reliability re-determinative correlation filter for visual tracking. IEEE Trans Multimed 23:3841–3852


  14. Guo F, Wang W, Shen Z, Shao L, Tao D (2020) Motion-aware rapid video saliency detection. IEEE Trans Circuits Syst Video Technol 30(12):4887–4898


  15. Henriques JF, Caseiro R, Martins P, Batista J (2015) High-speed tracking with kernelized correlation filters. IEEE Trans Pattern Anal Mach Intell 37(3):583–596


  16. Horn BKP, Schunck BG (1980) Determining optical flow. Artif Intell 17:185–203


  17. Hu L, Li Z, Xu H, Fang B (2019) An improved vehicle detection and tracking model. In: International symposium for intelligent transportation and smart city (ITASC) 2019 proceedings, vol 127, pp 84–93

  18. Hu R, Zhang L, Deng Z, Zhu X (2021) Multi-scale graph fusion for co-saliency detection. In: Thirty-fifth AAAI conference on artificial intelligence, pp 7789–7796

  19. Huang Z, Fu C, Li Y, Lin F, Lu P (2019) Learning aberrance repressed correlation filters for real-time UAV tracking. In: IEEE/CVF International conference on computer vision (ICCV), pp 2891–2900

  20. Jiang X, Yan F, Lu Y, Wang K, Guo S, Zhang T, Pang Y, Niu J, Xu M (2022) Joint Attention-Guided feature fusion network for saliency detection of surface defects. IEEE Trans Instrum Meas 71:1–12


  21. Kim M, Kumar S, Pavlovic V, Rowley H (2008) Face tracking and recognition with visual constraints in real-world videos. In: IEEE conference on computer vision and pattern recognition, pp 1–8

  22. Kim J, Yu SJ, Kim D, Toh K, Lee S (2017) An adaptive local binary pattern for 3D hand tracking. Pattern Recogn 61:139–152


  23. Laaroussi K, Saaidi A, Masrar M, Satori K (2018) Human tracking using joint color-texture features and foreground-weighted histogram. Multimed Tools Appl 77(11):13947–13981


  24. Lee M, Park C, Cho S, Lee S (2022) Superpixel group-correlation network for co-saliency detection. In: IEEE international conference on image processing (ICIP), pp 806–810

  25. Li Y, Zhu J (2014) A scale adaptive kernel correlation filter tracker with feature integration. In: European conference on computer vision, pp 254–265

  26. Li Y, Fu C, Ding F, Huang Z, Lu G (2020) Autotrack: towards high-performance visual tracking for UAV with automatic spatio-temporal regularization. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 11920–11929

  27. Liu Z, Li J, Ye L, Sun G, Shen L (2017) Saliency detection for unconstrained videos using Superpixel-Level graph and spatiotemporal propagation. IEEE Trans Circuits Syst Video Technol 27(12):2527–2542

  28. Lukežič A, Vojir T, Zajc LC, Matas J, Kristan M (2017) Discriminative correlation filter with channel and spatial reliability. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 4847–4856

  29. Ma F, Sun X, Zhang F, Zhou Y, Li H (2023) What catch your attention in SAR images: saliency detection based on Soft-Superpixel lacunarity cue. IEEE Trans Geosci Remote Sens 61:1–17


  30. Mondal A (2021) Occluded object tracking using object-background prototypes and particle filter. Appl Intell 51:5259–5279


  31. Mueller M, Smith N, Ghanem B (2017) Context-aware correlation filter tracking. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 1387–1395

  32. Nam H, Han B (2016) Learning multi-domain convolutional neural networks for visual tracking. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 4293–4302

  33. Nawaz M, Yan H (2021) Saliency detection using deep features and Affinity-Based robust background subtraction. IEEE Trans Multimed 23:2902–2916


  34. Ning J, Yang J, Jiang S, Zhang L, Yang MH (2016) Object tracking via dual linear structured SVM and explicit feature map. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 4266–4274

  35. Park C, Lee M, Cho M, Lee S (2022) Saliency detection via global context enhanced feature fusion and edge weighted loss. In: IEEE International conference on image processing (ICIP), pp 811–815

  36. Pei L, Zhang H, Yang B (2022) Improved Camshift object tracking algorithm in occluded scenes based on AKAZE and Kalman. Multimed Tools Appl 81:2145–2159


  37. Possegger H, Mauthner T, Bischof H (2015) In defense of color-based model-free tracking. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 2113–2120

  38. Putro MD, Kurnianggoro L, Jo KH (2021) High performance and efficient real-time face detector on central processing unit based on convolutional neural network. IEEE Trans Ind Inform 17(7):4449–4457


  39. Qi Y, Zhang S, Jiang F, Zhou H, Tao D, Li X (2020) Siamese local and global networks for robust face tracking. IEEE Trans Image Process 29:9152–9164


  40. Qian X, Zeng Y, Wang W, Zhang Q (2022) Co-saliency detection guided by group weakly supervised learning. IEEE Trans Multimed 1–1

  41. Ranganatha S, Gowramma YP (2017) An integrated robust approach for fast face tracking in noisy real-world videos with visual constraints. In: International conference on intelligent computing and control (i2c2), pp 1–5

  42. Saboo S, Singha J (2021) Vision based two-level hand tracking system for dynamic hand gestures in indoor environment. Multimed Tools Appl 80:20579–20598


  43. Soetedjo A, Somawirata IK (2016) Implementation of face detection and tracking on a low cost embedded system using fusion technique. In: 11th International conference on computer science & education (ICCSE), pp 209–213

  44. Sun H, Wen X (2021) Research on learning progress tracking of multimedia port user based on improved CamShift algorithm. Multimed Tools Appl 80:22719–22732


  45. Tathe SV, Narote SP (2013) Mean shift and Kalman filter based human face tracking. In: Proceedings of international conference on advances in signal processing and communication

  46. Topkaya IS, Erdogan H (2019) Using spatial overlap ratio of independent classifiers for likelihood map fusion in mean-shift tracking. SIViP 13:61–67


  47. Wang L, Ouyang W, Wang X, Lu H (2015) Visual tracking with fully convolutional networks. In: IEEE International conference on computer vision (ICCV), pp 3119–3127

  48. Wang J, Jia Z, Lai H, Yang J, Kasabov NK (2020) A Multi-Information fusion correlation filters tracker. IEEE Access 8:162022–162040


  49. Wang S, Yang S, Wang M, Jiao L (2021) New contour cue-based hybrid sparse learning for salient object detection. IEEE Trans Cybern 51(8):4212–4226


  50. Yan J, Zhong L, Yao Y, Xu X, Du C (2021) Dual-template adaptive correlation filter for real-time object tracking. Multimed Tools Appl 80(2):2355–2376

  51. Yuan D, Zhang X, Liu J, Li D (2019) A multiple feature fused model for visual object tracking via correlation filters. Multimed Tools Appl 78:27271–27290


  52. Zeng Y, Feng M, Lu H, Borji A (2018) An unsupervised game-theoretic approach to saliency detection. IEEE Trans Image Process 27(9):4545–4554


  53. Zhang K, Zhang L, Liu Q, Zhang D, Yang MH (2014) Fast visual tracking via dense spatio-temporal context learning. In: European conference on computer vision, pp 127–141

  54. Zhang J, Sclaroff S, Lin Z, Shen X, Price B, Mech R (2015) Minimum barrier salient object detection at 80 FPS. In: IEEE International conference on computer vision (ICCV), pp 1404–1412

  55. Zhang P, Liu W, Lu H, Shen C (2019) Salient object detection with lossless feature reflection and weighted structural loss. IEEE Trans Image Process 28(6):3048–3060


  56. Zhang J, Jin X, Sun J, Wang J, Sangaiah AK (2020) Spatial and semantic convolutional features for robust visual object tracking. Multimed Tools Appl 79:15095–15115


  57. Zhou H, Xie X, Lai JH, Chen Z, Yang L (2020) Interactive two-stream decoder for accurate and fast saliency detection. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 9138–9147


Acknowledgements

This work was supported by the National Science Foundation of China under Grant 61665012 and Grant U1803261, the International Science and Technology Cooperation Project of the Ministry of Education of the People’s Republic of China under Grant DICE 2016–2196, and the Scientific research plan of universities in Xinjiang Uygur Autonomous Region under Grant XJEDU2019Y006. We thank the referees for their efforts in reviewing our manuscript and for their valuable suggestions and questions.

Author information

Corresponding author

Correspondence to Zhenhong Jia.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Zhenhong Jia, Huicheng Lai and Fei Shi contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, J., Jia, Z., Lai, H. et al. A real time target face tracking algorithm based on saliency detection and Camshift. Multimed Tools Appl 82, 43599–43624 (2023). https://doi.org/10.1007/s11042-023-14889-x

  • DOI: https://doi.org/10.1007/s11042-023-14889-x
