
RGB-T long-term tracking algorithm via local sampling and global proposals

  • Original Paper
  • Published: Signal, Image and Video Processing

Abstract

Visual tracking is a fundamental research topic in pattern recognition and computer vision. By exploiting the complementary information in RGB and thermal infrared (RGB-T) images, RGB-T tracking can significantly enhance performance across diverse scenarios. In recent years, several strong RGB-T tracking algorithms have been proposed, but they focus mainly on short-term tracking: in particular, most existing correlation-filter-based RGB-T trackers cannot recapture the object once tracking fails. To address this deficiency, we propose a new correlation-filter-based RGB-T tracking algorithm for long-term tracking. Specifically, our algorithm comprises three components: feature fusion, reliability evaluation, and object recovery. First, RGB and thermal infrared image features are cascaded for object tracking. Then, the reliability of the tracking result is evaluated from the sequence of recent filter responses. Finally, when the result is judged unreliable, an object recovery mechanism is activated to recapture the object. Extensive experiments on large-scale benchmark datasets verify the effectiveness of the proposed approach against other state-of-the-art RGB-T trackers.
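The three components described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the peak-to-sidelobe ratio (PSR) as the reliability measure, and the threshold value are all assumptions chosen for illustration; the paper's actual reliability score and recovery trigger may differ.

```python
import numpy as np

def fuse_features(rgb_feat, tir_feat):
    # Feature fusion: cascade (channel-wise concatenate) RGB and
    # thermal infrared feature maps before correlation filtering.
    return np.concatenate([rgb_feat, tir_feat], axis=-1)

def peak_to_sidelobe_ratio(response):
    # PSR is a common confidence measure for correlation-filter
    # response maps (used here as a stand-in reliability score).
    peak = response.max()
    py, px = np.unravel_index(response.argmax(), response.shape)
    # Exclude an 11x11 window around the peak; the rest is sidelobe.
    mask = np.ones_like(response, dtype=bool)
    mask[max(0, py - 5):py + 6, max(0, px - 5):px + 6] = False
    sidelobe = response[mask]
    return (peak - sidelobe.mean()) / (sidelobe.std() + 1e-8)

def is_reliable(recent_scores, threshold=5.0):
    # Judge reliability from a window of continuous responses;
    # a low average score would trigger the recovery mechanism
    # (e.g., global proposals searched over the whole frame).
    return float(np.mean(recent_scores)) >= threshold
```

A sharp, isolated peak in the response map yields a high PSR (confident localization), while a flat or multi-modal map yields a low PSR, signaling that global re-detection should take over.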




Acknowledgements

This work was supported by the Excellent Youth Foundation of the Sichuan Scientific Committee (No. E10104361) and the Major Project of the Sichuan Provincial Department of Science and Technology (No. E10104422).

Author information

Correspondence to Liu Jun.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Jun, L., Zhongqiang, L. & Xingzhong, X. RGB-T long-term tracking algorithm via local sampling and global proposals. SIViP 16, 2221–2229 (2022). https://doi.org/10.1007/s11760-022-02187-2
