Abstract
Visual tracking is a fundamental research topic in pattern recognition and computer vision. By exploiting the complementary information in RGB and thermal infrared (RGB-T) images, RGB-T tracking can significantly improve performance across diverse scenarios. In recent years, several strong RGB-T tracking algorithms have been proposed, but most of them, including existing correlation-filter-based methods, address only short-term tracking: once tracking fails, the object cannot be recaptured. To remedy this deficiency, we propose a new RGB-T tracking algorithm based on correlation filters that comprises three components: feature fusion, reliability evaluation, and object recovery. First, the RGB and thermal infrared image features are cascaded for object tracking. Then, the reliability of the tracking result is evaluated from the responses over consecutive frames. Finally, when the tracking result is judged to be unreliable, the object recovery mechanism is activated to recapture the object. Extensive experiments on large-scale benchmark datasets verify the effectiveness of the proposed approach against other state-of-the-art RGB-T trackers.
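The three components above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes channel-wise feature concatenation for fusion and uses the peak-to-sidelobe ratio (PSR), a common correlation-filter confidence measure, as a stand-in for the paper's continuous-response reliability score; all function names and the threshold are hypothetical.

```python
import numpy as np

def fuse_features(rgb_feat: np.ndarray, tir_feat: np.ndarray) -> np.ndarray:
    """Cascade (channel-wise concatenate) RGB and thermal feature maps."""
    return np.concatenate([rgb_feat, tir_feat], axis=-1)

def peak_to_sidelobe_ratio(response: np.ndarray, exclude: int = 5) -> float:
    """PSR of a correlation response map: a common tracking-confidence proxy."""
    peak_idx = np.unravel_index(np.argmax(response), response.shape)
    peak = response[peak_idx]
    # Mask out a window around the peak; the rest is the "sidelobe" region.
    mask = np.ones_like(response, dtype=bool)
    r0, r1 = max(peak_idx[0] - exclude, 0), peak_idx[0] + exclude + 1
    c0, c1 = max(peak_idx[1] - exclude, 0), peak_idx[1] + exclude + 1
    mask[r0:r1, c0:c1] = False
    sidelobe = response[mask]
    return float((peak - sidelobe.mean()) / (sidelobe.std() + 1e-8))

def is_reliable(recent_psrs, threshold: float = 5.0) -> bool:
    """Judge reliability from responses over consecutive frames; when this
    returns False, a recovery (re-detection) mechanism would be triggered."""
    return float(np.mean(recent_psrs)) >= threshold
```

In a tracking loop, `fuse_features` would feed the correlation filter, each frame's response map would be scored with `peak_to_sidelobe_ratio`, and a run of low scores would switch the tracker from local sampling to a global re-detection search.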
Acknowledgements
This work was supported by the Excellent Youth Foundation of the Sichuan Scientific Committee (No. E10104361) and the Major Project of the Sichuan Provincial Department of Science and Technology (No. E10104422).
Cite this article
Jun, L., Zhongqiang, L. & Xingzhong, X. RGB-T long-term tracking algorithm via local sampling and global proposals. SIViP 16, 2221–2229 (2022). https://doi.org/10.1007/s11760-022-02187-2