Abstract
Fusion tracking based on visible and thermal infrared images can boost tracking performance under adverse conditions such as low illumination and bad weather. Existing RGBT tracking methods mainly focus on estimating the reliability weights of the two modalities to achieve effective multi-modal fusion. However, these algorithms do not substantially enhance the discriminability of multimodal features at either the channel level or the spatial level, which limits their tracking performance. We propose a novel Modality Feature Enhancement Network for RGBT tracking. Specifically, we design a modality feature enhancement module composed of a channel feature enhancement module and a spatial feature enhancement module. The channel feature enhancement module adaptively adjusts the importance of different channels, improving the channel discriminability of multimodal features, while the spatial feature enhancement module improves their spatial discriminability. Through the collaboration of these two modules, our network can effectively handle partial occlusion. In addition, the modality feature enhancement module shares parameters between the two modalities to exploit modality-shared cues. To address tracking failures caused by sudden camera motion, we introduce a re-sampling strategy that improves tracking robustness. Extensive experiments on three RGBT tracking benchmark datasets show that our method outperforms other state-of-the-art tracking algorithms.
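The abstract does not give implementation details, but the channel- and spatial-level enhancement it describes can be sketched as squeeze-and-excitation-style gating: a channel branch that reweights feature channels from globally pooled statistics, and a spatial branch that reweights locations from a channel-pooled map. The sketch below is a minimal illustration under that assumption; the function names (`channel_enhance`, `spatial_enhance`), the weights `w1`/`w2`, and the reduction ratio are hypothetical and are not taken from the paper:

```python
import numpy as np

def _sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_enhance(feat, w1, w2):
    """Reweight channels of feat (C, H, W) via SE-style gating.

    w1: (C // r, C) squeeze weights, w2: (C, C // r) excitation weights,
    where r is a hypothetical channel-reduction ratio.
    """
    z = feat.mean(axis=(1, 2))          # global average pooling -> (C,)
    s = np.maximum(w1 @ z, 0.0)         # bottleneck FC + ReLU -> (C // r,)
    a = _sigmoid(w2 @ s)                # FC + sigmoid gate in (0, 1) -> (C,)
    return feat * a[:, None, None]      # scale each channel by its gate

def spatial_enhance(feat):
    """Reweight spatial locations of feat (C, H, W) via a channel-pooled map."""
    m = feat.mean(axis=0)               # channel-wise mean -> (H, W)
    a = _sigmoid(m)                     # spatial gate in (0, 1)
    return feat * a[None, :, :]         # scale every channel by the same map
```

Because both gates lie in (0, 1), enhancement here means suppressing less informative channels and locations relative to the rest; applying the same `w1`/`w2` to the RGB and thermal feature maps would correspond to the parameter sharing across modalities mentioned in the abstract.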
Acknowledgements
This work was supported in part by the Natural Science Research Project of the Anhui Education Department (Grant No. KJ2019A0005), the Open Project of the School of Mathematical Sciences, Anhui University (Grant No. KF2019A03), and the National Natural Science Foundation of China (Grant No. 62076003).
Ethics declarations
Ethics Approval and Consent to Participate
This article does not contain any studies with animals performed by any of the authors.
Competing Interests
The authors declare that there is no conflict of interest regarding the publication of this paper.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally to this work.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhai, S., Wu, Y., Liu, L. et al. RGBT Tracking based on modality feature enhancement. Multimed Tools Appl 83, 29311–29330 (2024). https://doi.org/10.1007/s11042-023-16418-2