Abstract:
RGBT tracking is often affected by complex scenes (e.g., occlusions, scale changes, and noisy backgrounds). Existing works usually adopt a single-strategy fusion scheme to handle modality fusion in all scenarios. However, because the capacity of such a fusion model is limited, it is difficult to fully integrate the discriminative features of the different modalities. To tackle this problem, we propose a Fusion Tree Network (FTNet), which provides a high-capacity, multi-strategy fusion model to efficiently fuse the different modalities. Specifically, we combine three kinds of attention modules (i.e., channel attention, spatial attention, and location attention) in a tree structure to achieve multi-path hybrid attention in the deeper convolutional stages of the object tracking network. Extensive experiments are performed on three RGBT tracking datasets, and the results show that our method achieves superior performance compared with state-of-the-art RGBT tracking models.
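The idea of multi-path hybrid attention arranged as a tree can be illustrated with a minimal NumPy sketch. This is not the FTNet architecture: the abstract does not specify the tree layout, how each attention module is defined, or how the paths are merged. Everything here is an assumption made for illustration, including the function names (`channel_attention`, `spatial_attention`, `location_attention`, `fusion_tree`), the parameter-free attention gates, the additive fusion at the root, the two-level tree, and the averaging of the leaves. In particular, `location_attention` is a placeholder, since the abstract does not define that module.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    # feat has shape (C, H, W). Gate each channel by its global average response.
    weights = sigmoid(feat.mean(axis=(1, 2), keepdims=True))
    return feat * weights


def spatial_attention(feat):
    # Gate each spatial position by its average response across channels.
    weights = sigmoid(feat.mean(axis=0, keepdims=True))
    return feat * weights


def location_attention(feat):
    # Placeholder (assumption): per-channel softmax over spatial positions.
    c, h, w = feat.shape
    flat = feat.reshape(c, -1)
    flat = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(flat)
    probs /= probs.sum(axis=1, keepdims=True)
    return feat * probs.reshape(c, h, w)


ATTENTION_MODULES = (channel_attention, spatial_attention, location_attention)


def fusion_tree(rgb_feat, thermal_feat):
    # Root: naive additive fusion of the two modalities (assumption).
    root = rgb_feat + thermal_feat
    leaves = []
    # Level 1: one branch per attention module.
    for first in ATTENTION_MODULES:
        branch = first(root)
        # Level 2: each branch splits into the two remaining modules,
        # giving six root-to-leaf paths with different attention orderings.
        for second in ATTENTION_MODULES:
            if second is not first:
                leaves.append(second(branch))
    # Merge the paths by averaging the leaves (assumption).
    return np.mean(leaves, axis=0)


rgb = np.random.rand(4, 8, 8)
thermal = np.random.rand(4, 8, 8)
fused = fusion_tree(rgb, thermal)
print(fused.shape)
```

Each root-to-leaf path applies a different ordered pair of attention modules, which is one plausible reading of "multi-strategy" fusion. A trainable version would replace the fixed gates with learned layers and would likely learn how to weight the paths rather than average them.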
Published in: 2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)
Date of Conference: 29 November 2022 - 02 December 2022
Date Added to IEEE Xplore: 24 November 2022