Loading [MathJax]/extensions/TeX/extpfeil.js
Adaptive Multi-Scale Transformer Tracker for Satellite Videos | IEEE Journals & Magazine | IEEE Xplore
Scheduled Maintenance: On Monday, 27 January, the IEEE Xplore Author Profile management portal will undergo scheduled maintenance from 9:00-11:00 AM ET (1400-1600 UTC). During this time, access to the portal will be unavailable. We apologize for any inconvenience.

Adaptive Multi-Scale Transformer Tracker for Satellite Videos


Abstract:

Satellite video tracking tasks are often characterized by blurred foreground boundaries in vast scenes, a wide range of targets varying in scale, and irregular changes in...Show More

Abstract:

Satellite video tracking tasks are often characterized by blurred foreground boundaries in vast scenes, a wide range of targets varying in scale, and irregular changes in appearance. These challenges significantly impact the optimization of robust tracker performance. Therefore, it is imperative to extract diverse features with dynamic adaptive learning capabilities for the target being tracked in each sequence. In this article, we explore a novel adaptive multi-scale Transformer (MT) tracker for satellite videos to explore the potential spatiotemporal information of the target effectively. Specifically, a multi-scale spatial Transformer (MSST) is designed to leverage stage-by-stage spatial reduction and channel doubling, thereby enhancing the representation capabilities for the tracked target. In dynamic feature learning, an adaptive temporal Transformer (ATT) is then introduced based on multiple cross attentions, which analyzes the adaptive learning capacity for the dynamic target. It analyzes the weight proportion of different attentions automatically in the specific sequence through the learnable parameters. Finally, a multi-scale feature (MSF) regression module is crafted to improve the positioning accuracy of targets with low pixel counts in satellite scenes. This module accomplishes precise annotation of target boxes by effectively fusing features from diverse stages. We evaluate the proposed tracker performance on several public satellite datasets, including SatSOT, SV248S, and VISO. Experimental results show that the performance of our model can be comparable to the state-of-the-art trackers.
Article Sequence Number: 4800116
Date of Publication: 09 August 2024

ISSN Information:

Funding Agency:


References

References is not available for this document.