This study proposes FLSTrack, an end-to-end multi-object tracking algorithm that integrates Focused Linear Attention with dual decoders. The algorithm aims to address three limitations of current multi-object tracking methods: poor performance in complex scenarios, inadequate data association, and high computational complexity. First, the Swin Transformer is paired with a Focused Linear Attention module to strengthen the network's extraction of both local and global information while reducing computational cost. Second, a dual-branch decoder based on window attention is developed, with one branch dedicated to tracking and the other to detecting targets in image frames. To further increase speed, the computationally expensive re-identification (ReID) feature network is replaced with the BYTE data association method. To compensate for the loss of appearance features that results from omitting the ReID network, the SIoU loss function is introduced, significantly improving target localization accuracy. Experiments on the MOT17, MOT20, DanceTrack, and KITTI datasets show that FLSTrack achieves superior performance. Moreover, with an inference speed approaching 30 FPS, the algorithm strikes a favorable balance between tracking accuracy and real-time performance.
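To make the BYTE association step concrete, the following is a minimal sketch of its two-stage matching idea: high-confidence detections are matched to existing tracks first, and low-confidence detections are then matched to the tracks that remain. It is an illustration under stated assumptions, not the FLSTrack implementation. The function names (`iou`, `greedy_match`, `byte_associate`) and the threshold values are hypothetical, greedy IoU matching stands in for the Hungarian assignment normally used, and track boxes are taken as given rather than predicted by a motion model.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(
    tracks: Dict[int, Box], dets: List[Tuple[int, Box]], min_iou: float
) -> Dict[int, int]:
    """Pair track ids with detection indices in order of descending IoU.

    Assumption: greedy matching replaces the Hungarian assignment here
    to keep the sketch dependency-free.
    """
    pairs = sorted(
        ((iou(tb, db), tid, di) for tid, tb in tracks.items() for di, db in dets),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, di in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be matched
        if tid in matches or di in used:
            continue
        matches[tid] = di
        used.add(di)
    return matches


def byte_associate(
    tracks: Dict[int, Box],
    boxes: List[Box],
    scores: List[float],
    high_thresh: float = 0.6,  # hypothetical threshold values
    low_thresh: float = 0.1,
    min_iou: float = 0.3,
):
    """Two-stage association: high-score detections first, then low-score ones."""
    indexed = list(enumerate(zip(boxes, scores)))
    high = [(i, b) for i, (b, s) in indexed if s >= high_thresh]
    low = [(i, b) for i, (b, s) in indexed if low_thresh <= s < high_thresh]

    # Stage 1: match all tracks against high-confidence detections.
    first = greedy_match(tracks, high, min_iou)

    # Stage 2: match still-unmatched tracks against low-confidence detections.
    remaining = {tid: b for tid, b in tracks.items() if tid not in first}
    second = greedy_match(remaining, low, min_iou)

    matches = {**first, **second}
    matched_dets = set(matches.values())
    # Only unmatched high-confidence detections may start new tracks.
    new_dets = [i for i, _ in high if i not in matched_dets]
    lost = [tid for tid in tracks if tid not in matches]
    return matches, new_dets, lost
```

The second stage is what lets a track survive when its target is partially occluded and its detection score drops: the low-score box is still used to continue an existing track, but it is never allowed to start a new one.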
This work was supported in part by the Science and Technology Foundation of Guizhou Province (Grant No. QKHJC-ZK[2024]063), and in part by the National Natural Science Foundation of China (Grant No. 62266011).
Conceptualization: Dafu Zu, Guangqian Kong; Methodology: Dafu Zu, Xun Duan; Formal analysis and investigation: Dafu Zu, Huiyun Long; Writing - original draft preparation: Dafu Zu; Writing - review and editing: Guangqian Kong, Xun Duan; Funding acquisition: Guangqian Kong, Xun Duan, Huiyun Long; Supervision: Guangqian Kong, Xun Duan, Huiyun Long.
Zu, D., Duan, X., Kong, G. et al. FLSTrack: focused linear attention swin-transformer network with dual-branch decoder for end-to-end multi-object tracking. SIViP 19, 25 (2025). https://doi.org/10.1007/s11760-024-03676-2