Siamsdt: a self-adaptive dynamic template siamese network for airborne visual tracking of MAVs on heterogeneous FPGA-SoC

Zhang, Yuxin; Wen, Jiazheng; Wu, Ran; Liu, Huanyu; Li, Junbao

doi:10.1007/s11227-025-06928-1

Published: 11 February 2025

The Journal of Supercomputing
Volume 81, article number 481 (2025)

Abstract

Airborne visual tracking is pivotal in enhancing the autonomy and intelligence of micro aerial vehicles (MAVs). In practice, however, MAVs frequently encounter challenges such as viewpoint changes and interference from similar objects. In addition, their small size and light weight limit onboard computational resources, which significantly constrains algorithm complexity and degrades tracking performance. To address these issues, we propose a robust and lightweight tracking model, the self-adaptive dynamic template Siamese network (SiamSDT). Leveraging two key designs, a temporal attention mechanism and a Self-adaptive Template Fusion module, SiamSDT adapts to appearance variations during tracking. Specifically, the temporal attention mechanism integrates historical information sequentially, retaining pertinent information while reducing storage and computational complexity. The Self-adaptive Template Fusion module dynamically adjusts the fusion ratio of each template through a similarity matrix, further enhancing the model's adaptability and anti-interference capability. To cope with limited onboard resources, we also propose a deployment solution tailored to heterogeneous ZYNQ platforms, in which an FPGA-based accelerator speeds up inference through pipelining, data reuse, ping-pong operation and array partitioning. We evaluated SiamSDT on the OTB and UAV123 datasets. On UAV123, SiamSDT achieves a 4.8% increase in precision and a 1.2% increase in success rate over the baseline algorithm without any increase in parameters. Hardware simulation experiments demonstrate that our deployment scheme significantly reduces inference latency with an acceptable decrease in tracking performance.
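
The abstract states that the Self-adaptive Template Fusion module adjusts each template's fusion ratio through a similarity matrix, but the preview does not give the formulation. The following is a minimal sketch of one plausible reading, not the authors' implementation: each stored template feature is compared with every other template by cosine similarity, each template's mean similarity is passed through a softmax, and the resulting weights blend the templates into one. The function names, the mean-then-softmax scoring, the flat feature vectors and the `temperature` parameter are all assumptions introduced for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(templates: List[Vector]) -> List[List[float]]:
    """Pairwise cosine similarities between all stored templates."""
    return [[cosine_similarity(a, b) for b in templates] for a in templates]


def fusion_weights(templates: List[Vector], temperature: float = 1.0) -> List[float]:
    """Assumed scoring rule: softmax over each template's mean similarity to all templates."""
    sim = similarity_matrix(templates)
    scores = [sum(row) / len(row) for row in sim]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_templates(templates: List[Vector], temperature: float = 1.0) -> List[float]:
    """Blend the templates into one feature vector using the similarity-derived weights."""
    weights = fusion_weights(templates, temperature)
    dim = len(templates[0])
    return [sum(w * t[i] for w, t in zip(weights, templates)) for i in range(dim)]


if __name__ == "__main__":
    # Hypothetical 2-D template features; the third one is an outlier (e.g. a distractor).
    history = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    print("weights:", fusion_weights(history))
    print("fused:  ", fuse_templates(history))
```

Under these assumptions, a template that disagrees with the rest of the history receives a smaller fusion ratio, which is one way a similarity matrix could suppress interference from similar objects. A lower `temperature` makes the weighting more selective.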

Data availability

All experiments were conducted using publicly accessible datasets.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62271166 and 62401177).

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Yikuang Street, Harbin, 150001, Heilongjiang Province, China
Yuxin Zhang, Jiazheng Wen, Ran Wu, Huanyu Liu & Junbao Li

Contributions

Conceptualization, J.L. and H.L.; methodology, H.L., Y.Z. and R.W.; software, Y.Z.; validation, Y.Z., J.W. and R.W.; formal analysis, Y.Z. and H.L.; writing—original draft preparation, Y.Z. and J.W.; writing—review and editing, Y.Z., H.L. and J.L.; funding acquisition, J.L. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Huanyu Liu.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Ethics approval and consent to participate

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Y., Wen, J., Wu, R. et al. Siamsdt: a self-adaptive dynamic template siamese network for airborne visual tracking of MAVs on heterogeneous FPGA-SoC. J Supercomput 81, 481 (2025). https://doi.org/10.1007/s11227-025-06928-1

Accepted: 09 January 2025
Published: 11 February 2025
DOI: https://doi.org/10.1007/s11227-025-06928-1

Part of a collection:

Section - Advanced Embedded Systems
