Abstract
Recent deep-neural-network-based methods for tracking tiny balls have made significant progress. However, because fast-moving balls in video are often motion-blurred, most existing methods cannot track them accurately due to limited receptive fields and sampling depth. Furthermore, as high-resolution match videos become increasingly common, existing methods perform poorly on high-resolution images. To address these issues, we present TrackFormer, a strong baseline for tiny ball tracking. First, we build the entire network on a Vision Transformer and exploit its powerful spatial modeling to improve tiny ball localization. Second, we develop a Global Context Sampling Module (GCSM) that captures more powerful global features, improving the accuracy of tiny ball identification. Finally, we design a Context Enhancement Module (CEM) that enhances tiny ball semantics for robust tracking performance. To promote research on tiny ball tracking, we establish LaTBT, a Large-scale Tiny Ball Tracking dataset. LaTBT covers three types of tiny balls (badminton, tennis, and squash) and provides more than 300 video sequences with over 223K annotations from 19 types of professional matches, addressing diverse tracking challenges in complex backgrounds. To our knowledge, LaTBT is the first large-scale dataset for tiny ball tracking. Experiments demonstrate that our baseline achieves state-of-the-art performance on the proposed benchmark. The dataset and code are available at https://github.com/Gi-gigi/TrackFormer.
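Trackers in this line of work (e.g. TrackNet-style networks) typically output a per-frame confidence heatmap for the ball rather than a bounding box. As a minimal illustration of that output stage only (a sketch under our own assumptions, not TrackFormer's actual code; the function name and threshold are hypothetical), decoding a predicted heatmap into a ball coordinate could look like:

```python
import numpy as np

def decode_heatmap(heatmap, conf_thresh=0.5):
    """Decode a single-channel heatmap of shape (H, W) into a ball position.

    Returns (x, y) of the peak if its confidence exceeds the threshold,
    or None when the ball is judged absent or occluded in this frame.
    Illustrative sketch only; not the paper's implementation.
    """
    peak = heatmap.max()
    if peak < conf_thresh:
        return None  # no confident detection in this frame
    # Locate the peak; unravel_index converts the flat argmax to (row, col).
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(x), int(y)

# Toy example: a 5x5 heatmap with a confident peak at column 3, row 1.
hm = np.zeros((5, 5))
hm[1, 3] = 0.9
print(decode_heatmap(hm))            # -> (3, 1)
print(decode_heatmap(np.zeros((5, 5))))  # -> None
```

Running the decoded coordinates through a short temporal smoother is a common follow-up step for blurred or briefly occluded balls, though the specific post-processing used by TrackFormer is not described in the abstract.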
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant 61672128, and in part by the Dalian Key Field Innovation Team Support Plan under Grant 2020RT07.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yu, J., Liu, Y., Wei, H., Xu, K., Cao, Y., Li, J. (2024). Towards Highly Effective Moving Tiny Ball Tracking via Vision Transformer. In: Huang, DS., Si, Z., Pan, Y. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol 14864. Springer, Singapore. https://doi.org/10.1007/978-981-97-5588-2_31
DOI: https://doi.org/10.1007/978-981-97-5588-2_31
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5587-5
Online ISBN: 978-981-97-5588-2
eBook Packages: Computer Science (R0)