Towards Highly Effective Moving Tiny Ball Tracking via Vision Transformer

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14864)

Abstract

Recent deep-neural-network methods for tracking tiny balls have made significant progress. However, because fast-moving balls in video are often motion-blurred, most existing methods cannot track them accurately: their receptive fields and sampling depth are too limited. Furthermore, as high-resolution match footage becomes increasingly common, existing methods perform poorly on high-resolution images. To address these issues, we present a strong baseline for tiny-ball tracking called TrackFormer. First, we build the network architecture on a Vision Transformer, whose strong spatial mining ability improves tiny-ball localization. Second, we develop a Global Context Sampling Module (GCSM) that captures more powerful global features, increasing the accuracy of tiny-ball identification. Finally, we design a Context Enhancement Module (CEM) that enhances tiny-ball semantics for robust tracking. To promote research on tiny-ball tracking, we also establish LaTBT, a large-scale tiny-ball tracking dataset. LaTBT covers three types of tiny balls (badminton, tennis, and squash) and offers more than 300 video sequences with over 223K annotations drawn from 19 types of professional matches, posing varied tracking challenges against diverse and complex backgrounds. To our knowledge, LaTBT is the first large-scale dataset for tiny-ball tracking. Experiments demonstrate that our baseline achieves state-of-the-art performance on the proposed benchmark. The dataset and code are available at https://github.com/Gi-gigi/TrackFormer.
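
The abstract names TrackFormer's building blocks (a Vision Transformer backbone, the GCSM, the CEM, and a localization output) but not how they are wired together. The PyTorch sketch below is an illustrative assumption of one plausible arrangement, not the authors' implementation (the released code is at the GitHub link above): every module body, layer size, and the stacked three-frame input are invented for the example, with the GCSM rendered as a global pooling gate and the CEM as dilated convolutions feeding a heatmap head.

# Hypothetical sketch of a TrackFormer-style pipeline (PyTorch). Only the
# component names come from the paper; every module body, layer size, and the
# stacked three-frame input below are illustrative assumptions.
import torch
import torch.nn as nn

class GlobalContextSampling(nn.Module):
    # Assumed GCSM: pool a global descriptor and use it as a per-channel gate,
    # so each spatial location is modulated by image-wide context.
    def __init__(self, dim):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.gate = nn.Sequential(nn.Conv2d(dim, dim, 1), nn.Sigmoid())

    def forward(self, x):                      # x: (B, C, H, W)
        return x * self.gate(self.pool(x))     # gate broadcasts over H and W

class ContextEnhancement(nn.Module):
    # Assumed CEM: parallel dilated 3x3 convolutions enlarge the receptive
    # field to strengthen the weak semantics of a tiny, motion-blurred ball.
    def __init__(self, dim):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(dim, dim, 3, padding=d, dilation=d) for d in (1, 2, 4))
        self.fuse = nn.Conv2d(3 * dim, dim, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

class TinyBallTracker(nn.Module):
    # A patch embedding plus a plain transformer encoder stand in for the ViT
    # backbone; the head upsamples to a per-pixel ball-location heatmap.
    def __init__(self, dim=128, patch=16, frames=3):
        super().__init__()
        self.embed = nn.Conv2d(3 * frames, dim, patch, stride=patch)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.gcsm = GlobalContextSampling(dim)
        self.cem = ContextEnhancement(dim)
        self.head = nn.Sequential(
            nn.Upsample(scale_factor=patch, mode="bilinear"),
            nn.Conv2d(dim, 1, 1))              # 1-channel heatmap logits

    def forward(self, clip):                   # clip: (B, 3*frames, H, W)
        f = self.embed(clip)                   # (B, C, H/p, W/p)
        b, c, h, w = f.shape
        tokens = self.encoder(f.flatten(2).transpose(1, 2))
        f = tokens.transpose(1, 2).reshape(b, c, h, w)
        return self.head(self.cem(self.gcsm(f)))

model = TinyBallTracker()
out = model(torch.randn(1, 9, 256, 256))       # three stacked RGB frames
print(out.shape)                               # torch.Size([1, 1, 256, 256])

Predicting a dense heatmap rather than a bounding box is a common choice for targets only a few pixels wide, since a small bright blob is easier to supervise and to read off at high resolution than precise box extents.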

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China under Grant 61672128, and in part by the Dalian Key Field Innovation Team Support Plan under Grant 2020RT07.

Author information

Corresponding author

Correspondence to Jizhe Yu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Yu, J., Liu, Y., Wei, H., Xu, K., Cao, Y., Li, J. (2024). Towards Highly Effective Moving Tiny Ball Tracking via Vision Transformer. In: Huang, DS., Si, Z., Pan, Y. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol 14864. Springer, Singapore. https://doi.org/10.1007/978-981-97-5588-2_31

  • DOI: https://doi.org/10.1007/978-981-97-5588-2_31

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-5587-5

  • Online ISBN: 978-981-97-5588-2

  • eBook Packages: Computer Science, Computer Science (R0)
