DOI: 10.1145/3689061.3689073
research-article
Open access

SynthNet: Leveraging Synthetic Data for 3D Trajectory Estimation from Monocular Video

Published: 28 October 2024

Abstract

Reconstructing 3D trajectories from video is often cumbersome and expensive, typically requiring complex or multi-camera setups. This paper proposes SynthNet, an end-to-end pipeline for monocular reconstruction of 3D tennis ball trajectories. The pipeline consists of two parts: hit and bounce detection, and 3D trajectory reconstruction. A GRU-based model performs the hit and bounce detection, segmenting each video into individual shots. A fully connected neural network then reconstructs the 3D trajectory, trained with a novel physics-based approach that relies purely on synthetic data. Training loops that rely on Euler time integration and camera projections are unstable; our synthetic approach circumvents this by calculating the loss directly from the estimated initial conditions, which improves stability and performance.

In experiments, SynthNet is compared to an existing reconstruction baseline on a number of conventional and customized metrics defined to validate our synthetic approach. SynthNet outperforms the baseline on our own proposed metrics and in a qualitative inspection of the reconstructed 3D trajectories.
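The idea of training on purely synthetic shots with a loss taken directly on the initial conditions can be illustrated with a minimal sketch. This is not the authors' implementation: the drag-free closed-form ballistic model, the pinhole camera placement, the sampling ranges, and all function names (`simulate`, `project`, `make_sample`, `initial_condition_loss`) are assumptions made for illustration only. The point it shows is that each synthetic sample pairs a projected 2D track with its known 3D initial position and velocity, so a network's output can be compared against those six values without time integration or camera projection inside the training loop.

```python
import random

G = 9.81  # gravitational acceleration in m/s^2


def simulate(p0, v0, dt, steps):
    """Closed-form, drag-free ballistic positions at t = 0, dt, 2*dt, ...

    Assumption: drag and spin are ignored to keep the sketch short.
    """
    points = []
    for k in range(steps):
        t = k * dt
        x = p0[0] + v0[0] * t
        y = p0[1] + v0[1] * t
        z = p0[2] + v0[2] * t - 0.5 * G * t * t
        points.append((x, y, z))
    return points


def project(point, focal=1000.0, cam=(0.0, -20.0, 5.0)):
    """Hypothetical pinhole camera at `cam`, looking along +y with z up."""
    dx = point[0] - cam[0]
    depth = point[1] - cam[1]
    dz = point[2] - cam[2]
    return (focal * dx / depth, focal * dz / depth)


def make_sample(rng, dt=1.0 / 30.0, steps=20):
    """Draw random initial conditions and return (2D track, 6-value label)."""
    # Sampling ranges are illustrative, not taken from the paper.
    p0 = (rng.uniform(-4, 4), rng.uniform(-10, 0), rng.uniform(0.5, 2.5))
    v0 = (rng.uniform(-3, 3), rng.uniform(10, 30), rng.uniform(-2, 8))
    track_2d = [project(p) for p in simulate(p0, v0, dt, steps)]
    return track_2d, p0 + v0  # label = (x0, y0, z0, vx0, vy0, vz0)


def initial_condition_loss(pred, target):
    """Mean squared error computed directly on the initial conditions."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(target)


if __name__ == "__main__":
    rng = random.Random(0)
    track, label = make_sample(rng)
    # A trained network would map `track` to a 6-value estimate; here the
    # label is perturbed only to show how the loss is evaluated.
    guess = tuple(v + 0.1 for v in label)
    print(len(track), initial_condition_loss(guess, label))
```

In this setup the 2D track is the model input and the six initial-condition values are the regression target, so the loss never depends on a numerically integrated or re-projected trajectory.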


Published In

MMSports '24: Proceedings of the 7th ACM International Workshop on Multimedia Content Analysis in Sports
October 2024
113 pages
ISBN: 9798400711985
DOI: 10.1145/3689061
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. 3d reconstruction
  2. ball tracking
  3. computer vision
  4. differential equations
  5. machine learning in sports
  6. neural network
  7. synthetic data

Qualifiers

  • Research-article

Conference

MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024
Melbourne VIC, Australia

Acceptance Rates

Overall Acceptance Rate 29 of 49 submissions, 59%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 213
  • Downloads (Last 12 months): 213
  • Downloads (Last 6 weeks): 64

Reflects downloads up to 05 Mar 2025
