research-article

P²ANet: A Large-Scale Benchmark for Dense Action Detection from Table Tennis Match Broadcasting Videos

Authors:

Qingzhong Wang,

Haoyi Xiong

ACM Transactions on Multimedia Computing, Communications and Applications, Volume 20, Issue 4

Article No.: 118, Pages 1 - 23

https://doi.org/10.1145/3633516

Published: 11 January 2024

Abstract

While deep learning is widely used for video analytics such as video classification and action detection, dense action detection with fast-moving subjects in sports videos remains challenging. In this work, we release a new sports video benchmark, P²ANet, for Ping Pong-Action detection. It consists of 2,721 video clips collected from broadcast videos of professional table tennis matches at the World Table Tennis Championships and Olympiads. We worked with a crew of table tennis professionals and referees, using a specially designed annotation toolbox, to obtain fine-grained action labels (in 14 classes) for every ping-pong action that appears in the dataset, and we formulate two sets of action detection problems: action localization and action recognition. We evaluate a number of common action recognition models (e.g., TSM, TSN, Video Swin Transformer, and SlowFast) and action localization models (e.g., BSN, BSN++, BMN, and TCANet) on P²ANet for both problems under various settings. These models achieve only 48% area under the AR-AN curve for localization and 82% top-one accuracy for recognition, because ping-pong actions are dense and involve fast-moving subjects while the broadcast videos run at only 25 FPS. The results confirm that P²ANet remains a challenging benchmark and can serve as a dedicated testbed for dense action detection from videos. We invite readers to examine the dataset at https://github.com/Fred1991/P2ANET.
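The two headline metrics in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function names, the tIoU thresholds, and the proposal budget (AN) are illustrative assumptions, and the toy segments are made up. Average recall (AR) is taken here as recall averaged over several temporal IoU thresholds using the top-AN ranked proposals, and the AR-AN area is a trapezoidal sum normalised by the AN range.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec); hypothetical representation


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at(gt: List[Segment], proposals: List[Segment], tiou: float) -> float:
    """Fraction of ground-truth segments matched by at least one proposal."""
    if not gt:
        return 0.0
    hits = sum(1 for g in gt if any(temporal_iou(g, p) >= tiou for p in proposals))
    return hits / len(gt)


def average_recall(gt: List[Segment], ranked: List[Segment], an: int,
                   thresholds: Sequence[float]) -> float:
    """AR@AN: recall averaged over tIoU thresholds, using the top-`an` proposals."""
    top = ranked[:an]
    return sum(recall_at(gt, top, t) for t in thresholds) / len(thresholds)


def ar_an_auc(gt: List[Segment], ranked: List[Segment], max_an: int,
              thresholds: Sequence[float]) -> float:
    """Trapezoidal area under AR vs. AN for AN = 1..max_an, scaled to [0, 1]."""
    ars = [average_recall(gt, ranked, an, thresholds) for an in range(1, max_an + 1)]
    if max_an == 1:
        return ars[0]
    area = sum((ars[i] + ars[i + 1]) / 2 for i in range(max_an - 1))
    return area / (max_an - 1)


def top1_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Share of clips whose highest-scoring class equals the annotated class."""
    return sum(p == t for p, t in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    # Toy data (assumed): two annotated strokes, three ranked proposals.
    gt = [(0.0, 1.0), (2.0, 3.0)]
    ranked = [(0.0, 1.0), (5.0, 6.0), (2.0, 3.0)]
    print(ar_an_auc(gt, ranked, max_an=3, thresholds=[0.5, 0.75]))
    print(top1_accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"]))
```

Benchmark toolkits typically average these quantities over all videos and use a larger proposal budget and a finer threshold grid. The sketch handles a single video to keep the arithmetic easy to follow.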

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Activity recognition and understanding
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 20, Issue 4

April 2024

676 pages

EISSN: 1551-6865

DOI: 10.1145/3613617

Editor:
Abdulmotaleb El Saddik
Mohamed Bin Zayed University of Artificial Intelligence, UAE and University of Ottawa, Canada

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 January 2024

Online AM: 28 November 2023

Accepted: 21 October 2023

Revised: 20 September 2023

Received: 27 March 2023

Published in TOMM Volume 20, Issue 4
