Abstract
Tennis is popular worldwide, which has driven growing interest in recognizing tennis motions from 3D video. Early action recognition, in which an action is classified before it has been completed, is a critical task when adverse outcomes must be preempted. Prior research has focused on feature extraction and modeling that support fast, accurate classification despite limited data. To provide a more robust foundation, this study places an anticipatory action prediction module ahead of the recognition component. The module forecasts subsequent motions from the observed ones, using an LSTM-GAN structure to mitigate motion blurring and generate predictions. The resulting framework leverages deep learning, in particular dilated neural networks, for real-time spatio-temporal tennis analysis on standard hardware, and uses TensorFlow with the aim of improving player performance insights and action prediction. A dilated RNN and a CNN are integrated into the recognition module for comprehensive spatio-temporal feature modeling. To foster synergy between the prediction and recognition modules, a hard class mining mechanism is devised to improve learning on samples from challenging classes. The combined LSTM and GAN architecture achieves 92.1 precision, 91.2 recall, 94.5 F1 score and 95.0 accuracy in tennis action recognition and prediction, which is significantly higher than the classical models compared, namely GAN, Conv3DJ, Co-occurrence LSTM, and GAN + L1 + Mining.
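The four evaluation figures quoted in the abstract are standard classification metrics. As a minimal sketch of how such figures are typically derived, the Python snippet below computes macro-averaged precision, recall and F1 score together with overall accuracy from true and predicted labels. The function name `macro_metrics`, the stroke labels and the toy data are illustrative assumptions; the macro-averaging scheme is also an assumption, since the abstract does not state which averaging the authors used.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return (precision, recall, f1, accuracy), macro-averaged over classes.

    Hypothetical helper for illustration; not taken from the paper.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        # F1 is the harmonic mean of precision and recall for this class.
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n, accuracy


# Toy example with assumed stroke labels (not data from the study).
y_true = ["serve", "serve", "forehand", "backhand"]
y_pred = ["serve", "forehand", "forehand", "backhand"]
precision, recall, f1, accuracy = macro_metrics(y_true, y_pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f} Acc={accuracy:.3f}")
```

Macro averaging weights every stroke class equally, which is one common choice when some classes have few samples; a micro or weighted average would give different numbers on imbalanced data.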
Data availability
Not applicable.
Funding
No funding was provided for the completion of this study.
Ethics declarations
Conflict of interest
The authors have no financial or proprietary interests in any material discussed in this article. The authors declare that they have no conflict of interest.
Ethical approval
Not applicable.
Informed consent
Not applicable.
About this article
Cite this article
Sun, X., Wang, Y. & Khan, J. Hybrid LSTM and GAN model for action recognition and prediction of lawn tennis sport activities. Soft Comput 27, 18093–18112 (2023). https://doi.org/10.1007/s00500-023-09215-4
DOI: https://doi.org/10.1007/s00500-023-09215-4