Hybrid LSTM and GAN model for action recognition and prediction of lawn tennis sport activities

  • Application of soft computing
Soft Computing

Abstract

Tennis has gained global popularity, prompting a surge of interest in 3D video-based tennis motion recognition. Early action recognition, which classifies an activity before it is complete, is a critical task for preempting adverse outcomes. Prior research emphasizes effective feature extraction and modeling for fast, accurate classification despite limited data availability. To establish a robust foundation, this study introduces an anticipatory action prediction module that precedes the recognition component. The module forecasts subsequent motions from observed ones, using an LSTM-GAN structure to mitigate motion blurring and generate predictions. The paper presents a framework that uses deep learning, particularly dilated neural networks, implemented in TensorFlow, for real-time spatio-temporal tennis analysis on standard hardware, with the aim of improving player performance insights and action prediction. A dilated RNN and a CNN are integrated into the recognition module for comprehensive spatio-temporal feature modeling. To foster synergy between the prediction and recognition modules, a hard class mining mechanism is devised to improve learning on samples from challenging classes. The LSTM architecture combined with a GAN achieves 92.1 precision, 91.2 recall, 94.5 F1 score, and 95.0 accuracy in action recognition and prediction for tennis, which is significantly higher than the compared models, namely GAN, Conv3DJ, Co-occurrence LSTM, and GAN + L1 + Mining.
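
The abstract does not specify how the hard class mining mechanism is computed. As a rough illustration only, the sketch below shows one common way such a mechanism can be realized: per-class accuracy is measured on a batch, and classes that are recognized poorly receive larger loss weights. The function names, the `gamma` exponent, and the weighting formula are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def per_class_accuracy(y_true: Sequence[int], y_pred: Sequence[int],
                       num_classes: int) -> List[float]:
    """Fraction of samples of each class that were classified correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Classes absent from the batch are treated as fully correct,
    # since the batch gives no evidence that they are hard.
    return [correct[c] / total[c] if total[c] else 1.0
            for c in range(num_classes)]


def hard_class_weights(accuracies: Sequence[float],
                       gamma: float = 1.0) -> List[float]:
    """Turn per-class accuracy into loss weights with a mean of 1.

    Each weight is proportional to (1 - accuracy + eps) ** gamma, so
    classes with lower accuracy (harder classes) get larger weights.
    This formula is a hypothetical choice, not the paper's.
    """
    eps = 1e-6  # keeps weights strictly positive for perfect classes
    raw = [(1.0 - a + eps) ** gamma for a in accuracies]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]


# Toy batch with three action classes; class 2 is the hardest.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 0, 1, 1, 1, 0, 2, 0, 1, 0]

accuracies = per_class_accuracy(y_true, y_pred, num_classes=3)
weights = hard_class_weights(accuracies)
for cls, (acc, w) in enumerate(zip(accuracies, weights)):
    print(f"class {cls}: accuracy={acc:.2f} weight={w:.3f}")
```

In a training loop, weights of this kind would typically multiply the per-sample classification loss, so that gradient updates focus more on the classes the recognizer currently gets wrong. Normalizing the weights to a mean of 1 keeps the overall loss scale roughly unchanged.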

Data availability

Not applicable.

Funding

No funding was provided for the completion of this study.

Author information

Corresponding author

Correspondence to Yong Wang.

Ethics declarations

Conflict of interest

The authors have no financial or proprietary interests in any material discussed in this article. The authors declare that they have no conflict of interest.

Ethical approval

Not applicable.

Informed consent

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, X., Wang, Y. & Khan, J. Hybrid LSTM and GAN model for action recognition and prediction of lawn tennis sport activities. Soft Comput 27, 18093–18112 (2023). https://doi.org/10.1007/s00500-023-09215-4

  • DOI: https://doi.org/10.1007/s00500-023-09215-4
