Abstract
Stroke recognition in table tennis is challenging because of the variety of movements involved. Many different sensors have been adopted in robotic table tennis to detect the players’ movements. In this paper, we propose a two-stage approach that recognizes the movement of the table tennis racket directly. In the first stage, a bounding box around the racket is extracted from an RGB image. In the second stage, an efficient and lightweight CNN architecture regresses the 3D position of the racket by fusing the cropped image with the 3D rotation data from an IMU. Combined with the rotation data, this yields a robust 6D racket pose at a frame rate of 100 Hz. In the experiments, two datasets are collected from our KUKA table tennis robot for evaluation and comparison; they show a position error of 4.7 cm at a range of 6 m. A behavior cloning experiment is performed to illustrate the potential of this work.
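The final step described above, combining a regressed 3D position with an IMU rotation into a single 6D pose, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names (`quat_to_rotmat`, `compose_pose`, `transform_point`) are hypothetical, the rotation is assumed to arrive as a unit quaternion in (w, x, y, z) order, and the IMU and camera frames are assumed to be already aligned. The fusion inside the CNN is not shown.

```python
import math


def quat_to_rotmat(w, x, y, z):
    """Convert a quaternion (w, x, y, z) into a 3x3 rotation matrix.

    The quaternion is normalized first, so small drift in the IMU
    output does not produce a non-orthogonal matrix.
    """
    n = math.sqrt(w * w + x * x + y * y + z * z)
    if n == 0.0:
        raise ValueError("zero-norm quaternion")
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def compose_pose(position, quaternion):
    """Build a 4x4 homogeneous transform (the 6D pose).

    position   -- (x, y, z) in metres, e.g. the output of a position regressor
    quaternion -- (w, x, y, z), e.g. the orientation reported by an IMU
    """
    rot = quat_to_rotmat(*quaternion)
    px, py, pz = position
    return [
        rot[0] + [px],
        rot[1] + [py],
        rot[2] + [pz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(pose, point):
    """Map a point given in the racket frame into the reference frame."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in pose[:3])


if __name__ == "__main__":
    # Hypothetical values: racket 3 m away, rotated 90 degrees about the z axis.
    half = math.pi / 4
    pose = compose_pose((0.5, 0.0, 3.0), (math.cos(half), 0.0, 0.0, math.sin(half)))
    print(transform_point(pose, (1.0, 0.0, 0.0)))
```

Keeping the pose as a homogeneous transform makes it straightforward to map racket-frame points, such as the blade centre or edge, into the reference frame with a single matrix product.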
Supported by the Vector Stiftung and KUKA.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Gao, Y., Tebbe, J., Zell, A. (2021). Robust Stroke Recognition via Vision and IMU in Robotic Table Tennis. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science, vol 12891. Springer, Cham. https://doi.org/10.1007/978-3-030-86362-3_31
DOI: https://doi.org/10.1007/978-3-030-86362-3_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86361-6
Online ISBN: 978-3-030-86362-3
eBook Packages: Computer Science, Computer Science (R0)