
Hierarchical learning recurrent neural networks for 3D motion synthesis

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics

Abstract

Three-dimensional human motion synthesis is one of the key technologies in computer animation and multimedia applications. Human motion is highly individual, emotionally expressive, and high-dimensional, so automatically synthesizing diverse and lifelike 3D human motion data remains a challenging task. To address this challenge, this paper proposes a human motion synthesis framework based on hierarchical learning recurrent neural networks (HL-RNN). The framework comprises a low-level network and a high-level network, which extract the path information of the movement and the spatio-temporal relationships of the human bone structure, respectively; after fusion, motions that satisfy the path constraints can be generated. The method not only synthesizes high-quality human movements that follow a specified trajectory, but also produces smooth transitions between different movements and can be used to synthesize data in different motion styles. Experiments show that, compared with several recent methods, the proposed method significantly improves the quality and generalization performance of motion synthesis.
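To make the two-level design described above concrete, below is a minimal PyTorch sketch of a hierarchical recurrent model: one GRU encodes the control path, a second GRU encodes the skeletal pose sequence, and a small decoder fuses the two streams to predict poses that follow the path. All module names, feature dimensions, and the concatenation-based fusion are illustrative assumptions; the paper's exact architecture, layer sizes, and fusion scheme are not reproduced here.

```python
# Minimal sketch of a two-level recurrent architecture in the spirit of HL-RNN.
# All names, dimensions, and the fusion-by-concatenation step are assumptions
# made for illustration, not the authors' implementation.
import torch
import torch.nn as nn

class HierarchicalMotionRNN(nn.Module):
    def __init__(self, path_dim=4, pose_dim=66, hidden=256):
        super().__init__()
        # Low-level network: encodes the control path (e.g. root trajectory) over time.
        self.path_rnn = nn.GRU(path_dim, hidden, batch_first=True)
        # High-level network: models the spatio-temporal structure of the skeleton poses.
        self.pose_rnn = nn.GRU(pose_dim, hidden, batch_first=True)
        # Fusion: combine both streams and decode the pose for each frame.
        self.decoder = nn.Sequential(
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, pose_dim),
        )

    def forward(self, path_seq, pose_seq):
        # path_seq: (batch, T, path_dim), pose_seq: (batch, T, pose_dim)
        path_feat, _ = self.path_rnn(path_seq)
        pose_feat, _ = self.pose_rnn(pose_seq)
        fused = torch.cat([path_feat, pose_feat], dim=-1)
        return self.decoder(fused)  # poses intended to satisfy the path constraint

# Example: 120 frames, a hypothetical 4-D path feature and a 22-joint (3 channels) pose.
model = HierarchicalMotionRNN()
path = torch.randn(1, 120, 4)
poses = torch.randn(1, 120, 66)
out = model(path, poses)
print(out.shape)  # torch.Size([1, 120, 66])
```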





Acknowledgement

This work was supported by the Key Program of NSFC (Grant No. U1908214); Special Project of Central Government Guiding Local Science and Technology Development (Grant No. 2021JH6/10500140); the Program for the Liaoning Distinguished Professor; the Program for Science and Technology Innovation Fund of Dalian (Grant No. 2020JJ25CY001); the Project for Technology Innovation Team of Liaoning Province and Dalian University.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Dongsheng Zhou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (MP4 30083 kb)


About this article


Cite this article

Zhou, D., Guo, C., Liu, R. et al. Hierarchical learning recurrent neural networks for 3D motion synthesis. Int. J. Mach. Learn. & Cyber. 12, 2255–2267 (2021). https://doi.org/10.1007/s13042-021-01304-w


  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13042-021-01304-w
