Abstract
Three-dimensional human motion synthesis is a key technology in computer animation and multimedia applications. Human motion is high-dimensional and strongly shaped by individual personality and emotion, so automatically synthesizing diverse and lifelike 3D human motion data remains a challenging task. To address this challenge, this paper proposes a human motion synthesis framework based on hierarchical learning recurrent neural networks (HL-RNN). The framework comprises a low-level network and a high-level network, which extract the path information of the movement and the spatio-temporal relationships of the human skeletal structure, respectively. The extracted features are then fused to generate motions that satisfy the path constraints. The method can synthesize high-quality human motion that follows a specified trajectory, produce smooth transitions between different motions, and generate motion in different styles. Experiments show that, compared with several recent methods, the proposed method significantly improves the quality and generalization performance of motion synthesis.
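The two-branch structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the plain tanh recurrent cell, the concatenation-based fusion, the linear output layer, the random untrained weights, and all names and sizes (`init_cell`, `run_cell`, `synthesize`, `PATH_DIM`, `POSE_DIM`, `HIDDEN`) are assumptions chosen only to show how a path branch and a skeleton branch might be combined into one pose output per frame.

```python
import math
import random

random.seed(0)

def init_cell(input_dim, hidden_dim):
    """Random weights for a plain tanh recurrent cell (an assumed stand-in cell type)."""
    def mat(rows, cols):
        return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"W_in": mat(input_dim, hidden_dim),
            "W_h": mat(hidden_dim, hidden_dim),
            "b": [0.0] * hidden_dim}

def vec_mat(v, m):
    """Multiply a row vector by a matrix stored as a list of rows."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def run_cell(cell, sequence):
    """Run the cell over a sequence of input vectors; return one hidden state per frame."""
    h = [0.0] * len(cell["b"])
    states = []
    for x in sequence:
        a = vec_mat(x, cell["W_in"])
        r = vec_mat(h, cell["W_h"])
        h = [math.tanh(ai + ri + bi) for ai, ri, bi in zip(a, r, cell["b"])]
        states.append(h)
    return states

def synthesize(path, poses, low, high, W_out):
    """Fuse both branches by concatenation (an assumption) and map to a pose per frame."""
    path_feat = run_cell(low, path)     # low-level branch: path information
    pose_feat = run_cell(high, poses)   # high-level branch: skeleton over time
    return [vec_mat(p + q, W_out) for p, q in zip(path_feat, pose_feat)]

# Hypothetical sizes: 8 frames, 3 path values per frame, 4 joints x 3 values per pose.
T, PATH_DIM, POSE_DIM, HIDDEN = 8, 3, 12, 16
low = init_cell(PATH_DIM, HIDDEN)
high = init_cell(POSE_DIM, HIDDEN)
W_out = [[random.uniform(-0.1, 0.1) for _ in range(POSE_DIM)] for _ in range(2 * HIDDEN)]

path = [[random.gauss(0, 1) for _ in range(PATH_DIM)] for _ in range(T)]
poses = [[random.gauss(0, 1) for _ in range(POSE_DIM)] for _ in range(T)]
frames = synthesize(path, poses, low, high, W_out)
print(len(frames), len(frames[0]))  # 8 12
```

Because the weights are untrained, the output values are meaningless as motion; the sketch only shows the data flow from two input streams through separate recurrent branches to a fused per-frame pose vector.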
Acknowledgement
This work was supported by the Key Program of NSFC (Grant No. U1908214); Special Project of Central Government Guiding Local Science and Technology Development (Grant No. 2021JH6/10500140); the Program for the Liaoning Distinguished Professor; the Program for Science and Technology Innovation Fund of Dalian (Grant No. 2020JJ25CY001); the Project for Technology Innovation Team of Liaoning Province and Dalian University.
Cite this article
Zhou, D., Guo, C., Liu, R. et al. Hierarchical learning recurrent neural networks for 3D motion synthesis. Int. J. Mach. Learn. & Cyber. 12, 2255–2267 (2021). https://doi.org/10.1007/s13042-021-01304-w
DOI: https://doi.org/10.1007/s13042-021-01304-w