Adversarial learning for modeling human motion

  • Original Article
  • The Visual Computer

Abstract

We investigate how adversarial learning can be used for various animation tasks related to human motion synthesis. We propose a learning framework that we instantiate in several variants to address different needs: a random synthesis generator that produces realistic motion capture trajectories from random noise; conditional variants that allow controlling the synthesis through high-level features that the animation should match; and a style transfer model that transforms an existing animation into the style of another one. Our work builds on the adversarial learning strategy introduced in the machine learning field in 2014 for learning accurate generative models of complex data, which has been shown to provide impressive results, mainly on image data. We report both objective and subjective evaluation results on motion capture data of actions performed under various emotions, the Emilya dataset. Our results show the potential of our proposals for building models for a variety of motion synthesis tasks.
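
To make the adversarial learning principle concrete, the sketch below shows a minimal GAN over fixed-length motion capture sequences: a recurrent generator maps a noise vector to a trajectory of pose features, and a recurrent discriminator is trained to tell real trajectories from generated ones. This is an illustrative sketch only, written with tf.keras; the layer sizes, sequence length, feature dimension and training loop are assumptions chosen for exposition, not the architecture used in the paper.

    # Illustrative GAN sketch for fixed-length motion sequences
    # (hypothetical dimensions; not the authors' actual architecture).
    import tensorflow as tf
    from tensorflow.keras import layers, Model

    SEQ_LEN, N_FEATS, Z_DIM = 100, 69, 64   # frames, pose features per frame, noise size

    def make_generator():
        z = layers.Input(shape=(Z_DIM,))
        h = layers.RepeatVector(SEQ_LEN)(z)                 # broadcast noise along time
        h = layers.GRU(128, return_sequences=True)(h)
        x = layers.TimeDistributed(layers.Dense(N_FEATS))(h)  # one pose vector per frame
        return Model(z, x, name="generator")

    def make_discriminator():
        x = layers.Input(shape=(SEQ_LEN, N_FEATS))
        h = layers.GRU(128)(x)                              # summarize the whole trajectory
        p = layers.Dense(1, activation="sigmoid")(h)        # probability "real"
        return Model(x, p, name="discriminator")

    G, D = make_generator(), make_discriminator()
    bce = tf.keras.losses.BinaryCrossentropy()
    g_opt, d_opt = tf.keras.optimizers.Adam(1e-4), tf.keras.optimizers.Adam(1e-4)

    @tf.function
    def train_step(real_batch):
        z = tf.random.normal([tf.shape(real_batch)[0], Z_DIM])
        with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
            fake = G(z, training=True)
            d_real = D(real_batch, training=True)
            d_fake = D(fake, training=True)
            # discriminator: real -> 1, fake -> 0; generator: non-saturating loss
            d_loss = bce(tf.ones_like(d_real), d_real) + bce(tf.zeros_like(d_fake), d_fake)
            g_loss = bce(tf.ones_like(d_fake), d_fake)
        d_opt.apply_gradients(zip(d_tape.gradient(d_loss, D.trainable_variables),
                                  D.trainable_variables))
        g_opt.apply_gradients(zip(g_tape.gradient(g_loss, G.trainable_variables),
                                  G.trainable_variables))
        return d_loss, g_loss

In the conditional variants described in the abstract, both networks would additionally receive the high-level control features as input, and a style transfer model would similarly condition the generation on style information extracted from another sequence.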

Notes

  1. Code can be found here: https://bit.ly/2LD0A6w.

Acknowledgements

We warmly thank Catherine Pélachaud (CNRS, France) for providing the Emilya dataset. Part of this work was carried out within the framework of the French ANR project Deep in France (ANR-16-CE23-0006). The thesis of author Qi Wang is funded by the China Scholarship Council.

Author information

Corresponding author

Correspondence to Thierry Artières.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 127 KB)

Supplementary material 2 (mp4 44929 KB)

Supplementary material 3 (mp4 24630 KB)

Supplementary material 4 (mp4 108408 KB)

About this article

Cite this article

Wang, Q., Artières, T., Chen, M. et al. Adversarial learning for modeling human motion. Vis Comput 36, 141–160 (2020). https://doi.org/10.1007/s00371-018-1594-7
