
Diverse Motion Stylization for Multiple Style Domains via Spatial-Temporal Graph-Based Generative Model

Published: 27 September 2021

Abstract

This paper presents a novel deep learning-based framework for translating a motion into various styles across multiple domains. Our framework is a single set of generative adversarial networks that learns stylistic features from a collection of unpaired motion clips with style labels, supporting mappings between multiple style domains. We model a motion sequence as a spatio-temporal graph and employ spatial-temporal graph convolutional networks (ST-GCN) to extract stylistic properties along both the spatial and temporal dimensions. Through this spatio-temporal modeling, our framework produces improved style translation results between significantly different actions and on long motion sequences containing multiple actions. In addition, we develop the first mapping network for motion stylization, which maps random noise to a style and thereby generates diverse stylization results without requiring reference motions. Through various experiments, we demonstrate that our method produces improved results in terms of visual quality, stylistic diversity, and content preservation.
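To make the two components named in the abstract concrete, the following is a minimal PyTorch sketch of (1) a spatial-temporal graph convolution over a skeleton sequence and (2) a mapping network that turns random noise into a per-domain style code. It is not the authors' implementation: the class names STGraphConv and MappingNetwork, the layer sizes, the joint count, and the tensor layout are illustrative assumptions only.

# Sketch under assumed shapes: motion tensors are (batch, channels, frames, joints).
import torch
import torch.nn as nn

class STGraphConv(nn.Module):
    """One ST-GCN-style block: spatial graph conv (via adjacency A) + temporal conv."""
    def __init__(self, in_ch, out_ch, num_joints, t_kernel=9):
        super().__init__()
        # Learnable stand-in for the skeleton adjacency matrix (joints x joints).
        self.A = nn.Parameter(torch.eye(num_joints))
        self.spatial = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.temporal = nn.Conv2d(out_ch, out_ch,
                                  kernel_size=(t_kernel, 1),
                                  padding=(t_kernel // 2, 0))

    def forward(self, x):
        # x: (batch, channels, frames, joints)
        x = self.spatial(x)
        x = torch.einsum('nctv,vw->nctw', x, self.A)  # mix features along the skeleton graph
        return self.temporal(x)                        # mix features along time

class MappingNetwork(nn.Module):
    """Maps a random noise vector to a style code for each style domain."""
    def __init__(self, noise_dim=16, style_dim=64, num_domains=4):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(noise_dim, 256), nn.ReLU(),
                                    nn.Linear(256, 256), nn.ReLU())
        # One output head per style domain.
        self.heads = nn.ModuleList(nn.Linear(256, style_dim)
                                   for _ in range(num_domains))

    def forward(self, z, domain):
        h = self.shared(z)
        return self.heads[domain](h)

# Usage sketch: an ST-GCN block over a 120-frame, 21-joint clip, and two noise
# vectors mapped into the same domain yielding two different style codes, which
# is how diverse stylizations can arise without reference motions.
motion = torch.randn(1, 3, 120, 21)
features = STGraphConv(in_ch=3, out_ch=64, num_joints=21)(motion)
mapper = MappingNetwork()
s1 = mapper(torch.randn(1, 16), domain=2)
s2 = mapper(torch.randn(1, 16), domain=2)

In the actual framework, the style code would condition the motion generator; the details of that conditioning are not specified in the abstract and are omitted here.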




Published in

Proceedings of the ACM on Computer Graphics and Interactive Techniques, Volume 4, Issue 3
September 2021, 268 pages
EISSN: 2577-6193
DOI: 10.1145/3488568

        Copyright © 2021 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States



        Qualifiers

        • research-article
        • Research
        • Refereed
