Abstract
This paper presents a novel deep learning-based framework for translating a motion into various styles across multiple domains. Our framework is a single set of generative adversarial networks that learns stylistic features from a collection of unpaired motion clips with style labels, and thereby supports mapping between multiple style domains. We model a motion sequence as a spatio-temporal graph and employ spatial-temporal graph convolutional networks (ST-GCN) to extract stylistic properties along both the spatial and temporal dimensions. Owing to this spatial-temporal modeling, our framework produces improved style translation results between significantly different actions and on long motion sequences containing multiple actions. In addition, we are the first to introduce a mapping network for motion stylization, which maps random noise to style codes and thus generates diverse stylization results without using reference motions. Through various experiments, we demonstrate the ability of our method to generate improved results in terms of visual quality, stylistic diversity, and content preservation.
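To make the abstract's two core components concrete, the following is a minimal PyTorch sketch, not the authors' released implementation: an ST-GCN block that aggregates joint features over the skeleton graph and then convolves along time (following Yan et al. 2018), and a StarGAN v2-style mapping network that turns random noise into a per-domain style code (following Choi et al. 2020). The joint count, edge list, and all channel/dimension sizes are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of the two components the abstract
# names: an ST-GCN block and a noise-to-style mapping network.
import torch
import torch.nn as nn

NUM_JOINTS = 21                    # assumed skeleton size
EDGES = [(0, 1), (1, 2), (2, 3)]   # placeholder bone list; a real skeleton has more

def normalized_adjacency(num_joints, edges):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A+I) D^-1/2."""
    A = torch.eye(num_joints)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d = A.sum(dim=1)
    D_inv_sqrt = torch.diag(d.pow(-0.5))
    return D_inv_sqrt @ A @ D_inv_sqrt

class STGCNBlock(nn.Module):
    """One spatial graph convolution followed by a temporal convolution,
    in the spirit of Yan et al. 2018."""
    def __init__(self, in_ch, out_ch, A, t_kernel=9):
        super().__init__()
        self.register_buffer("A", A)                  # (J, J) normalized adjacency
        self.spatial = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.temporal = nn.Conv2d(out_ch, out_ch,
                                  kernel_size=(t_kernel, 1),
                                  padding=(t_kernel // 2, 0))
        self.act = nn.ReLU()

    def forward(self, x):              # x: (batch, channels, frames, joints)
        x = self.spatial(x)            # mix channels per joint
        x = torch.einsum("nctj,jk->nctk", x, self.A)  # aggregate over skeleton edges
        return self.act(self.temporal(x))             # convolve along time

class MappingNetwork(nn.Module):
    """Maps a noise vector z to one style code per style domain; the code for
    the target domain is selected at run time (StarGAN v2-style)."""
    def __init__(self, z_dim=16, style_dim=64, num_domains=3, hidden=512):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(z_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, hidden), nn.ReLU())
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, style_dim) for _ in range(num_domains)])

    def forward(self, z, domain):      # z: (batch, z_dim), domain: (batch,) long
        h = self.shared(z)
        codes = torch.stack([head(h) for head in self.heads], dim=1)  # (batch, D, style)
        return codes[torch.arange(z.size(0)), domain]

# Sanity check on random data: 2 clips, 3 channels (e.g. joint rotations), 60 frames.
A = normalized_adjacency(NUM_JOINTS, EDGES)
feats = STGCNBlock(3, 64, A)(torch.randn(2, 3, 60, NUM_JOINTS))
style = MappingNetwork()(torch.randn(2, 16), torch.tensor([0, 2]))
print(feats.shape, style.shape)  # torch.Size([2, 64, 60, 21]) torch.Size([2, 64])
```

The per-domain output heads are what let a single mapping network serve every style domain: sampling different noise vectors for the same target domain yields different style codes, which is the source of the diverse stylization results the abstract describes.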
Supplemental Material
Supplemental movie, appendix, image, and software files for "Diverse Motion Stylization for Multiple Style Domains via Spatial-Temporal Graph-Based Generative Model".
References
- Kfir Aberman, Yijia Weng, Dani Lischinski, Daniel Cohen-Or, and Baoquan Chen. 2020. Unpaired motion style transfer from video to animation. ACM Transactions on Graphics (TOG) 39, 4 (2020), Article 64.
- Andreas Aristidou, Qiong Zeng, Efstathios Stavrakis, KangKang Yin, Daniel Cohen-Or, Yiorgos Chrysanthou, and Baoquan Chen. 2017. Emotion control of unstructured dance movements. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation. 1--10.
- Kyungjune Baek, Yunjey Choi, Youngjung Uh, Jaejun Yoo, and Hyunjung Shim. 2020. Rethinking the truly unsupervised image-to-image translation. arXiv preprint arXiv:2006.06500 (2020).
- Jacky CP Chan and Edmond SL Ho. 2021. Emotion transfer for 3D hand and full body motion using StarGAN. Computers 10, 3 (2021), 38.
- Yunjey Choi, Youngjung Uh, Jaejun Yoo, and Jung-Woo Ha. 2020. StarGAN v2: Diverse image synthesis for multiple domains. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8188--8197.
- Jeff Donahue and Karen Simonyan. 2019. Large scale adversarial representation learning. arXiv preprint arXiv:1907.02544 (2019).
- Yuzhu Dong, Andreas Aristidou, Ariel Shamir, Moshe Mahler, and Eakta Jain. 2020. Adult2child: Motion style transfer using CycleGANs. In Motion, Interaction and Games. 1--11.
- Han Du, Erik Herrmann, Janis Sprenger, Klaus Fischer, and Philipp Slusallek. 2019. Stylistic locomotion modeling and synthesis using variational generative models. In Motion, Interaction and Games. 1--10.
- Leon A Gatys, Alexander S Ecker, and Matthias Bethge. 2015. A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015).
- Leon A Gatys, Alexander S Ecker, and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2414--2423.
- Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems 30 (2017).
- Daniel Holden, Ikhsanul Habibie, Ikuo Kusajima, and Taku Komura. 2017. Fast neural style transfer for motion data. IEEE Computer Graphics and Applications 37, 4 (2017), 42--49.
- Daniel Holden, Jun Saito, and Taku Komura. 2016. A deep learning framework for character motion synthesis and editing. ACM Transactions on Graphics (TOG) 35, 4 (2016), 1--11.
- Eugene Hsu, Kari Pulli, and Jovan Popović. 2005. Style translation for human motion. In ACM SIGGRAPH 2005 Papers. 1082--1089.
- Xun Huang and Serge Belongie. 2017. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE International Conference on Computer Vision. 1501--1510.
- Xun Huang, Ming-Yu Liu, Serge Belongie, and Jan Kautz. 2018. Multimodal unsupervised image-to-image translation. arXiv preprint arXiv:1804.04732 (2018).
- Leslie Ikemoto, Okan Arikan, and David Forsyth. 2009. Generalizing motion edits with Gaussian processes. ACM Transactions on Graphics (TOG) 28, 1 (2009), 1--12.
- Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. 2017. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1125--1134.
- Diederik P Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR).
- Hsin-Ying Lee, Hung-Yu Tseng, Jia-Bin Huang, Maneesh Singh, and Ming-Hsuan Yang. 2018. Diverse image-to-image translation via disentangled representations. In Proceedings of the European Conference on Computer Vision (ECCV). 35--51.
- Hsin-Ying Lee, Hung-Yu Tseng, Qi Mao, Jia-Bin Huang, Yu-Ding Lu, Maneesh Singh, and Ming-Hsuan Yang. 2020. DRIT++: Diverse image-to-image translation via disentangled representations. International Journal of Computer Vision 128, 10 (2020), 2402--2417.
- Yanghao Li, Naiyan Wang, Jiaying Liu, and Xiaodi Hou. 2017. Demystifying neural style transfer. arXiv preprint arXiv:1701.01036 (2017).
- Jongin Lim, Hyung Jin Chang, and Jin Young Choi. 2019. PMnet: Learning of disentangled pose and movement for unsupervised motion retargeting. In BMVC. 136.
- Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, and Jan Kautz. 2019. Few-shot unsupervised image-to-image translation. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 10551--10560.
- Ian Mason, Sebastian Starke, He Zhang, Hakan Bilen, and Taku Komura. 2018. Few-shot learning of homogeneous human locomotion styles. In Computer Graphics Forum, Vol. 37. Wiley Online Library, 143--153.
- Jianyuan Min, Huajun Liu, and Jinxiang Chai. 2010. Synthesis and editing of personalized stylistic human motion. In Proceedings of the 2010 ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games. 39--46.
- Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. 2015. ImageNet large scale visual recognition challenge. International Journal of Computer Vision (IJCV) 115, 3 (2015), 211--252. https://doi.org/10.1007/s11263-015-0816-y
- Harrison Jesse Smith, Chen Cao, Michael Neff, and Yingying Wang. 2019. Efficient neural networks for real-time motion style transfer. Proceedings of the ACM on Computer Graphics and Interactive Techniques 2, 2 (2019), 1--17.
- Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. 2016. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2818--2826.
- Graham W Taylor and Geoffrey E Hinton. 2009. Factored conditional restricted Boltzmann machines for modeling motion style. In Proceedings of the 26th Annual International Conference on Machine Learning. 1025--1032.
- Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. 2017. Improved texture networks: Maximizing quality and diversity in feed-forward stylization and texture synthesis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6924--6932.
- Xinyao Wang, Liefeng Bo, and Li Fuxin. 2019. Adaptive wing loss for robust face alignment via heatmap regression. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 6971--6981.
- Shihong Xia, Congyi Wang, Jinxiang Chai, and Jessica Hodgins. 2015. Realtime style transfer for unlabeled heterogeneous human motion. ACM Transactions on Graphics (TOG) 34, 4 (2015), 1--10.
- Sijie Yan, Yuanjun Xiong, and Dahua Lin. 2018. Spatial temporal graph convolutional networks for skeleton-based action recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32.
- Dingdong Yang, Seunghoon Hong, Yunseok Jang, Tianchen Zhao, and Honglak Lee. 2019. Diversity-sensitive conditional generative adversarial networks. In International Conference on Learning Representations. https://openreview.net/forum?id=rJliMh09F7
- M Ersin Yumer and Niloy J Mitra. 2016. Spectral style transfer for human motion between independent actions. ACM Transactions on Graphics (TOG) 35, 4 (2016), 1--8.
- Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision. 2223--2232.