DOI: 10.1145/3623264.3624473
short-paper

Video-Based Motion Retargeting Framework between Characters with Various Skeleton Structure

Published: 15 November 2023

Abstract

We introduce a motion retargeting framework that animates characters with different skeletal structures from video data. Prior studies have successfully retargeted motion between skeletons with different structures, but retargeting the noisy, unnatural motion data extracted from monocular videos has remained challenging. To address this, we propose a deep learning framework that retargets motion data obtained from easily accessible monocular videos to characters with diverse skeletal structures. The approach is intended to support individual creators of character animation.
The framework pre-processes motion data extracted from multiple monocular videos with two-stage pose estimation and uses the result as the training dataset for the Skeleton-Aware Motion Retargeting Network (SAMRN). We also introduce a loss function on the rotation angle of the character’s root node to address the rotation issue inherent in SAMRN. In addition, by incorporating motion data extracted from videos and adding a loss function on the velocities of the character’s root node and end-effectors, the method generates natural motion data that closely follows the source video. Qualitative and quantitative evaluations demonstrate the effectiveness of the framework for retargeting motion from monocular videos to various characters.
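The abstract does not give the form of the two added loss terms. As a rough illustration only, the sketch below shows one plausible way to express a velocity loss on selected joints (root and end-effectors) and a root rotation loss. Everything in it is an assumption rather than the paper's formulation: the function names, the finite-difference velocity, the mean-squared-error form, the `fps` parameter, the use of a single yaw angle for the root, and the premise that both motions are already expressed at a common scale.

```python
import numpy as np


def finite_difference_velocity(positions, fps=30.0):
    """Approximate per-frame velocity for positions of shape (frames, joints, 3)."""
    return (positions[1:] - positions[:-1]) * fps


def velocity_loss(pred_pos, target_pos, joint_ids, fps=30.0):
    """Mean squared error between velocities of the selected joints.

    joint_ids is a hypothetical list of indices for the root and end-effectors.
    Assumes both motions share a common scale (an assumption of this sketch).
    """
    v_pred = finite_difference_velocity(pred_pos[:, joint_ids], fps)
    v_tgt = finite_difference_velocity(target_pos[:, joint_ids], fps)
    return float(np.mean((v_pred - v_tgt) ** 2))


def root_rotation_loss(pred_yaw, target_yaw):
    """Mean squared angular difference in radians, wrapped to (-pi, pi].

    Wrapping keeps a full turn (2*pi) from being penalized as a large error.
    """
    diff = pred_yaw - target_yaw
    wrapped = np.arctan2(np.sin(diff), np.cos(diff))
    return float(np.mean(wrapped ** 2))


if __name__ == "__main__":
    frames, joints = 4, 3
    moving = np.zeros((frames, joints, 3))
    moving[:, 0, 0] = np.arange(frames)  # root translates one unit per frame along x
    static = np.zeros((frames, joints, 3))
    print(velocity_loss(moving, static, joint_ids=[0]))
    print(root_rotation_loss(np.array([np.pi / 2]), np.array([0.0])))
```

In a training setup such terms would typically be weighted and summed with the network's existing losses; the weights and the choice of end-effector joints depend on the skeleton and are not specified in the abstract.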

Supplementary Material

Supplemental material and supplemental video (supplemental_material.pdf)


Cited By

  • (2024) A System for Retargeting Human Motion to Robot with Augmented Feedback via a Digital Twin Setup. 2024 10th International Conference on Control, Automation and Robotics (ICCAR), 95-100. DOI: 10.1109/ICCAR61844.2024.10569840. Online publication date: 27-Apr-2024


    Published In

    MIG '23: Proceedings of the 16th ACM SIGGRAPH Conference on Motion, Interaction and Games
    November 2023
    224 pages
    ISBN:9798400703935
    DOI:10.1145/3623264
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. character animation
    2. motion retargeting
    3. neural networks

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Conference

    MIG '23