
Towards High-Quality 3D Motion Transfer with Realistic Apparel Animation

  • Conference paper
Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Animating stylized characters to match a reference motion sequence is in high demand in the film and gaming industries. Existing methods mostly focus on rigid deformations of characters' bodies, neglecting local deformations of the apparel driven by physical dynamics. They deform apparel in the same way as the body, leading to results with limited detail and unrealistic artifacts, e.g., body-apparel penetration. In contrast, we present a novel method aiming for high-quality motion transfer with realistic apparel animation. As existing datasets lack the annotations necessary for generating realistic apparel animations, we build a new dataset named MMDMC, which combines stylized characters from the MikuMikuDance community with real-world motion capture data. We then propose a data-driven pipeline that learns to disentangle body and apparel deformations via two neural deformation modules. For body parts, we propose a geodesic attention block that effectively incorporates semantic priors into skeletal body deformation to handle the complex body shapes of stylized characters. Since apparel motion can deviate significantly from that of its respective body joints, we propose to model apparel deformation as a non-linear vertex displacement field conditioned on its historical states. Extensive experiments show that our method produces results of superior quality for various types of apparel. Our dataset is released at https://github.com/rongakowang/MMDMC.
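The disentanglement described in the abstract can be illustrated with a minimal two-branch sketch (all function names and the NumPy formulation below are illustrative assumptions, not the paper's implementation): body vertices follow standard linear blend skinning driven by joint transforms, while apparel vertices additionally receive a per-vertex displacement, standing in for the learned non-linear displacement field.

```python
import numpy as np

def linear_blend_skinning(rest_verts, weights, joint_transforms):
    """Rigid skeletal deformation: each vertex is moved by a per-vertex
    blend of 4x4 joint transforms applied to its rest position."""
    n = rest_verts.shape[0]
    homo = np.concatenate([rest_verts, np.ones((n, 1))], axis=1)        # (N, 4)
    # Blend the (J, 4, 4) joint transforms with skinning weights (N, J).
    blended = weights @ joint_transforms.reshape(len(joint_transforms), 16)
    blended = blended.reshape(n, 4, 4)                                  # (N, 4, 4)
    posed = np.einsum('nij,nj->ni', blended, homo)[:, :3]
    return posed

def deform_character(rest_verts, weights, joint_transforms,
                     apparel_mask, displacement):
    """Body vertices follow skinning only; apparel vertices additionally
    receive a displacement (a stand-in for the learned field, which in the
    paper is conditioned on the apparel's historical states)."""
    posed = linear_blend_skinning(rest_verts, weights, joint_transforms)
    posed[apparel_mask] += displacement[apparel_mask]
    return posed
```

The point of the split is that body motion stays rigidly bound to the skeleton, while apparel detail lives entirely in the residual displacement term.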


Notes

  1. Since joints are often defined inside the mesh but not on the surface, we associate each joint with its closest vertex on the mesh surface as an anchor vertex for geodesic distance computation.
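This anchoring step can be sketched as follows (a hedged illustration, not the authors' code: `joint_geodesic_distances`, the edge-graph construction, and the use of Dijkstra shortest paths as a geodesic approximation are all assumptions):

```python
import numpy as np
import networkx as nx

def joint_geodesic_distances(verts, edges, joints):
    """Approximate geodesic distance from each joint to every mesh vertex.

    Since joints lie inside the mesh rather than on its surface, each joint
    is first snapped to its closest surface vertex (the anchor), then
    shortest paths are computed on the mesh edge graph with Euclidean edge
    lengths as weights.
    """
    g = nx.Graph()
    for i, j in edges:
        g.add_edge(i, j, weight=float(np.linalg.norm(verts[i] - verts[j])))
    dists = np.full((len(joints), len(verts)), np.inf)
    for k, joint in enumerate(joints):
        # Anchor vertex: closest surface vertex to the (interior) joint.
        anchor = int(np.argmin(np.linalg.norm(verts - joint, axis=1)))
        for v, d in nx.single_source_dijkstra_path_length(g, anchor).items():
            dists[k, v] = d
    return dists
```

Graph shortest paths only approximate true surface geodesics, but they respect mesh connectivity, which is what matters for distinguishing, say, a sleeve from the torso it hangs near.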


Acknowledgements

This research is funded in part by an ARC Discovery Grant DP220100800 on human body pose estimation and visual sign language recognition.

Author information

Corresponding author

Correspondence to Rong Wang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 29811 KB)

Supplementary material 2 (mp4 43265 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, R., Mao, W., Lu, C., Li, H. (2025). Towards High-Quality 3D Motion Transfer with Realistic Apparel Animation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15097. Springer, Cham. https://doi.org/10.1007/978-3-031-72933-1_3

  • DOI: https://doi.org/10.1007/978-3-031-72933-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72932-4

  • Online ISBN: 978-3-031-72933-1

  • eBook Packages: Computer Science, Computer Science (R0)
