Abstract
Animating stylized characters to match a reference motion sequence is a task in high demand in the film and gaming industries. Existing methods mostly focus on rigid deformation of characters' bodies, neglecting local deformation of the apparel driven by physical dynamics. They deform apparel in the same way as the body, leading to results with limited detail and unrealistic artifacts, e.g., body-apparel penetration. In contrast, we present a novel method aiming for high-quality motion transfer with realistic apparel animation. As existing datasets lack the annotations necessary for generating realistic apparel animations, we build a new dataset named MMDMC, which combines stylized characters from the MikuMikuDance community with real-world motion capture data. We then propose a data-driven pipeline that learns to disentangle body and apparel deformations via two neural deformation modules. For body parts, we propose a geodesic attention block that effectively incorporates semantic priors into skeletal body deformation, handling the complex body shapes of stylized characters. Since apparel motion can deviate significantly from that of the corresponding body joints, we model apparel deformation as a non-linear vertex displacement field conditioned on its past states. Extensive experiments show that our method produces results of superior quality for various types of apparel. Our dataset is released at https://github.com/rongakowang/MMDMC.
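To make the geodesic attention idea concrete, below is a minimal PyTorch sketch of vertex-to-joint cross-attention biased by precomputed geodesic distances. The class name, tensor shapes, and the linear distance bias are our illustrative assumptions, not the paper's exact block design.

```python
import torch
import torch.nn as nn

class GeodesicAttention(nn.Module):
    """Vertex-to-joint cross-attention biased by geodesic distance.

    A sketch of the idea only: mesh vertices attend to skeleton joints,
    and a learnable penalty on the vertex-to-joint geodesic distance
    makes geodesically nearby joints dominate the attention weights.
    """

    def __init__(self, feat_dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        # Learnable scale converting geodesic distance into an attention bias.
        self.dist_scale = nn.Parameter(torch.tensor(1.0))

    def forward(self, vert_feats, joint_feats, geo_dist):
        # vert_feats:  (B, V, C) per-vertex features
        # joint_feats: (B, J, C) per-joint features
        # geo_dist:    (B, V, J) geodesic distance from each vertex to each
        #              joint's anchor vertex on the mesh surface
        bias = -self.dist_scale * geo_dist
        # Expand to one additive mask per attention head, as PyTorch expects.
        bias = bias.repeat_interleave(self.attn.num_heads, dim=0)
        out, _ = self.attn(vert_feats, joint_feats, joint_feats, attn_mask=bias)
        return out
```

In use, `geo_dist` would come from shortest-path distances on the mesh edge graph between each vertex and each joint's anchor vertex (see the note below).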
Notes
1. Since joints are often defined inside the mesh rather than on its surface, we associate each joint with its closest vertex on the mesh surface as an anchor vertex for geodesic distance computation.
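For concreteness, here is a minimal sketch of how such anchor-based vertex-to-joint geodesic distances could be precomputed with Dijkstra shortest paths on the mesh edge graph. The function name and signature are illustrative, not part of the released code.

```python
import networkx as nx
import numpy as np

def joint_geodesic_distances(verts, edges, joints):
    """Per-joint geodesic distance to every mesh vertex (illustrative).

    verts:  (V, 3) array of vertex positions
    edges:  iterable of (i, j) vertex-index pairs forming the mesh edges
    joints: (J, 3) array of joint positions (typically inside the mesh)
    """
    graph = nx.Graph()
    for i, j in edges:
        # Weight each edge by its Euclidean length so that shortest paths
        # approximate geodesic distance along the surface.
        graph.add_edge(int(i), int(j),
                       weight=float(np.linalg.norm(verts[i] - verts[j])))

    dists = np.full((len(joints), len(verts)), np.inf)
    for k, joint in enumerate(joints):
        # Anchor vertex: the surface vertex closest to the interior joint.
        anchor = int(np.argmin(np.linalg.norm(verts - joint, axis=1)))
        lengths = nx.single_source_dijkstra_path_length(graph, anchor)
        for v, d in lengths.items():
            dists[k, v] = d  # vertices unreachable from the anchor stay at inf
    return dists
```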
Acknowledgements
This research is funded in part by ARC Discovery Grant DP220100800 on human body pose estimation and visual sign language recognition.
Electronic supplementary material
Supplementary material 2 (mp4 43265 KB)
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wang, R., Mao, W., Lu, C., Li, H. (2025). Towards High-Quality 3D Motion Transfer with Realistic Apparel Animation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15097. Springer, Cham. https://doi.org/10.1007/978-3-031-72933-1_3
DOI: https://doi.org/10.1007/978-3-031-72933-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72932-4
Online ISBN: 978-3-031-72933-1
eBook Packages: Computer Science, Computer Science (R0)