ABSTRACT
The development of 3D digital humans has the potential to revolutionize many industries, including film, gaming, and virtual reality. We propose the Pose Controllable human textured mesh generator (PC-GET), which synthesizes a textured 3D human mesh in a given pose. Textured meshes can be used directly by 3D rendering engines and deployed under new lighting conditions, which makes them suitable for downstream applications. The key question is how to use prior information about the human pose to guide the generation process. We use the parametric model SMPL-X [8] as a human body template to represent the given pose and shape, and we explore two independent ways to generate a digital human in that pose: a tri-plane feature rasterized from the template, and a feature extracted from a coarse point cloud of the template. In the first, we rasterize the template from orthogonal views to obtain a tri-plane feature that guides the generation explicitly. In the second, we take the coarse point cloud of the template as input and extract a feature that guides the generation implicitly. We train the model on 2D images with camera poses from the THuman2.0 dataset [7], without supervision from 3D data. Experimental results show that PC-GET can generate pose-controllable textured 3D human meshes.
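As a rough illustration of the first conditioning route, the sketch below projects per-vertex features of a posed template orthographically onto three axis-aligned planes and then reads the tri-plane back at arbitrary 3D points. It is a simplified stand-in, not the paper's implementation: it splats vertices as points instead of rasterizing triangles, it uses random vertices in place of an SMPL-X mesh, and the names `splat_triplane` and `query_triplane` are hypothetical.

```python
import numpy as np

# Axis pairs spanning the three orthogonal planes: XY, XZ, YZ.
AXES = [(0, 1), (0, 2), (1, 2)]


def to_pixels(points, res):
    """Map coordinates in [-1, 1] to integer pixel indices in [0, res - 1]."""
    pix = ((points + 1.0) * 0.5 * (res - 1)).round().astype(int)
    return np.clip(pix, 0, res - 1)


def splat_triplane(vertices, feats, res=64):
    """Project per-vertex features onto three orthogonal planes.

    vertices: (N, 3) template vertices, assumed normalized to [-1, 1].
    feats:    (N, C) per-vertex features.
    Returns a (3, res, res, C) array. Each pixel stores the mean feature
    of the vertices that land on it, or zeros if none do.
    """
    channels = feats.shape[1]
    planes = np.zeros((3, res, res, channels), dtype=np.float64)
    pix = to_pixels(vertices, res)
    for p, (a, b) in enumerate(AXES):
        counts = np.zeros((res, res), dtype=np.float64)
        # Unbuffered accumulation, so repeated pixel indices add up.
        np.add.at(planes[p], (pix[:, b], pix[:, a]), feats)
        np.add.at(counts, (pix[:, b], pix[:, a]), 1.0)
        hit = counts > 0
        planes[p][hit] /= counts[hit][:, None]
    return planes


def query_triplane(planes, points):
    """Nearest-pixel lookup on each plane, summed over the three planes."""
    res = planes.shape[1]
    pix = to_pixels(points, res)
    out = np.zeros((points.shape[0], planes.shape[-1]), dtype=np.float64)
    for p, (a, b) in enumerate(AXES):
        out += planes[p][pix[:, b], pix[:, a]]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder for posed template vertices (assumption, not SMPL-X output).
    verts = rng.uniform(-1.0, 1.0, size=(1000, 3))
    feats = rng.normal(size=(1000, 8))
    planes = splat_triplane(verts, feats, res=32)
    print(planes.shape, query_triplane(planes, verts[:4]).shape)
```

Summing the three plane lookups is one common way to aggregate tri-plane features; concatenation would work equally well for this illustration.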
- Mildenhall B, Srinivasan P P, Tancik M, et al. Nerf: Representing scenes as neural radiance fields for view synthesis[J]. Communications of the ACM, 2021, 65(1): 99-106
- Noguchi A, Sun X, Lin S, et al. Unsupervised learning of efficient geometry-aware neural articulated representations[C]//Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XVII. Cham: Springer Nature Switzerland, 2022: 597-614
- Hong F, Chen Z, Lan Y, et al. Eva3d: Compositional 3d human generation from 2d image collections[J]. arXiv preprint arXiv:2210.04888, 2022
- Gao J, Shen T, Wang Z, et al. GET3D: A generative model of high quality 3d textured shapes learned from images[J]. arXiv preprint arXiv:2209.11163, 2022
- Shen T, Gao J, Yin K, et al. Deep marching tetrahedra: a hybrid representation for high resolution 3d shape synthesis[J]. Advances in Neural Information Processing Systems, 2021, 34: 6087-6101
- Laine S, Hellsten J, Karras T, et al. Modular primitives for high-performance differentiable rendering[J]. ACM Transactions on Graphics (TOG), 2020, 39(6): 1-14
- Yu T, Zheng Z, Guo K, et al. Function4d: Real-time human volumetric capture from very sparse consumer rgbd sensors[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 5746-5756
- Pavlakos G, Choutas V, Ghorbani N, et al. Expressive body capture: 3d hands, face, and body from a single image[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 10975-1098
- Sun J, Wang X, Wang L, et al. Next3D: Generative neural texture rasterization for 3D-aware head avatars[J]. arXiv preprint arXiv:2211.11208, 2022
- Liu Z, Tang H, Lin Y, et al. Point-voxel cnn for efficient 3d deep learning[J]. Advances in Neural Information Processing Systems, 2019, 32
- Karras T, Laine S, Aila T. A style-based generator architecture for generative adversarial networks[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019
- Heusel M, Ramsauer H, Unterthiner T, et al. GANs trained by a two time-scale update rule converge to a local Nash equilibrium[J]. Advances in Neural Information Processing Systems, 2017, 30
- Andriluka M, Pishchulin L, Gehler P, et al. 2d human pose estimation: New benchmark and state of the art analysis[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014: 3686-3693
Index Terms
- PC-GET: Pose Controllable Human Textured Mesh Generation