Abstract
Fabric materials are central to recreating the realistic appearance of avatars in virtual worlds and in many VR applications, ranging from virtual try-on and teleconferencing to character animation. We propose an end-to-end network model that uses video input to estimate the fabric materials of the garment worn by a human or an avatar in a virtual world. To achieve high accuracy, we jointly learn the human body and the garment geometry as conditions for material prediction. Due to the highly dynamic and deformable nature of cloth, general data-driven garment modeling remains a challenge. To address this problem, we propose a two-level auto-encoder that accounts for both the global and local features of any garment geometry that directly affect material perception. Using this network, we can also achieve smooth geometry transitions between different garment topologies. During estimation, we use a closed-loop optimization structure to share information between tasks and feed the learned garment features into the temporal estimation of garment materials. Experiments show that our proposed network structures improve material classification accuracy by 1.5x, with applicability to unseen input. The network also runs at least three orders of magnitude faster than the state of the art [59, 61]. We demonstrate the recovered fabric materials on virtual try-on, where we recreate the entire avatar appearance, including body shape and pose, garment geometry, and materials, from only a single video.
References
Alldieck, T., Magnor, M.A., Bhatnagar, B.L., Theobalt, C., Pons-Moll, G.: Learning to reconstruct people in clothing from a single RGB camera. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, 16–20 June 2019, pp. 1175–1186. Computer Vision Foundation/IEEE (2019). https://doi.org/10.1109/CVPR.2019.00127, http://openaccess.thecvf.com/content_CVPR_2019/html/Alldieck_Learning_to_Reconstruct_People_in_Clothing_From_a_Single_RGB_CVPR_2019_paper.html
Alldieck, T., Magnor, M.A., Xu, W., Theobalt, C., Pons-Moll, G.: Video based reconstruction of 3d people models. In: 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, 18–22 June 2018, pp. 8387–8397. IEEE Computer Society (2018). https://doi.org/10.1109/CVPR.2018.00875, http://openaccess.thecvf.com/content_cvpr_2018/html/Alldieck_Video_Based_Reconstruction_CVPR_2018_paper.html
Alldieck, T., Pons-Moll, G., Theobalt, C., Magnor, M.A.: Tex2shape: detailed full human body geometry from a single image. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019, pp. 2293–2303. IEEE (2019). https://doi.org/10.1109/ICCV.2019.00238
Bednarik, J., Parashar, S., Gundogdu, E., Salzmann, M., Fua, P.: Shape reconstruction by learning differentiable surface representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4716–4725 (2020)
Bhat, K.S., Twigg, C.D., Hodgins, J.K., Khosla, P., Popovic, Z., Seitz, S.M.: Estimating cloth simulation parameters from video (2003)
Bhatnagar, B.L., Tiwari, G., Theobalt, C., Pons-Moll, G.: Multi-garment net: learning to dress 3d people from images. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019, pp. 5419–5429. IEEE (2019). https://doi.org/10.1109/ICCV.2019.00552
Bi, W., Jin, P., Nienborg, H., Xiao, B.: Estimating mechanical properties of cloth from videos using dense motion trajectories: Human psychophysics and machine learning. J. Vision 18(5), 12–12 (2018)
Bi, W., Xiao, B.: Perceptual constancy of mechanical properties of cloth under variation of external forces. In: Proceedings of the ACM Symposium on Applied Perception, pp. 19–23 (2016)
Bickel, B., et al.: Design and fabrication of materials with desired deformation behavior. ACM Trans. Graph. (TOG) 29(4), 1–10 (2010)
Bouman, K.L., Xiao, B., Battaglia, P., Freeman, W.T.: Estimating the material properties of fabric from video. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1984–1991 (2013)
Bradley, D., Popa, T., Sheffer, A., Heidrich, W., Boubekeur, T.: Markerless garment capture. In: ACM SIGGRAPH 2008 papers, pp. 1–9 (2008)
Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017)
Casati, R., Daviet, G., Bertails-Descoubes, F.: Inverse elastic cloth design with contact and friction. Ph.D. thesis, Inria Grenoble Rhône-Alpes, Université de Grenoble (2016)
Chen, X., Zhou, B., Lu, F.X., Wang, L., Bi, L., Tan, P.: Garment modeling with a depth camera. ACM Trans. Graph. 34(6), 203–1 (2015)
Clyde, D., Teran, J., Tamstorf, R.: Modeling and data-driven parameter estimation for woven fabrics. In: Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp. 1–11 (2017)
Daněřek, R., Dibra, E., Öztireli, C., Ziegler, R., Gross, M.: Deepgarment: 3d garment shape estimation from a single image. In: Computer Graphics Forum, vol. 36, pp. 269–280. Wiley Online Library (2017)
Deprelle, T., Groueix, T., Fisher, M., Kim, V., Russell, B., Aubry, M.: Learning elementary structures for 3d shape generation and matching. In: Advances in Neural Information Processing Systems, pp. 7433–7443 (2019)
Feydy, J., Séjourné, T., Vialard, F.X., Amari, S.I., Trouvé, A., Peyré, G.: Interpolating between optimal transport and mmd using sinkhorn divergences. arXiv preprint arXiv:1810.08278 (2018)
Gong, W., et al.: Human pose estimation from monocular images: a comprehensive survey. Sensors 16(12), 1966 (2016)
Guarnera, G.C., Hall, P., Chesnais, A., Glencross, M.: Woven fabric model creation from a single image. ACM Trans. Graph. (TOG) 36(5), 1–13 (2017)
Gundogdu, E., Constantin, V., Seifoddini, A., Dang, M., Salzmann, M., Fua, P.: Garnet: a two-stream network for fast and accurate 3d cloth draping. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 8739–8748 (2019)
Habermann, M., Xu, W., Zollhoefer, M., Pons-Moll, G., Theobalt, C.: Deepcap: monocular human performance capture using weak supervision. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, June 2020
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Huang, Z., Xu, Y., Lassner, C., Li, H., Tung, T.: Arch: animatable reconstruction of clothed humans. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3093–3102 (2020)
Jeong, M.H., Han, D.H., Ko, H.S.: Garment capture from a photograph. Comput. Animation Virtual Worlds 26(3–4), 291–300 (2015)
Jiang, B., Zhang, J., Hong, Y., Luo, J., Liu, L., Bao, H.: Bcnet: learning body and cloth shape from a single image. arXiv preprint arXiv:2004.00214 (2020)
Kanazawa, A., Black, M.J., Jacobs, D.W., Malik, J.: End-to-end recovery of human shape and pose. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7122–7131 (2018)
Klokov, R., Lempitsky, V.: Escape from cells: Deep KD-networks for the recognition of 3d point cloud models. In: The IEEE International Conference on Computer Vision (ICCV), October 2017
Kolotouros, N., Pavlakos, G., Black, M.J., Daniilidis, K.: Learning to reconstruct 3d human pose and shape via model-fitting in the loop. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2252–2261 (2019)
Lahner, Z., Cremers, D., Tung, T.: Deepwrinkles: accurate and realistic clothing modeling. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 667–684 (2018)
Li, J., Chen, B.M., Hee Lee, G.: So-net: self-organizing network for point cloud analysis. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
Li, Y., Bu, R., Sun, M., Wu, W., Di, X., Chen, B.: PointCNN: convolution on x-transformed points. In: Advances in Neural Information Processing Systems, pp. 820–830 (2018)
Liang, J., Lin, M.C.: Shape-aware human pose and shape reconstruction using multi-view images. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4352–4362 (2019)
Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., Black, M.J.: SMPL: a skinned multi-person linear model. ACM Trans. Graph. (TOG) 34(6), 1–16 (2015)
Ma, Q., Saito, S., Yang, J., Tang, S., Black, M.J.: Scale: modeling clothed humans with a surface codec of articulated local elements. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16082–16093 (2021)
Mehta, D., et al.: VNect: real-time 3d human pose estimation with a single RGB camera. ACM Trans. Graph. (TOG) 36(4), 1–14 (2017)
Miguel, E., et al.: Data-driven estimation of cloth simulation models. In: Computer Graphics Forum, vol. 31, pp. 519–528. Wiley Online Library (2012)
Miguel, E., et al.: Modeling and estimation of internal friction in cloth. ACM Trans. Graph. (TOG) 32(6), 1–10 (2013)
Patel, C., Liao, Z., Pons-Moll, G.: TailorNet: predicting clothing in 3d as a function of human pose, shape and garment style. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, June 2020
Pumarola, A., Sanchez-Riera, J., Choi, G., Sanfeliu, A., Moreno-Noguer, F.: 3dpeople: Modeling the geometry of dressed humans. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2242–2251 (2019)
Qi, C.R., Liu, W., Wu, C., Su, H., Guibas, L.J.: Frustum pointnets for 3d object detection from RGB-d data. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017)
Qi, C.R., Yi, L., Su, H., Guibas, L.J.: Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, pp. 5099–5108 (2017)
Rasheed, A.H., Romero, V., Bertails-Descoubes, F., Wuhrer, S., Franco, J.S., Lazarus, A.: Learning to measure the static friction coefficient in cloth contact. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9912–9921 (2020)
Saito, S., Huang, Z., Natsume, R., Morishima, S., Kanazawa, A., Li, H.: PIFu: Pixel-aligned implicit function for high-resolution clothed human digitization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2304–2314 (2019)
Saltelli, A.: Sensitivity analysis for importance assessment. Risk Anal. 22(3), 579–590 (2002)
Santesteban, I., Otaduy, M.A., Casas, D.: Learning-based animation of clothing for virtual try-on. In: Computer Graphics Forum, vol. 38, pp. 355–366. Wiley Online Library (2019)
Smith, D., Loper, M., Hu, X., Mavroidis, P., Romero, J.: Facsimile: fast and accurate scans from an image in less than a second. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5330–5339 (2019)
Tan, Q., Pan, Z., Gao, L., Manocha, D.: Realtime simulation of thin-shell deformable materials using CNN-based mesh embedding. IEEE Robot. Autom. Lett. 5(2), 2325–2332 (2020)
Thies, J., Zollhöfer, M., Nießner, M.: Deferred neural rendering: image synthesis using neural textures. ACM Trans. Graph. (TOG) 38(4), 1–12 (2019)
Tiwari, G., Bhatnagar, B.L., Tung, T., Pons-Moll, G.: Sizer: a dataset and model for parsing 3d clothing and learning size sensitive 3d clothing. arXiv preprint arXiv:2007.11610 (2020)
Varol, G., et al.: Bodynet: Volumetric inference of 3d human body shapes. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 20–36 (2018)
Vidaurre, R., Casas, D., Garces, E., Lopez-Moreno, J.: BRDF estimation of complex materials with nested learning. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1347–1356. IEEE (2019)
Wang, H., O’Brien, J.F., Ramamoorthi, R.: Data-driven elastic models for cloth: modeling and measurement. ACM Trans. Graph. (TOG) 30(4), 1–12 (2011)
Wang, T.Y., Ceylan, D., Popovic, J., Mitra, N.J.: Learning a shared shape space for multimodal garment design. arXiv preprint arXiv:1806.11335 (2018)
Wang, Y., Sun, Y., Liu, Z., Sarma, S.E., Bronstein, M.M., Solomon, J.M.: Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. (TOG) 38(5), 1–12 (2019)
Wei, S.E., Ramakrishna, V., Kanade, T., Sheikh, Y.: Convolutional pose machines. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4724–4732 (2016)
Xu, Y., Zhu, S.C., Tung, T.: DenseRaC: joint 3d pose and shape estimation by dense render-and-compare. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 7760–7770 (2019)
Yang, S., et al.: Detailed garment recovery from a single-view image. arXiv preprint arXiv:1608.01250 (2016)
Yang, S., Liang, J., Lin, M.C.: Learning-based cloth material recovery from video. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4383–4393 (2017)
Yang, S., et al.: Physics-inspired garment recovery from a single-view image. ACM Trans. Graph. (TOG) 37(5), 1–14 (2018)
Yu, T., et al.: SimulCap: Single-view human performance capture with cloth simulation. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5499–5509. IEEE (2019)
Zakharkin, I., Mazur, K., Grigorev, A., Lempitsky, V.: Point-based modeling of human clothing. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 14718–14727 (2021)
Zheng, Z., Yu, T., Wei, Y., Dai, Q., Liu, Y.: DeepHuman: 3d human reconstruction from a single image. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 7739–7749 (2019)
Zhou, B., Chen, X., Fu, Q., Guo, K., Tan, P.: Garment modeling from a single image. In: Computer Graphics Forum, vol. 32, pp. 85–91. Wiley Online Library (2013)
Zhou, Y., Tuzel, O.: VoxelNet: End-to-end learning for point cloud based 3d object detection. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018
Zhu, H., et al.: Deep fashion3d: a dataset and benchmark for 3d garment reconstruction from single images. arXiv preprint arXiv:2003.12753 (2020)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liang, J., Lin, M. (2022). Fabric Material Recovery from Video Using Multi-scale Geometric Auto-Encoder. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13697. Springer, Cham. https://doi.org/10.1007/978-3-031-19836-6_39
DOI: https://doi.org/10.1007/978-3-031-19836-6_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19835-9
Online ISBN: 978-3-031-19836-6
eBook Packages: Computer Science, Computer Science (R0)