Abstract
Creating convincing representations of humans is a fundamental problem in both traditional arts and modern media. In our digital world, virtual avatars allow us to simulate and render the human body for a variety of applications, including movie production, sports, human-computer interaction, and medical sciences. However, capturing digital representations of a person’s shape, appearance, and motion is an expensive and time-consuming process which usually requires a lot of manual adjustments.
With the advances in consumer-grade virtual reality devices, personalized virtual avatars became an essential part of interactive and immersive applications like telepresence and virtual try-on for online fashion shopping, thereby increasing the need for versatile easy-to-use self-digitization.
In this chapter, we discuss a selection of recent acquisition methods for personalized human avatar reconstruction. In contrast to conventional setups, these fully-automatic approaches only use low-cost monocular video cameras to effectively fuse information from multiple points in time and realistically complete reconstructions from sparse observations. We address both straight-forward and sophisticated reconstruction methods focused on accuracy, simplicity, and usability to compare and provide insights into their visual fidelity and robustness.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Ahmed, N., de Aguiar, E., Theobalt, C., Magnor, M., Seidel, H.P.: Automatic generation of personalized human avatars from multi-view video. In: Proceedings of the ACM Symposium on Virtual Reality Software and Technology, pp. 257–260. ACM (2005)
Aliev, K.A., Ulyanov, D., Lempitsky, V.: Neural point-based graphics. arXiv preprint arXiv:1906.08240 (2019)
Allain, B., Franco, J.S., Boyer, E.: An efficient volumetric framework for shape tracking. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 268–276. IEEE (2015)
Alldieck, T., Kassubeck, M., Wandt, B., Rosenhahn, B., Magnor, M.: Optical flow-based 3D human motion estimation from monocular video. In: Roth, V., Vetter, T. (eds.) GCPR 2017. LNCS, vol. 10496, pp. 347–360. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66709-6_28
Alldieck, T., Magnor, M., Bhatnagar, B.L., Theobalt, C., Pons-Moll, G.: Learning to reconstruct people in clothing from a single RGB camera. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1175–1186. IEEE (2019)
Alldieck, T., Magnor, M., Xu, W., Theobalt, C., Pons-Moll, G.: Detailed human avatars from monocular video. In: International Conference on 3D Vision, pp. 98–109. IEEE (2018)
Alldieck, T., Magnor, M., Xu, W., Theobalt, C., Pons-Moll, G.: Video based reconstruction of 3D people models. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8387–8397. IEEE (2018)
Alldieck, T., Pons-Moll, G., Theobalt, C., Magnor, M.: Tex2Shape: detailed full human body geometry from a single image. In: IEEE International Conference on Computer Vision. IEEE (2019)
Allen, B., Curless, B., Curless, B., Popović, Z.: The space of human body shapes: reconstruction and parameterization from range scans. ACM Trans. Graph. 22(3), 587–594 (2003)
Allen, B., Curless, B., Popović, Z., Hertzmann, A.: Learning a correlated model of identity and pose-dependent body shape variation for real-time synthesis. In: Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp. 147–156 (2006)
Alp Güler, R., Neverova, N., Kokkinos, I.: DensePose: dense human pose estimation in the wild. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7297–7306. IEEE (2018)
Anguelov, D., Srinivasan, P., Koller, D., Thrun, S., Rodgers, J., Davis, J.: SCAPE: shape completion and animation of people. ACM Trans. Graph. 24(3), 408–416 (2005)
Bălan, A.O., Black, M.J.: The naked truth: estimating body shape under clothing. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008. LNCS, vol. 5303, pp. 15–29. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88688-4_2
Bălan, A.O., Sigal, L., Black, M.J., Davis, J.E., Haussecker, H.W.: Detailed human shape and pose from images. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2007)
Blinn, J.F., Newell, M.E.: Texture and reflection in computer generated images. Commun. ACM 19(10), 542–547 (1976)
Bogo, F., Black, M.J., Loper, M., Romero, J.: Detailed full-body reconstructions of moving people from monocular RGB-D sequences. In: IEEE International Conference on Computer Vision, pp. 2300–2308. IEEE (2015)
Bogo, F., Kanazawa, A., Lassner, C., Gehler, P., Romero, J., Black, M.J.: Keep it SMPL: automatic estimation of 3D human pose and shape from a single image. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 561–578. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_34
Bogo, F., Romero, J., Pons-Moll, G., Black, M.J.: Dynamic FAUST: registering human bodies in motion. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2017)
Bronstein, M.M., Bruna, J., LeCun, Y., Szlam, A., Vandergheynst, P.: Geometric deep learning: going beyond Euclidean data. IEEE Sign. Process. Mag. 34, 18–42 (2017)
Caelles, S., Maninis, K.K., Pont-Tuset, J., Leal-Taixé, L., Cremers, D., Van Gool, L.: One-shot video object segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2017)
Cao, C., Weng, Y., Zhou, S., Tong, Y., Zhou, K.: FaceWarehouse: a 3D facial expression database for visual computing. IEEE Trans. Visual. Comput. Graph. 20(3), 413–425 (2013)
Cao, Z., Simon, T., Wei, S.E., Sheikh, Y.: Realtime multi-person 2D pose estimation using part affinity fields. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2017)
Carranza, J., Theobalt, C., Magnor, M.A., Seidel, H.P.: Free-viewpoint video of human actors. ACM Trans. Graph. 22(3), 569–577 (2003)
Chen, X., Guo, Y., Zhou, B., Zhao, Q.: Deformable model for estimating clothed and naked human shapes from a single image. Vis. Comput. 29(11), 1187–1196 (2013)
Chen, Z., Zhang, H.: Learning implicit fields for generative shape modeling. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2019)
Collet, A., et al.: High-quality streamable free-viewpoint video. ACM Trans. Graph. 34(4), 69 (2015)
Cui, Y., Chang, W., Nöll, T., Stricker, D.: KinectAvatar: fully automatic body capture using a single kinect. In: Park, J.-I., Kim, J. (eds.) ACCV 2012. LNCS, vol. 7729, pp. 133–147. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37484-5_12
De Aguiar, E., Stoll, C., Theobalt, C., Ahmed, N., Seidel, H.P., Thrun, S.: Performance capture from sparse multi-view video. ACM Trans. Graph. 27(3), 98 (2008)
De Aguiar, E., Theobalt, C., Magnor, M., Seidel, H.P., et al.: Reconstructing human shape and motion from multi-view video. In: 2nd European Conference on Visual Media Production (CVMP), pp. 42–49 (2005)
Dibra, E., Jain, H., Öztireli, C., Ziegler, R., Gross, M.: HS-Nets: estimating human body shape from silhouettes with convolutional neural networks. In: International Conference on 3D Vision, pp. 108–117. IEEE (2016)
Dibra, E., Jain, H., Öztireli, C., Ziegler, R., Gross, M.: Human shape from silhouettes using generative HKS descriptors and cross-modal neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2017)
Dibra, E., Öztireli, C., Ziegler, R., Gross, M.: Shape from selfies: human body shape estimation using CCA regression forests. In: European Conference on Computer Vision, pp. 88–104 (2016)
Dou, M., et al.: Fusion4D: real-time performance capture of challenging scenes. ACM Trans. Graph. 35(4), 114 (2016)
Gall, J., Stoll, C., De Aguiar, E., Theobalt, C., Rosenhahn, B., Seidel, H.P.: Motion capture using joint skeleton tracking and surface estimation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1746–1753. IEEE (2009)
Gilbert, A., Volino, M., Collomosse, J., Hilton, A.: Volumetric performance capture from minimal camera viewpoints. In: European Conference on Computer Vision (2018)
Gong, K., Liang, X., Li, Y., Chen, Y., Yang, M., Lin, L.: Instance-level human parsing via part grouping network. In: European Conference on Computer Vision (2018)
Guan, P., Weiss, A., Bălan, A.O., Black, M.J.: Estimating human shape and pose from a single image. In: IEEE International Conference on Computer Vision, pp. 1381–1388. IEEE (2009)
Guler, R.A., Kokkinos, I.: Holopose: holistic 3D human reconstruction in-the-wild. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10884–10894. IEEE (2019)
Guo, Y., Chen, X., Zhou, B., Zhao, Q.: Clothed and naked human shapes estimation from a single image. In: Hu, S.-M., Martin, R.R. (eds.) CVM 2012. LNCS, vol. 7633, pp. 43–50. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34263-9_6
Habermann, M., Xu, W., Zollhöfer, M., Pons-Moll, G., Theobalt, C.: LiveCap: real-time human performance capture from monocular video. ACM Trans. Graph. 38(2), 14:1–14:17 (2019)
Hasler, N., Ackermann, H., Rosenhahn, B., Thormahlen, T., Seidel, H.P.: Multilinear pose and body shape estismation of dressed subjects from image sets. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1823–1830. IEEE (2010)
Hasler, N., Stoll, C., Sunkel, M., Rosenhahn, B., Seidel, H.P.: A statistical model of human pose and body shape. Comput. Graph. Forum 28(2), 337–346 (2009)
Henderson, P., Ferrari, V.: Learning to generate and reconstruct 3D meshes with only 2D supervision. In: British Machine Vision Conference (2018)
Hesse, N., Pujades, S., Black, M.J., Arens, M., Hofmann, U., Schroeder, S.: Learning and tracking the 3D body shape of freely moving infants from RGB-D sequences. Trans. Pattern Anal. Mach. Intell. (TPAMI) (2019). https://doi.org/10.1109/TPAMI.2019.2917908. 12 Pages
Hilton, A., Beresford, D.J., Gentils, T., Smith, R.S., Sun, W.: Virtual people: capturing human models to populate virtual worlds. Proc. Comput. Anim. 99, 174 (1999)
Hirshberg, D.A., Loper, M., Rachlin, E., Black, M.J.: Coregistration: simultaneous alignment and modeling of articulated 3D shape. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 242–255. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33783-3_18
Huang, C.H., Allain, B., Franco, J.S., Navab, N., Ilic, S., Boyer, E.: Volumetric 3D tracking by detection. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3862–3870. IEEE (2016)
Huang, Y., et al.: Towards accurate markerless human shape and pose estimation over time. In: International Conference on 3D Vision. IEEE (2017)
Huang, Y., Kaufmann, M., Aksan, E., Black, M.J., Hilliges, O., Pons-Moll, G.: Deep inertial poser learning to reconstruct human pose from sparseinertial measurements in real time. ACM Trans. Graph. 37(6), 185:1–185:15 (2018)
Huang, Z., et al.: Deep volumetric video from very sparse multi-view performance capture. In: European Conference on Computer Vision, pp. 336–354 (2018)
Innmann, M., Zollhöfer, M., Nießner, M., Theobalt, C., Stamminger, M.: VolumeDeform: real-time volumetric non-rigid reconstruction. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 362–379. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_22
Insafutdinov, E., Pishchulin, L., Andres, B., Andriluka, M., Schiele, B.: DeeperCut: a deeper, stronger, and faster multi-person pose estimation model. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9910, pp. 34–50. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46466-4_3
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134. IEEE (2017)
Jackson, A.S., Manafas, C., Tzimiropoulos, G.: 3D human body reconstruction from a single image via volumetric regression. In: European Conference on Computer Vision, pp. 64–77 (2018)
Joo, H., Simon, T., Sheikh, Y.: Total capture: a 3D deformation model for tracking faces, hands, and bodies. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8320–8329. IEEE (2018)
Kakadiaris, I.A., Metaxas, D.: 3D human body model acquisition from multiple views. In: IEEE International Conference on Computer Vision. IEEE (1995)
Kanazawa, A., Black, M.J., Jacobs, D.W., Malik, J.: End-to-end recovery of human shape and pose. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2018)
Kanazawa, A., Zhang, J.Y., Felsen, P., Malik, J.: Learning 3D human dynamics from video. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5614–5623. IEEE (2019)
Kim, M., et al.: Data-driven physics for human soft tissue animation. ACM Trans. Graph. 36(4), 1–12 (2017)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations, vol. 5 (2015)
Kolotouros, N., Pavlakos, G., Daniilidis, K.: Convolutional mesh regression for single-image human shape reconstruction. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2019)
Lassner, C., Romero, J., Kiefel, M., Bogo, F., Black, M.J., Gehler, P.V.: Unite the people: closing the loop between 3D and 2D human representations. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2017)
Leroy, V., Franco, J.S., Boyer, E.: Multi-view dynamic shape refinement using local temporal integration. In: IEEE International Conference on Computer Vision. IEEE (2017)
Li, H., Vouga, E., Gudym, A., Luo, L., Barron, J.T., Gusev, G.: 3D self-portraits. ACM Trans. Graph. 32(6), 187 (2013)
Liu, Z., Luo, P., Qiu, S., Wang, X., Tang, X.: DeepFashion: powering robust clothes recognition and retrieval with rich annotations. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2016)
Lombardi, S., Simon, T., Saragih, J., Schwartz, G., Lehrmann, A., Sheikh, Y.: Neural volumes: learning dynamic renderable volumes from images. arXiv preprint arXiv:1906.07751 (2019)
Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., Black, M.J.: SMPL: a skinned multi-person linear model. ACM Trans. Graph. 34(6), 248:1–248:16 (2015)
von Marcard, T., Henschel, R., Black, M.J., Rosenhahn, B., Pons-Moll, G.: Recovering accurate 3D human pose in the wild using IMUs and a moving camera. In: European Conference on Computer Vision (2018)
von Marcard, T., Pons-Moll, G., Rosenhahn, B.: Human pose estimation from video and IMUs. Trans. Pattern Anal. Mach. Intell. (PAMI) 38, 1533–1547 (2016)
von Marcard, T., Rosenhahn, B., Black, M.J., Pons-Moll, G.: Sparse inertial poser: automatic 3D human pose estimation from sparse IMUs. In: Computer Graphics Forum, pp. 349–360 (2017)
Matusik, W., Buehler, C., Raskar, R., Gortler, S.J., McMillan, L.: Image-based visual hulls. In: Annual Conference on Computer Graphics and Interactive Techniques, pp. 369–374 (2000)
Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: learning 3D reconstruction in function space. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2019)
Michalkiewicz, M., Pontes, J.K., Jack, D., Baktashmotlagh, M., Eriksson, A.: Deep level sets: implicit surface representations for 3D shape inference. arXiv preprint arXiv:1901.06802 (2019)
Natsume, R., et al.: SiCloPe: silhouette-based clothed people. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2019)
Newcombe, R.A., Fox, D., Seitz, S.M.: DynamicFusion: reconstruction and tracking of non-rigid scenes in real-time. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 343–352. IEEE (2015)
Omran, M., Lassner, C., Pons-Moll, G., Gehler, P., Schiele, B.: Neural body fitting: unifying deep learning and model based human pose and shape estimation. In: International Conference on 3D Vision. IEEE (2018)
Orts-Escolano, S., et al.: Holoportation: virtual 3D teleportation in real-time. In: Symposium on User Interface Software and Technology, pp. 741–754 (2016)
Park, J.J., Florence, P., Straub, J., Newcombe, R., Lovegrove, S.: DeepSDF: learning continuous signed distance functions for shape representation. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2019)
Pavlakos, G., et al.: Expressive body capture: 3D hands, face, and body from a single image. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2019)
Pavlakos, G., Zhu, L., Zhou, X., Daniilidis, K.: Learning to estimate 3D human pose and shape from a single color image. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2018)
Pishchulin, L., et al.: DeepCut: joint subset partition and labeling for multi person pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2016)
Pons-Moll, G., Baak, A., Helten, T., Müller, M., Seidel, H.P., Rosenhahn, B.: Multisensor-fusion for 3D full-body human motion capture. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2010)
Pons-Moll, G., Fleet, D.J., Rosenhahn, B.: Posebits for monocular human pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2345–2352. IEEE (2014)
Pons-Moll, G., Pujades, S., Hu, S., Black, M.J.: ClothCap: seamless 4D clothing capture and retargeting. ACM Trans. Graph. 36(4), 1–15 (2017)
Pons-Moll, G., Romero, J., Mahmood, N., Black, M.J.: Dyna: a model of dynamic human shape in motion. ACM Trans. Graph. 34, 120 (2015)
Pons-Moll, G., Rosenhahn, B.: Model-based pose estimation. In: Moeslund, T., Hilton, A., Krüger, V., Sigal, L. (eds.) Visual Analysis of Humans, pp. 139–170. Springer, London (2011). https://doi.org/10.1007/978-0-85729-997-0_9
Pumarola, A., Sanchez, J., Choi, G., Sanfeliu, A., Moreno-Noguer, F.: 3DPeople: modeling the geometry of dressed humans. arXiv preprint arXiv:1904.04571 (2019)
Rhodin, H., Robertini, N., Casas, D., Richardt, C., Seidel, H.-P., Theobalt, C.: General automatic human shape and motion capture using volumetric contour cues. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 509–526. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_31
Robertini, N., Casas, D., Rhodin, H., Seidel, H.P., Theobalt, C.: Model-based outdoor performance capture. In: International Conference on 3D Vision. IEEE (2016)
Romero, J., Tzionas, D., Black, M.J.: Embodied hands: modeling and capturing hands and bodies together. ACM Trans. Graph. 36(6), 245 (2017)
Saito, S., Huang, Z., Natsume, R., Morishima, S., Kanazawa, A., Li, H.: PIFu: pixel-aligned implicit function for high-resolution clothed human digitization. In: IEEE International Conference on Computer Vision. IEEE (2019)
Shapiro, A., et al.: Rapid avatar capture and simulation using commodity depth sensors. Comput. Anim. Virtual Worlds 25(3–4), 201–211 (2014)
Shysheya, A., et al.: Textured neural avatars. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2387–2397. IEEE (2019)
Sigal, L., Balan, A., Black, M.J.: Combined discriminative and generative articulated pose and non-rigid shape estimation. In: Advances in Neural Information Processing Systems, pp. 1337–1344 (2007)
Sitzmann, V., Thies, J., Heide, F., Nießner, M., Wetzstein, G., Zollhöfer, M.: DeepVoxels: learning persistent 3D feature embeddings. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2019)
Slavcheva, M., Baust, M., Cremers, D., Ilic, S.: KillingFusion: non-rigid 3D reconstruction without correspondences. In: IEEE Conference on Computer Vision and Pattern Recognition, p. 7, no. 4. IEEE (2017)
Sminchisescu, C., Telea, A.: Human pose estimation from silhouettes. A consistent approach using distance level sets. In: 10th International Conference on Computer Graphics, Visualization and Computer Vision (WSCG 2002) (2002)
Sminchisescu, C., Triggs, B.: Kinematic jump processes for monocular 3D human tracking. In: IEEE Conference on Computer Vision and Pattern Recognition, p. I. IEEE (2003)
Starck, J., Hilton, A.: Surface capture for performance-based animation. IEEE Comput. Graph. Appl. 27(3), 21–31 (2007)
Stoll, C., Hasler, N., Gall, J., Seidel, H.P., Theobalt, C.: Fast articulated motion tracking using a sums of gaussians body model. In: IEEE International Conference on Computer Vision, pp. 951–958. IEEE (2011)
Tao, Y., et al.: DoubleFusion: real-time capture of human performance with inner body shape from a depth sensor. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2018)
Tao, Y., et al.: SimulCap: single-view human performance capture with cloth simulation. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE (2019)
Taylor, C.J.: Reconstruction of articulated objects from point correspondences in a single uncalibrated image. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 677–684. IEEE (2000)
Theobalt, C., Aguiar, E., Magnor, M.A., Seidel, H.P.: Reconstructing human shape, motion and appearance from multi-view video. In: Ozaktas, H.M., Onural, L. (eds.) Three-Dimensional Television. Signals and Communication Technology, pp. 29–57. Springer, Berlin (2008). https://doi.org/10.1007/978-3-540-72532-9_3
Theobalt, C., Carranza, J., Magnor, M.A.: Enhancing silhouette-based human motion capture with 3D motion fields. In: Proceedings of the 11th Pacific Conference on Computer Graphics and Applications, pp. 185–193 (2003)
Tung, H.Y., Tung, H.W., Yumer, E., Fragkiadaki, K.: Self-supervised learning of motion capture. In: Advances in Neural Information Processing Systems, pp. 5236–5246 (2017)
Varol, G., et al.: BodyNet: volumetric inference of 3D human body shapes. In: European Conference on Computer Vision (2018)
Vlasic, D., Baran, I., Matusik, W., Popović, J.: Articulated mesh animation from multi-view silhouettes. ACM Trans. Graph. 27(3), 97 (2008)
Wang, W., Qiangeng, X., Ceylan, D., Mech, R., Neumann, U.: DISN: deep implicit surface network for high-quality single-view 3D reconstruction. arXiv preprint arXiv:1905.10711 (2019)
Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: Asilomar Conference on Signals, Systems & Computers, vol. 2, pp. 1398–1402 (2003)
Wei, S.E., Ramakrishna, V., Kanade, T., Sheikh, Y.: Convolutional pose machines. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2016)
Weiss, A., Hirshberg, D., Black, M.J.: Home 3D body scans from noisy image and range data. In: IEEE International Conference on Computer Vision, pp. 1951–1958. IEEE (2011)
Xiang, D., Joo, H., Sheikh, Y.: Monocular total capture: posing face, body, and hands in the wild. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10965–10974. IEEE (2019)
Xu, W., et al.: MonoPerfCap: human performance capture from monocular video. ACM Trans. Graph. 37, 1–15 (2018)
Yao, P., Fang, Z., Wu, F., Feng, Y., Li, J.: DenseBody: directly regressing dense 3d human pose and shape from a single color image. arXiv preprint arXiv:1903.10153 (2019)
Zeng, M., Zheng, J., Cheng, X., Liu, X.: Templateless quasi-rigid shape modeling with implicit loop-closure. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 145–152. IEEE (2013)
Zhang, C., Pujades, S., Black, M.J., Pons-Moll, G.: Detailed, accurate, human shape estimation from clothed 3D scan sequences. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE (2017)
Zhang, Q., Fu, B., Ye, M., Yang, R.: Quality dynamic human body modeling using a single low-cost depth camera. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 676–683. IEEE (2014)
Zheng, Z., Yu, T., Wei, Y., Dai, Q., Liu, Y.: DeepHuman: 3D human reconstruction from a single image. arXiv preprint arXiv:1903.06473 (2019)
Zhu, H., Zuo, X., Wang, S., Cao, X., Yang, R.: Detailed human shape estimation from a single image by hierarchical mesh deformation. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4491–4500. IEEE (2019)
Zuffi, S., Kanazawa, A., Jacobs, D., Black, M.J.: 3D menagerie: modeling the 3D shape and pose of animals. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 5524–5532. IEEE (2017)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Alldieck, T., Kappel, M., Castillo, S., Magnor, M. (2020). Reconstructing 3D Human Avatars from Monocular Images. In: Magnor, M., Sorkine-Hornung, A. (eds) Real VR – Immersive Digital Reality. Lecture Notes in Computer Science(), vol 11900. Springer, Cham. https://doi.org/10.1007/978-3-030-41816-8_8
Download citation
DOI: https://doi.org/10.1007/978-3-030-41816-8_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41815-1
Online ISBN: 978-3-030-41816-8
eBook Packages: Computer ScienceComputer Science (R0)