Motion2fusion: real-time volumetric performance capture

Authors:

Philip Davidson, Sean Ryan Fanello, Christoph Rhemann, Vladimir Tankovich, Shahram Izadi

ACM Transactions on Graphics (TOG), Volume 36, Issue 6

Article No.: 246, Pages 1 - 16

https://doi.org/10.1145/3130800.3130801

Published: 20 November 2017

Abstract

We present Motion2Fusion, a state-of-the-art 360-degree performance capture system that enables *real-time* reconstruction of arbitrary non-rigid scenes. We make three major contributions over prior work: 1) a new non-rigid fusion pipeline that reconstructs high-frequency geometric details far more faithfully, avoiding the over-smoothing and visual artifacts observed previously; 2) a high-speed pipeline coupled with a machine learning technique for 3D correspondence field estimation, which reduces the tracking errors and artifacts attributed to fast motions; 3) a backward and forward non-rigid alignment strategy that handles topology changes more robustly while remaining free of scene priors. Our system runs in real time with a speed-up of nearly 3x over the previous state-of-the-art work on the exact same GPU hardware. Extensive quantitative and qualitative comparisons show more precise geometry and texturing, with fewer artifacts due to fast motions or topology changes, than prior art.

Supplementary Material

MP4 File (a246-dou.mp4), 93.93 MB

Index Terms

Motion2fusion: real-time volumetric performance capture
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Reconstruction
      2. Image and video acquisition
        Motion capture

Recommendations

Fusion4D: real-time performance capture of challenging scenes

We contribute a new pipeline for live multi-view performance capture, generating temporally coherent high-quality reconstructions in real-time. Our algorithm supports both incremental reconstruction, improving the surface estimation over time, as well ...

Combining dense nonrigid structure from motion and 3D morphable models for monocular 4D face reconstruction
CVMP '18: Proceedings of the 15th ACM SIGGRAPH European Conference on Visual Media Production

Monocular 4D face reconstruction is a challenging problem, especially in the case that the input video is captured under unconstrained conditions, i.e. "in the wild". The majority of the state-of-the-art approaches build upon 3D Morphable Modelling (...

Real-Time Geometry, Albedo, and Motion Reconstruction Using a Single RGB-D Camera

This article proposes a real-time method that uses a single-view RGB-D input (a depth sensor integrated with a color camera) to simultaneously reconstruct a casual scene with a detailed geometry model, surface albedo, per-frame non-rigid motion, and per-...

Published In

ACM Transactions on Graphics Volume 36, Issue 6

December 2017

973 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/3130800

Editor:
Kavita Bala

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Article Metrics

Total Citations: 136
Total Downloads: 2,756
Downloads (last 12 months): 111
Downloads (last 6 weeks): 9

Reflects downloads up to 25 Feb 2025

Cited By

Liu X, Li J, Lu G (2025). Reconstructing Complex Shaped Clothing From a Single Image With Feature Stable Unsigned Distance Fields. IEEE Transactions on Visualization and Computer Graphics 31:4 (2142-2154). DOI: 10.1109/TVCG.2024.3381937. Online publication date: Apr-2025
https://doi.org/10.1109/TVCG.2024.3381937
Jiang Y, Shen Z, Hong Y, Guo C, Wu Y, Zhang Y, Yu J, Xu L (2024). Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos. ACM Transactions on Graphics 43:6 (1-15). DOI: 10.1145/3687926. Online publication date: 19-Dec-2024
https://dl.acm.org/doi/10.1145/3687926
Yang H, Zheng M, Ma C, Lai Y, Wan P, Huang H (2024). VRMM: A Volumetric Relightable Morphable Head Model. ACM SIGGRAPH 2024 Conference Papers (1-11). DOI: 10.1145/3641519.3657406. Online publication date: 13-Jul-2024
https://dl.acm.org/doi/10.1145/3641519.3657406
Kyriakou T, de la Campa Crespo M, Panayiotou A, Chrysanthou Y, Charalambous P, Aristidou A (2024). Virtual Instrument Performances (VIP): A Comprehensive Review. Computer Graphics Forum 43:2. DOI: 10.1111/cgf.15065. Online publication date: 30-Apr-2024
https://doi.org/10.1111/cgf.15065
Liu X, Li J, Lu G (2024). Modeling Realistic Clothing From a Single Image Under Normal Guide. IEEE Transactions on Visualization and Computer Graphics 30:7 (3995-4007). DOI: 10.1109/TVCG.2023.3245583. Online publication date: 1-Jul-2024
https://dl.acm.org/doi/10.1109/TVCG.2023.3245583
Zheng C, Lin W, Xu F (2024). EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:8 (5779-5790). DOI: 10.1109/TPAMI.2024.3366148. Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.1109/TPAMI.2024.3366148
Jiang Y, Li J, Qin H, Dai Y, Liu J, Zhang G, Zhang C, Yang T (2024). GS-SFS: Joint Gaussian Splatting and Shape-From-Silhouette for Multiple Human Reconstruction in Large-Scale Sports Scenes. IEEE Transactions on Multimedia 26 (11095-11110). DOI: 10.1109/TMM.2024.3443637. Online publication date: 2024
https://doi.org/10.1109/TMM.2024.3443637
Sun J, Jiao H, Li G, Zhang Z, Zhao L, Xing W (2024). 3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (20675-20685). DOI: 10.1109/CVPR52733.2024.01954. Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.01954
Jiang Y, Shen Z, Wang P, Su Z, Hong Y, Zhang Y, Yu J, Xu L (2024). HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (19734-19745). DOI: 10.1109/CVPR52733.2024.01866. Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.01866
Zhao C, Zhang J, Du J, Shan Z, Wang J, Yu J, Wang J, Xu L (2024). I'M HOI: Inertia-Aware Monocular Capture of 3D Human-Object Interactions. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (729-741). DOI: 10.1109/CVPR52733.2024.00076. Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.00076
