FaceCollage: A Rapidly Deployable System for Real-time Head Reconstruction for On-The-Go 3D Telepresence

Authors:

Tat-Jen Cham

MM '17: Proceedings of the 25th ACM international conference on Multimedia

Pages 64 - 72

https://doi.org/10.1145/3123266.3123281

Published: 19 October 2017

Abstract

This paper presents FaceCollage, a robust, real-time system for head reconstruction that can be used to build easy-to-deploy telepresence systems. It uses a pair of consumer-grade RGBD cameras that together provide a wide range of views of the reconstructed user. A key feature is that the system is simple and quick to deploy: calibration is autonomous, and the user only needs to place the cameras casually. The system is realized through three technical contributions: (1) a fully automatic calibration method that analyzes and correlates the left and right RGBD face views using only facial features; (2) an implementation that exploits the parallel computation capability of the GPU throughout most of the pipeline to attain real-time performance; and (3) a complete integrated system, on which we conducted various experiments to demonstrate its capability, robustness, and performance, including tests on twelve participants that yielded visually pleasing results.
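The abstract does not say how the calibration is computed. As a rough illustration only, one common way to register two RGBD cameras from matched 3D facial landmarks is a least-squares rigid alignment (the Kabsch method). The sketch below assumes the landmarks have already been detected and back-projected to 3D in each camera's coordinate frame. The function name `rigid_align` and the sample coordinates are hypothetical and are not taken from the paper.

```python
import numpy as np


def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) such that dst ≈ src @ R.T + t.

    src, dst: (N, 3) arrays of corresponding 3D points, N >= 3, not collinear.
    This is the generic Kabsch method, not the paper's calibration procedure.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


if __name__ == "__main__":
    # Hypothetical landmarks (eyes, nose tip, mouth corners) in meters,
    # expressed in the left camera's frame.
    left = np.array([
        [-0.030, 0.030, 0.50],
        [0.030, 0.030, 0.50],
        [0.000, 0.000, 0.47],
        [-0.025, -0.040, 0.49],
        [0.025, -0.040, 0.49],
    ])
    # Assumed ground-truth pose of the right camera relative to the left.
    angle = np.deg2rad(40.0)
    R_true = np.array([
        [np.cos(angle), 0.0, np.sin(angle)],
        [0.0, 1.0, 0.0],
        [-np.sin(angle), 0.0, np.cos(angle)],
    ])
    t_true = np.array([0.6, 0.0, 0.2])
    right = left @ R_true.T + t_true

    R, t = rigid_align(left, right)
    print("max rotation error:", np.abs(R - R_true).max())
    print("max translation error:", np.abs(t - t_true).max())
```

With real sensor data the landmarks are noisy, so a practical pipeline would likely combine such an initial estimate with outlier rejection or a refinement step. Whether FaceCollage does so is not stated in the abstract.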

Cited By

Fadzli F, Ismail A, Abd Karim Ishigaki S (2023). A systematic literature review: Real-time 3D reconstruction method for telepresence system. PLOS ONE 18:11 (e0287155). Online publication date: 15-Nov-2023.
https://doi.org/10.1371/journal.pone.0287155
Rasmuson S, Sintorn E, Assarsson U (2020). A low-cost, practical acquisition and rendering pipeline for real-time free-viewpoint video communication. The Visual Computer. Online publication date: 7-Mar-2020.
https://doi.org/10.1007/s00371-020-01823-7
Sheng L, Cai J, Cham T, Pavlovic V, Ngan K (2019). Visibility Constrained Generative Model for Depth-Based 3D Facial Pose Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 41:8 (1994-2007). Online publication date: 1-Aug-2019.
https://doi.org/10.1109/TPAMI.2018.2877675

Index Terms

FaceCollage: A Rapidly Deployable System for Real-time Head Reconstruction for On-The-Go 3D Telepresence
1. Applied computing
  1. Physical sciences and engineering
    1. Telecommunications
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Image and video acquisition
  2. Computer graphics
    1. Graphics systems and interfaces

Information & Contributors

Information

Published In

MM '17: Proceedings of the 25th ACM international conference on Multimedia

October 2017

2028 pages

ISBN: 9781450349062

DOI: 10.1145/3123266

General Chairs:
Qiong Liu, FXPAL, USA
Rainer Lienhart, Universität Augsburg, Germany
Haohong Wang, TCL America, USA

Program Chairs:
Sheng-Wei "Kuan-Ta" Chen, Academia Sinica, Taiwan
Susanne Boll, University of Oldenburg, Germany
Phoebe Chen, La Trobe University, Australia
Gerald Friedland, Lawrence Livermore National Lab, USA
Jia Li, Google, USA
Shuicheng Yan, Qihoo 360, China

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 October 2017

Qualifiers

Research-article

Funding Sources

National Research Foundation Prime Minister's Office Singapore

Conference

MM '17

Sponsor:

SIGMM

MM '17: ACM Multimedia Conference

October 23 - 27, 2017

Mountain View, California, USA

Acceptance Rates

MM '17 Paper Acceptance Rate: 189 of 684 submissions, 28%

Overall Acceptance Rate: 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 4
Total Downloads: 215
Downloads (Last 12 months): 10
Downloads (Last 6 weeks): 6

Reflects downloads up to 16 Feb 2025

Citations

Song G, Cai J, Cham T, Zheng J, Zhang J, Fuchs H, Boll S, Mu Lee K, Luo J, Zhu W, Byun H, Wen Chen C, Lienhart R, Mei T (2018). Real-time 3D Face-Eye Performance Capture of a Person Wearing VR Headset. Proceedings of the 26th ACM international conference on Multimedia (923-931). Online publication date: 15-Oct-2018.
https://dl.acm.org/doi/10.1145/3240508.3240570
