Geometry-assisted image-based rendering for facial analysis and synthesis

https://doi.org/10.1016/j.image.2006.03.003

Abstract

In this paper, we present an image-based method for the tracking and rendering of faces. We use the algorithm in an immersive video conferencing system where multiple participants are placed in a common virtual room. This requires viewpoint modification of dynamic objects. Since hair and uncovered areas are difficult to model by pure 3-D geometry-based warping, we add image-based rendering techniques to the system. By interpolating novel views from a 3-D image volume, natural-looking results can be achieved. The image-based component is embedded into a geometry-based approach in order to limit the number of images that have to be stored initially for interpolation. Temporally changing facial features are also warped using the approximate geometry information. Both geometry and image cube data are jointly exploited in facial expression analysis and synthesis.
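The interpolation of novel views from a 3-D image volume can be illustrated with a minimal sketch: given a stack of captured frames indexed by head-rotation angle, a view at an intermediate angle is blended from the two nearest captured frames. All names here are illustrative, and the paper's actual method additionally applies geometry-assisted warping before blending.

```python
import numpy as np

def interpolate_view(image_stack, angles, target_angle):
    """Cross-fade the two captured frames whose rotation angles
    bracket target_angle (simple linear view interpolation).

    image_stack : (N, H, W) array of captured frames
    angles      : sorted 1-D sequence of the N rotation angles
    """
    angles = np.asarray(angles, dtype=float)
    # Clamp the request to the captured angular range.
    t = np.clip(target_angle, angles[0], angles[-1])
    i = np.searchsorted(angles, t)          # first captured angle >= t
    if i == 0:
        return image_stack[0].astype(float)
    lo, hi = angles[i - 1], angles[i]
    w = (t - lo) / (hi - lo)                # blend weight in [0, 1]
    return (1.0 - w) * image_stack[i - 1] + w * image_stack[i]
```

Restricting the image-based component to one degree of freedom (rotation about the vertical axis), as the paper does, keeps the number of frames that must be stored for such interpolation small.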

Section snippets

Facial analysis and synthesis for model-based coding and virtual conferencing

Although the algorithms for rendering and tracking can be used for any application related to facial animation, such as text-driven animation [27], man-machine interfaces, and avatar control [29], we focus in this context on the application of virtual conferencing, which has implications for the settings and the experiments conducted.

In virtual conferencing, multiple distant participants can meet in a virtual room as shown in Fig. 1. The use of a synthetic 3-D computer graphics scene allows more than

3-D model-based facial expression analysis and synthesis

In this section, we will briefly describe our original 3-D model-based coding system [12], [11]. Although purely geometry-based, it is used in this work to represent global head motion (except for head turns) and jaw movements, which severely affect the silhouette if the person is viewed from a sideways direction. This technique uses a 3-D head model with a single texture map extracted from the first frame of the video sequence. All temporal changes are modeled by motion and deformations of
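The basic operation behind such geometry-based warping is the rigid motion of head-model vertices followed by perspective projection into the image plane. The following sketch assumes a simple pinhole camera; the function and parameter names are illustrative, not taken from the paper, whose estimation is based on analysis-by-synthesis.

```python
import numpy as np

def project_vertices(vertices, rotation, translation, focal):
    """Apply rigid head motion (rotation + translation) to 3-D model
    vertices and project them with a pinhole camera model."""
    moved = vertices @ rotation.T + translation    # (N, 3) transformed points
    # Perspective division: image coordinates scale inversely with depth.
    x = focal * moved[:, 0] / moved[:, 2]
    y = focal * moved[:, 1] / moved[:, 2]
    return np.stack([x, y], axis=1)
```

Texture coordinates fixed on the model surface then follow the projected vertices, so a single texture map extracted from the first frame can be re-rendered under new head poses.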

Image-based tracking and rendering

In this section, we describe an extension of the pure geometry-based estimation and rendering of Section 2. By adding image-based interpolation techniques, the maximum range of head rotation can be broadened while preserving the correct outline, even in the presence of hair. In contrast to other image-based techniques in facial animation, such as active appearance models [8], [16], which describe local features like the mouth or eyes by a set of images, we use the captured set of video frames to

Conclusions

In this paper, we have presented a method for the analysis and synthesis of head-and-shoulder scenes in the context of virtual video conferencing. We have extended a 3-D model-based coding approach with image-based rendering techniques in order to obtain natural-looking images even for large modifications of the viewing direction. In order to reduce the demands on memory and capturing, only one degree of freedom related to head rotation around the vertical axis is described by image-based

References (36)

  • K. Aizawa et al., Model-based analysis synthesis image coding (MBASIC) system for a person's face, Signal Process.: Image Commun. (1989)
  • M. Kampmann et al., Automatic adaptation of a face model in a layered coder with an object-based analysis–synthesis layer and a knowledge-based layer, Signal Process.: Image Commun. (1997)
  • E.H. Adelson et al., The plenoptic function and the elements of early vision
  • J. Ahlberg, Extraction and coding of face model parameters, Ph.D. thesis, University of Linköping, Sweden,...
  • V. Blanz, T. Vetter, A morphable model for the synthesis of 3D faces, in: Proceedings of the Computer Graphics...
  • J.-X. Chai, X. Tong, S.-C. Chan, H.-Y. Shum, Plenoptic sampling, in: Proceedings of the Computer Graphics (SIGGRAPH),...
  • Y. Chang, T. Ezzat, Transferable videorealistic speech animation, in: Proceedings of the ACM Eurographics, Los Angeles,...
  • S. Chao, J. Robinson, Model-based analysis/synthesis image coding with eye and mouth patch codebooks, in: Proceedings...
  • T.F. Cootes, G.J. Edwards, C.J. Taylor, Active appearance models, in: Proceedings of the European Conference on...
  • D. DeCarlo, D. Metaxas, Deformable model-based shape and motion analysis from images using motion residual error, in:...
  • P. Eisert, Model-based camera calibration using analysis by synthesis techniques, in: Proceedings of the International...
  • P. Eisert, MPEG-4 facial animation in video analysis and synthesis, Int. J. Imaging Syst. Technol. (2003)
  • P. Eisert et al., Analyzing facial expressions for virtual conferencing, IEEE Comput. Graphics Appl. (1998)
  • P. Eisert, J. Rurainsky, Image-based rendering and tracking of faces, in: Proceedings of the International Conference...
  • R. Forchheimer, O. Fahlander, T. Kronander, Low bit-rate coding through animation, in: Proceedings of the Picture...
  • S.J. Gortler, R. Grzeszczuk, R. Szeliski, M.F. Cohen, The Lumigraph, in: Proceedings of the Computer Graphics...
  • R. Gross, I. Matthews, S. Baker, Constructing and fitting active appearance models with occlusions, in: Proceedings of...
  • M. Hess, G. Martinez, Automatic adaption of a human face model for model-based coding, in: Proceedings of the Picture...