A nonparametric regression model for virtual humans generation

Multimedia Tools and Applications

Abstract

In this paper, we propose a novel nonparametric regression model that generates virtual humans from still images for applications in next generation (NG) environments. The model automatically synthesizes deformed character shapes using kernel regression with elliptic radial basis functions (ERBFs) and locally weighted regression (LOESS). Kernel regression with ERBFs represents the deformed character shapes and creates lively animated talking faces. To preserve patterns within the shapes, LOESS fits the details with local control. The results show that our method effectively simulates plausible movements for character animation, including body movement simulation, novel view synthesis, and expressive facial animation synchronized with input speech. The proposed model is therefore especially suitable for intelligent multimedia applications in virtual human generation.


Acknowledgements

This work was supported in part by the National Science Council, Republic of China, under grant NSC 98-2221-E-009-123-MY3. We would like to thank Prof. Sang-Soo Yeo and the reviewers for their helpful suggestions.

Author information

Corresponding author

Correspondence to Zen-Chung Shih.

Appendix 1 Hyper radial basis functions (HRBFs)

The HRBF kernel is computed using the Mahalanobis distance and is defined in matrix form as follows:

$$ \begin{array}{c} k\left( \vec u, \vec v \right) = \exp \left( - \left( \vec u - \vec v \right)^T \Sigma \left( \vec u - \vec v \right) \right), \\ \text{for } \Sigma = \mathrm{diag}\left( \sigma_1^{-2}, \ldots, \sigma_N^{-2} \right) \text{ and } \sigma_1, \ldots, \sigma_N \in \Re^{+}, \end{array} $$
(19)

where \( \Sigma \) plays the role of a diagonal inverse covariance of a multidimensional Gaussian, so each \( \sigma_i^2 \) acts as a per-axis variance rather than a single shared variance. An HRBF differs from a standard RBF insofar as each axis of the input space \( \chi \subseteq \ell_2^N \) (the space of square summable sequences of length N) has a separate smoothing parameter, i.e., a separate scale on which differences along that axis are measured. It is worth mentioning that RBF kernels map the input space onto the surface of an infinite dimensional hypersphere, since \( k(\vec u, \vec u) = 1 \) for every \( \vec u \). Note that N = 2 in the arbitrary directional ERBF kernel, which corresponds to analyzing the data distribution along the major axis and the minor axis of an ellipse. Equation (1) is constructed along the orientation of the arbitrary directional ERBF (the major axis and the minor axis).
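
As a minimal sketch of Eq. (19), assuming the diagonal \( \Sigma \) exactly as defined there, the quadratic form reduces to a sum of per-axis squared differences, each divided by its own \( \sigma_i^2 \). The function names `hrbf_kernel` and `rbf_kernel` are illustrative choices and do not come from the paper, and the plain-Python code below is not the authors' implementation.

```python
import math
from typing import Sequence


def hrbf_kernel(u: Sequence[float], v: Sequence[float],
                sigmas: Sequence[float]) -> float:
    """Eq. (19): exp(-(u - v)^T Sigma (u - v)), Sigma = diag(sigma_i^-2)."""
    if not (len(u) == len(v) == len(sigmas)):
        raise ValueError("u, v and sigmas must all have the same length N")
    if any(s <= 0 for s in sigmas):
        raise ValueError("every sigma_i must be strictly positive")
    # Sigma is diagonal, so the quadratic form is a sum over axes of the
    # squared difference on that axis scaled by sigma_i^-2.
    quad = sum(((ui - vi) / si) ** 2 for ui, vi, si in zip(u, v, sigmas))
    return math.exp(-quad)


def rbf_kernel(u: Sequence[float], v: Sequence[float], sigma: float) -> float:
    """Isotropic RBF in the same convention: exp(-||u - v||^2 / sigma^2)."""
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-sq_dist / sigma ** 2)


if __name__ == "__main__":
    u, v = (1.0, 2.0), (2.0, 0.0)
    # N = 2: a separate smoothing scale for each of the two axes.
    print(hrbf_kernel(u, v, (1.0, 2.0)))
    # With equal scales the HRBF coincides with the isotropic RBF.
    print(hrbf_kernel(u, v, (3.0, 3.0)), rbf_kernel(u, v, 3.0))
```

Setting all \( \sigma_i \) to the same value recovers the standard RBF, which makes the per-axis scales the only difference between the two kernels in this sketch. An arbitrary directional ERBF would additionally rotate the difference vector into the ellipse's axes before scaling; that step is omitted here because it depends on Eq. (1).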

About this article

Cite this article

Chou, YF., Shih, ZC. A nonparametric regression model for virtual humans generation. Multimed Tools Appl 47, 163–187 (2010). https://doi.org/10.1007/s11042-009-0412-7

  • DOI: https://doi.org/10.1007/s11042-009-0412-7
