Abstract
For existing motion capture (MoCap) data processing methods, manual intervention is almost inevitable, and most of it arises during the data tracking process. This paper addresses the problem of tracking non-rigid 3D facial motions from sequences of raw MoCap data in the presence of noise, outliers, and long-duration missing data. We present a novel dynamic spatiotemporal framework that solves this problem automatically. First, based on a 3D facial topological structure, a sophisticated non-rigid motion interpreter (SNRMI) is put forward; together with a dynamic searching scheme, it can not only track the non-missing data to the maximum extent but also recover missing data accurately (it can recover more than five adjacent markers that are missing for a long period, about 5 seconds). To rule out wrong tracks of markers labeled in open structures (such as the mouth and eyes), a semantics-based heuristic checking method is proposed. Second, since existing methods have not taken the noise-propagation problem into account, a forward processing framework is presented to address it. A further contribution is that the proposed method tracks facial non-rigid motions automatically and in a forward manner, which greatly reduces, and may even eliminate, the need for human intervention during facial MoCap data processing. Experimental results demonstrate the effectiveness, robustness, and accuracy of our system.


















Acknowledgements
This work is supported by the Program for Changjiang Scholars and Innovative Research Team in University (No. IRT1109), the Program for Liaoning Science and Technology Research in University (No. LS2010008), the Program for Liaoning Innovative Research Team in University (No. LT2011018), the Natural Science Foundation of Liaoning Province (201102008), the Program for Liaoning Key Lab of Intelligent Information Processing and Network Technology in University, the Liaoning BaiQianWan Talents Program (2010921010, 2011921009), and the General Project of the Basic Research Program of the Hunan Provincial Science and Technology Department (Grant No. 2012FJ3034).
Cite this article
Fang, X., Wei, X., Zhang, Q. et al. Forward non-rigid motion tracking for facial MoCap. Vis Comput 30, 139–157 (2014). https://doi.org/10.1007/s00371-013-0790-8