Facial video coding/decoding at ultra-low bit-rate: a 2D/3D model-based approach

Yu, Jun; Luo, Changwei; Yu, Lingyun; Li, Lingyan; Wang, Zengfu

doi:10.1007/s11042-016-3368-4

Published: 04 March 2016

Volume 75, pages 12021–12041, (2016)

Multimedia Tools and Applications

Abstract

A real-time facial video coding/decoding system is proposed, built on a mixed 2D/3D coding/decoding scheme that gives better rate-distortion performance at ultra-low bit-rates. 3D facial motion is tracked from video by an improved particle filter that uses multiple measurements and online appearance models. 3D facial animation is produced by combining a parameterized model with a muscular model, and 3D hair is synthesized from the hair detection result in the video. The 3D coding/decoding result for the foreground and the 2D coding/decoding result for the background are stitched together seamlessly. At ultra-low bit-rates, the objective experiment confirms that the system offers a favorable balance between coding efficiency and decoding quality, and the subjective experiment indicates that its output is suitable for subjective face identification.
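
The abstract names particle filtering as the tracking engine but does not give its details. As a rough illustration only, the following is a minimal sketch of a generic bootstrap particle filter tracking a single scalar pose parameter (for example, a head yaw angle). It is not the authors' improved filter: the random-walk motion model, the Gaussian likelihood, and the names `particle_filter_step`, `track`, `motion_std`, and `meas_std` are all assumptions made for this example, and the multiple measurements and online appearance models of the paper are replaced here by one noisy scalar observation.

```python
import math
import random


def particle_filter_step(particles, measurement, motion_std, meas_std, rng):
    """One predict / weight / resample cycle of a bootstrap particle filter.

    All model choices here are illustrative assumptions, not the paper's method.
    """
    # Predict: diffuse each particle with a random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Weight: Gaussian likelihood of the measurement given each particle.
    weights = [
        math.exp(-0.5 * ((measurement - p) / meas_std) ** 2) for p in predicted
    ]
    total = sum(weights)
    if total == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Estimate: weighted mean of the predicted particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))

    # Resample: draw particles in proportion to their weights (multinomial).
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track(measurements, n_particles=500, motion_std=1.0, meas_std=2.0, seed=0):
    """Run the filter over a measurement sequence and return per-frame estimates."""
    rng = random.Random(seed)
    # Initialise particles uniformly over a hypothetical yaw range in degrees.
    particles = [rng.uniform(-30.0, 30.0) for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        particles, est = particle_filter_step(
            particles, z, motion_std, meas_std, rng
        )
        estimates.append(est)
    return estimates


if __name__ == "__main__":
    # Hypothetical input: a constant yaw of 10 degrees observed for 30 frames.
    print(track([10.0] * 30)[-1])
```

In a model-based coder of the kind the abstract describes, the per-frame state would be a vector of 3D pose and animation parameters rather than one scalar, and those estimated parameters, not pixels, are what would be transmitted for the face region.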

Author information

Authors and Affiliations

Department of Automation, University of Science and Technology of China, Hefei, China
Jun Yu, Changwei Luo, Lingyun Yu, Lingyan Li & Zengfu Wang
Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, China
Zengfu Wang

Corresponding author

Correspondence to Jun Yu.

About this article

Cite this article

Yu, J., Luo, C., Yu, L. et al. Facial video coding/decoding at ultra-low bit-rate: a 2D/3D model-based approach. Multimed Tools Appl 75, 12021–12041 (2016). https://doi.org/10.1007/s11042-016-3368-4

Received: 15 October 2015
Revised: 01 February 2016
Accepted: 15 February 2016
Published: 04 March 2016
Issue Date: October 2016
DOI: https://doi.org/10.1007/s11042-016-3368-4
