Abstract
This paper presents a lip-tracking method intended to support personal verification. Lip contours are represented by quadratic B-splines. The lips are localised automatically in the original image, and an elliptic B-spline is generated to initialise tracking. Lip localisation uses grey-level gradient projections together with chromaticity models to find the lips within an automatically segmented region corresponding to the face area. Tracking proceeds by estimating new lip contour positions according to a statistical chromaticity model for the lips. The current tracker implementation follows a deterministic second-order model for the spline motion, based on a Lagrangian formulation of contour dynamics. The method has been tested on the M2VTS database [1]. Lips were accurately tracked on sequences of more than a hundred frames.
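The contour representation mentioned in the abstract can be illustrated with a minimal sketch of a closed uniform quadratic B-spline whose control points are placed on an ellipse. The abstract does not say how the elliptic B-spline is constructed, so the function names, the number of control points, and the choice to place control points directly on the ellipse are all assumptions made for illustration, not the authors' implementation.

```python
import math

def ellipse_control_points(cx, cy, a, b, n=8):
    """Place n control points on an ellipse centred at (cx, cy) with
    semi-axes a (horizontal) and b (vertical). Hypothetical helper."""
    return [(cx + a * math.cos(2 * math.pi * k / n),
             cy + b * math.sin(2 * math.pi * k / n)) for k in range(n)]

def closed_quadratic_bspline(ctrl, samples_per_span=10):
    """Sample a closed uniform quadratic B-spline.

    Each span blends three consecutive control points (indices wrap
    around) with the standard uniform quadratic basis functions.
    """
    n = len(ctrl)
    points = []
    for i in range(n):
        p0, p1, p2 = ctrl[i], ctrl[(i + 1) % n], ctrl[(i + 2) % n]
        for s in range(samples_per_span):
            t = s / samples_per_span
            b0 = 0.5 * (1.0 - t) ** 2
            b1 = 0.5 * (-2.0 * t * t + 2.0 * t + 1.0)
            b2 = 0.5 * t * t
            points.append((b0 * p0[0] + b1 * p1[0] + b2 * p2[0],
                           b0 * p0[1] + b1 * p1[1] + b2 * p2[1]))
    return points

# Illustrative values only: a mouth-sized ellipse in pixel coordinates.
ctrl = ellipse_control_points(100.0, 60.0, 40.0, 15.0, n=8)
contour = closed_quadratic_bspline(ctrl, samples_per_span=10)
```

Because a B-spline stays inside the convex hull of its control points, the sampled contour lies slightly inside the ellipse. A tracker along the lines described would then move the control points from frame to frame rather than the sampled contour itself.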
References
S. Pigeon. The M2VTS database. Technical report, UCL — Laboratoire de Télécommunications et Télédétection, Place du Levant, 2-B-1348 Louvain-La-Neuve, Belgium, http://www.tele.ucl.ac.be/M2VTS, 1996.
C. Bregler and Y. Konig. 'Eigenlips' for robust speech recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, volume 2, pages 669–672. IEEE, Piscataway, NJ, USA, 1994.
R. Kaucic, B. Dalton, and A. Blake. Real-time lip tracking for audio-visual speech recognition applications. In Fourth European Conference on Computer Vision (Cambridge, UK, 1996), volume 2, pages 376–386. Cambridge, 1996.
J. Luettin, N. A. Thacker, and W. Beet. Active shape models for visual speech feature extraction. Technical report, University of Sheffield, Sheffield, UK, 1995.
A. Blake and M. Isard. 3D position, attitude and shape input using video tracking of hands and lips. In SIGGRAPH '94 Conference Proceedings (Orlando, Florida, July 24–29, 1994), pages 185–192, 1515 Broadway, 17th floor, New York, NY 10036, USA, 1994.
C. Montacié, M. J. Caraty, R. André-Obrecht, L. J. Boë, P. Deléglise, M. El-Beze, I. Herlin, P. Jourlin, T. Lallouache, B. Leroy, and H. Méloni. Applications multimodales pour interfaces et bornes evoluées (AMIBE). Technical report, Laboratoire des Formes et d'Intelligence Artificielle (LAFORIA), Université Pierre et Marie Curie, Paris, France, 1995.
Y. Moses, D. Reynard, and A. Blake. Determining facial expressions in real time. In IEEE International Conference on Computer Vision, pages 296–301. IEEE, Piscataway, NJ, USA, 1995.
T. F. Cootes and C. J. Taylor. Active shape models: A quantitative evaluation. In J. Illingworth, editor, British Machine Vision Conference, pages 639–648. BMVA Press, 1993.
R. Bartels, J. Beatty, and B. Barsky. An Introduction to Splines for Use in Computer Graphics and Geometry Modeling. Morgan Kaufmann, 1987.
J. D. Foley, A. van Dam, S. K. Feiner, and J. F. Hughes. Computer Graphics: Principles and Practice. Addison-Wesley, 1990.
G. Farin. Curves and Surfaces for Computer Aided Geometric Design. Academic Press Ltd., 1993.
M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. In IEEE International Conference on Computer Vision, volume 1, pages 259–268. IEEE, 1987.
A. Blake, R. Curwen, and A. Zisserman. A framework for spatio-temporal control in the tracking of visual contours. International Journal of Computer Vision, 11(2):127–145, 1993.
A. Blake, M. Isard, and D. Reynard. Learning to track the visual motion of contours. Artificial Intelligence, in press, 1995.
J. Matas. Colour-based Object Recognition. PhD thesis, University of Surrey, 1995.
P. Duchnowski, M. Hunke, D. Buesching, U. Meier, and A. Waibel. Toward movement-invariant automatic lip-reading and speech recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing — Proceedings, volume 1, pages 109–112. IEEE, Piscataway, NJ, USA, 1995.
R. Brunelli and T. Poggio. Face recognition: Features versus templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(10):1042–1052, Oct 1993.
G. Galicia and A. Zakhor. Depth based recovery of human facial features from video sequences. In IEEE International Conference on Image Processing, volume 2, pages 603–606, Washington D.C., USA, October 23–26 1995. IEEE Computer Society Press, Los Alamitos, California.
R. C. Gonzalez and R. E. Woods. Digital Image Processing. Addison-Wesley, 1992.
K. Sobottka and I. Pitas. Localization of facial regions and features in color images. Journal of Pattern Recognition and Image Analysis, 1996.
R. Curwen and A. Blake. Dynamic Contours: Real-time Active Splines, chapter 3, pages 39–57. 1992.
K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, 1990.
A. Gelb (editor). Applied Optimal Estimation. MIT Press, Cambridge, MA, USA, 1974.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sánchez, M.U.R., Matas, J., Kittler, J. (1997). Statistical chromaticity models for lip tracking with B-splines. In: Bigün, J., Chollet, G., Borgefors, G. (eds) Audio- and Video-based Biometric Person Authentication. AVBPA 1997. Lecture Notes in Computer Science, vol 1206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0015981
DOI: https://doi.org/10.1007/BFb0015981
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62660-2
Online ISBN: 978-3-540-68425-1
eBook Packages: Springer Book Archive