Mouth Region Localization Method Based on Gaussian Mixture Model

Kumatani, Kenichi; Stiefelhagen, Rainer

doi:10.1007/11821045_12

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4153)

Included in the following conference series:

International Workshop on Intelligent Computing in Pattern Analysis and Synthesis

Abstract

This paper presents a new mouth region localization method that uses a Gaussian mixture model (GMM) of feature vectors extracted from mouth region images. Feature vectors based on the discrete cosine transform (DCT) and on principal component analysis (PCA) are evaluated in mouth localization experiments. The method is suitable for audio-visual speech recognition. The paper also introduces a new database that is available for audio-visual processing. The experimental results show that the proposed system localizes the mouth region with high accuracy (more than 95%), even when tracking results from preceding frames are unavailable.
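
As a rough illustration of the idea in the abstract, the sketch below scores candidate windows by the GMM log-likelihood of their DCT feature vectors and returns the best-scoring position. It is not the authors' implementation. The exhaustive sliding-window search, the choice of the 3 x 3 lowest-frequency coefficients, the diagonal covariances, the single-component model, and all function names and toy values are assumptions made for the example. In practice the mixture parameters would be estimated from training images of mouth regions rather than set by hand.

```python
import math

def dct2(patch):
    """Naive orthonormal 2-D DCT-II of a square patch (list of lists)."""
    n = len(patch)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (patch[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def dct_features(patch, k=3):
    """Keep the k x k lowest-frequency coefficients (k is an assumption)."""
    coeffs = dct2(patch)
    return [coeffs[u][v] for u in range(k) for v in range(k)]

def gmm_log_likelihood(x, weights, means, variances):
    """log p(x) under a diagonal-covariance Gaussian mixture."""
    logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        logs.append(ll)
    m = max(logs)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(l - m) for l in logs))

def localize(image, size, gmm, step=1):
    """Return (row, col) of the window with the highest GMM log-likelihood."""
    best, best_pos = -math.inf, None
    for r in range(0, len(image) - size + 1, step):
        for c in range(0, len(image[0]) - size + 1, step):
            patch = [row[c:c + size] for row in image[r:r + size]]
            score = gmm_log_likelihood(dct_features(patch), *gmm)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

# Toy data (hypothetical): a 4x4 template with a dark horizontal band,
# pasted at row 5, column 6 of a flat grey image.
template = [[200.0] * 4, [50.0] * 4, [50.0] * 4, [200.0] * 4]
image = [[128.0] * 12 for _ in range(10)]
for dy in range(4):
    for dx in range(4):
        image[5 + dy][6 + dx] = template[dy][dx]

# Hand-set one-component "mixture" centred on the template's features.
mu = dct_features(template)
gmm = ([1.0], [mu], [[25.0] * len(mu)])
print(localize(image, 4, gmm))
```

Because every window is scored independently, a search of this kind needs no result from a previous frame, which is the property the abstract highlights. A PCA-based variant would replace `dct_features` with a projection onto principal components learned from training patches.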

Author information

Authors and Affiliations

Interactive Systems Labs, Universitaet Karlsruhe (TH), Am Fasanengarten 5, 76131, Karlsruhe, Germany
Kenichi Kumatani & Rainer Stiefelhagen

Editor information

Editors and Affiliations

Xi’an Jiaotong University, 710049, Xi’an, China
Nanning Zheng
Department of Mathematics and Computer Science, University of Münster, Einsteinstrasse 62, D-48149, Münster, Germany
Xiaoyi Jiang
Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xianning West Road 28, 710049, Xi’an, China
Xuguang Lan

About this paper

Cite this paper

Kumatani, K., Stiefelhagen, R. (2006). Mouth Region Localization Method Based on Gaussian Mixture Model. In: Zheng, N., Jiang, X., Lan, X. (eds) Advances in Machine Vision, Image Processing, and Pattern Analysis. IWICPAS 2006. Lecture Notes in Computer Science, vol 4153. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11821045_12

DOI: https://doi.org/10.1007/11821045_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37597-5
Online ISBN: 978-3-540-37598-2
eBook Packages: Computer Science, Computer Science (R0)
