
Comparison of colour transforms used in lip segmentation algorithms

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Lip segmentation is a fundamental component in a range of applications, including automatic lip reading, emotion recognition and biometric speaker identification. The first step in lip segmentation involves applying a colour transform to enhance the contrast between the lips and the surrounding skin. However, there is much debate among researchers as to the best transform for this task. This article therefore presents the most comprehensive study to date, evaluating 33 colour transforms for lip segmentation: 21 channels from seven colour space models (RGB, HSV, YCbCr, YIQ, CIEXYZ, CIELUV and CIELAB) and 12 additional colour transforms (8 of which were designed specifically for lip segmentation). The comparison is extended to determine the best transform for segmenting the oral cavity. Histogram intersection and Otsu's discriminant are used to quantify and compare the transforms. The results for lip–skin segmentation validate the experimental approach, as 11 of the top 12 transforms are used for lip segmentation in the literature. The necessity of selecting the correct transform is demonstrated by an increase in segmentation accuracy of up to three times. Hue-based transforms, including pseudo hue and hue domain filtering, perform best for lip–skin segmentation, with the hue component of HSV achieving the greatest accuracy of 93.85%. The a* component of CIELAB performs best for lip–oral cavity segmentation, while pseudo hue and the LUX transform perform reasonably well for both tasks.
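Two of the ideas the abstract names can be illustrated in a few lines of Python: the pseudo-hue transform (h = R/(R+G), which is high for reddish lip pixels and lower for skin) and Otsu's method, here used as a simple threshold on the transformed values. This is a hedged sketch only, not the authors' evaluation pipeline; the toy pixel values and function names are invented for illustration.

```python
def pseudo_hue(r, g):
    """Pseudo hue of one pixel: R/(R+G), high for reddish lips, lower for skin."""
    return r / (r + g) if (r + g) > 0 else 0.0

def otsu_threshold(values, bins=256):
    """Otsu's method on values in [0, 1]: pick the cut that maximises
    between-class variance. Values below the returned threshold fall in
    the lower (skin) class."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the lower class
    sum0 = 0.0  # bin-index sum of the lower class
    for t in range(bins):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return (best_t + 1) / bins

# Toy example: reddish "lip" pixels vs. more balanced "skin" pixels.
pixels = [(200, 80), (210, 70), (190, 90),     # lips: R >> G
          (180, 150), (170, 160), (175, 155)]  # skin: R close to G
h = [pseudo_hue(r, g) for r, g in pixels]
t = otsu_threshold(h)
lips = [v >= t for v in h]  # [True, True, True, False, False, False]
```

In the paper itself, Otsu's discriminant serves as one of the quantitative measures of how well a transform separates the two classes; the sketch above uses it in its more common role as a binarisation threshold.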




Author information


Correspondence to Ashley D. Gritzman.


Cite this article

Gritzman, A.D., Rubin, D.M. & Pantanowitz, A. Comparison of colour transforms used in lip segmentation algorithms. SIViP 9, 947–957 (2015). https://doi.org/10.1007/s11760-014-0615-x
