Automatic computation of histogram threshold for lip segmentation using feedback of shape information

  • Original Paper
  • Signal, Image and Video Processing

Abstract

Threshold-based segmentation methods provide a simple and efficient way to implement lip segmentation. However, computing robust thresholds automatically remains a major challenge. This research proposes an adaptive method for selecting the histogram threshold, based on feedback of shape information. The proposed method avoids unnecessary overhead by first comparing the initial segmentation to a reference lip shape model to decide whether optimisation is required. Where optimisation is required, the algorithm adjusts the threshold until the segmentation is sufficiently similar to the reference shape model. The algorithm is tested on the AR Face Database by comparing the segmentation accuracy before and after optimisation. The proposed method increases the proportion of segmentations classified as ‘good’ (overlap above 90%) by 7.1 percentage points, and significantly improves the segmentation in challenging cases containing facial hair.
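
The feedback loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how similarity to the reference shape is measured or how the threshold is adjusted, so the Jaccard overlap, the simple scan over candidate thresholds, the default target of 0.9, and every function and parameter name here are hypothetical.

```python
import numpy as np


def overlap(mask, reference):
    """Jaccard overlap (intersection over union) of two boolean masks.

    Assumption: the paper's actual similarity measure is not given in the abstract.
    """
    union = np.logical_or(mask, reference).sum()
    if union == 0:
        return 1.0  # two empty masks are treated as identical
    return np.logical_and(mask, reference).sum() / union


def segment_with_shape_feedback(image, reference, initial_threshold,
                                candidates, target=0.9):
    """Return (threshold, mask, score) after optional threshold adjustment.

    image: 2-D array in which lip pixels are brighter than the background.
    reference: boolean mask standing in for the reference lip shape model.
    candidates: thresholds to try if the initial segmentation is rejected.
    """
    mask = image > initial_threshold
    best = (initial_threshold, mask, overlap(mask, reference))
    if best[2] >= target:
        return best  # initial segmentation is close enough; skip optimisation

    for t in candidates:
        mask = image > t
        score = overlap(mask, reference)
        if score > best[2]:
            best = (t, mask, score)
        if score >= target:
            break  # sufficiently similar to the reference shape
    return best


# Synthetic example: a bright lip region plus a dimmer distractor band
# (standing in for facial hair) that the initial threshold fails to reject.
image = np.zeros((8, 8))
image[3:5, 2:6] = 0.8   # lip region
image[6:8, :] = 0.45    # distractor band
reference = np.zeros((8, 8), dtype=bool)
reference[3:5, 2:6] = True

threshold, mask, score = segment_with_shape_feedback(
    image, reference, initial_threshold=0.3, candidates=[0.2, 0.4, 0.6, 0.8])
print(threshold, score)
```

In this toy case the initial threshold keeps the distractor band, so the shape check fails and the scan moves on until a candidate removes it. A real system would replace the fixed reference mask with a fitted shape model and would likely use a more efficient one-dimensional search than a linear scan.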

Acknowledgments

The financial assistance of the National Research Foundation (NRF) of South Africa towards this research is hereby acknowledged (Grant No. 97742).

Author information

Corresponding author

Correspondence to Ashley D. Gritzman.

About this article

Cite this article

Gritzman, A.D., Aharonson, V., Rubin, D.M. et al. Automatic computation of histogram threshold for lip segmentation using feedback of shape information. SIViP 10, 869–876 (2016). https://doi.org/10.1007/s11760-015-0834-9

  • DOI: https://doi.org/10.1007/s11760-015-0834-9
