Automatic Recognition of Isolated Vowels Using F0-Normalized Harmonic Features

  • Conference paper
e-Business and Telecommunications (ICETE 2008)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 48)

Abstract

Human recognition of isolated vowels is quite robust to intra- and inter-speaker variability. Automatic recognition techniques typically perform poorly, notably for female or child speech, because a higher fundamental frequency (F0) produces a sparser sampling of the magnitude spectrum.
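The sparser sampling can be made concrete with a small count of harmonic partials. The sketch below is only an illustration: the 4 kHz band and the three F0 values are assumptions chosen for the example, not figures from the paper.

```python
def harmonics_below(f0_hz, band_hz=4000.0):
    """Count harmonic partials k * f0_hz (k >= 1) at or below band_hz."""
    return int(band_hz // f0_hz)


# Illustrative F0 values (assumed, not taken from the paper's database).
for label, f0_hz in [("lower F0", 110.0), ("mid F0", 220.0), ("higher F0", 330.0)]:
    print(f"{label}: F0 = {f0_hz:.0f} Hz -> {harmonics_below(f0_hz)} partials below 4 kHz")
```

Tripling F0 leaves one third as many partials to outline the same spectral envelope, which is the difficulty the abstract points to for female and child voices.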

In this paper we extend previous results on a perceptually motivated concept of vowel recognition based on Perceptual Spectral Clusters (PSC) of harmonic partials. We study the effect of normalizing relevant PSC features by F0, taking as a reference the recognition performance of static features derived from either Linear Prediction (LP) analysis or Mel-Frequency Cepstral Coefficients (MFCC). Classification uses the Mahalanobis distance on a database of five natural Portuguese vowel sounds uttered by 44 speakers. Test results reveal that the recognition performance of F0-normalized PSC features improves and approaches that of MFCCs. These results are significant because PSC-related features are amenable to concurrent vowel identification, whereas LP- or MFCC-related features are not.
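As a rough illustration of Mahalanobis-distance classification on F0-normalized features, the following minimal sketch may help. Everything specific in it is an assumption: the two-dimensional feature vector, the normalization (dividing a frequency by F0), the toy training values, and the helper names `normalize_by_f0`, `fit_class`, `mahalanobis`, and `classify`. It does not reproduce the paper's PSC feature extraction or its data.

```python
import math


def normalize_by_f0(freqs_hz, f0_hz):
    """Express frequencies as multiples of F0 (an assumed form of normalization)."""
    return tuple(f / f0_hz for f in freqs_hz)


def fit_class(points):
    """Return (mean, covariance) for 2-D points; covariance is (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis(x, mean, cov):
    """Mahalanobis distance of 2-D point x from a class with the given mean and covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix is not positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # d^2 = [dx dy] * inverse(S) * [dx dy]^T, with the 2x2 inverse written out.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(d2, 0.0))


def classify(x, models):
    """Pick the class label whose model is nearest to x in Mahalanobis distance."""
    return min(models, key=lambda label: mahalanobis(x, *models[label]))


# Toy, made-up F0-normalized feature vectors for two vowel classes.
TRAINING = {
    "a": [(5.0, 9.0), (5.5, 9.4), (4.6, 9.1), (5.2, 8.6)],
    "i": [(2.0, 15.0), (2.3, 14.8), (1.8, 15.5), (2.2, 15.4)],
}
MODELS = {label: fit_class(points) for label, points in TRAINING.items()}

# A hypothetical test token: two frequencies in Hz and an F0 of 200 Hz.
token = normalize_by_f0((980.0, 1840.0), 200.0)
print("predicted vowel:", classify(token, MODELS))
```

Unlike a plain Euclidean distance, the Mahalanobis distance scales each direction by the class covariance, so a feature that varies widely within a vowel class counts for less than one that is tightly clustered.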

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ferreira, A. (2009). Automatic Recognition of Isolated Vowels Using F0-Normalized Harmonic Features. In: Filipe, J., Obaidat, M.S. (eds) e-Business and Telecommunications. ICETE 2008. Communications in Computer and Information Science, vol 48. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05197-5_22

  • DOI: https://doi.org/10.1007/978-3-642-05197-5_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05196-8

  • Online ISBN: 978-3-642-05197-5

  • eBook Packages: Computer Science, Computer Science (R0)
