Detecting Broad Phonemic Class Boundaries from Greek Speech in Noise Environments

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4188)

Abstract

In this work, we evaluate an implicit approach to the automatic segmentation of continuous speech signals into the broad phonemic classes of the Greek language. The framework was evaluated on clean speech and on speech with additive white, pink, babble, car, and machine-gun noise. The results are promising: on clean speech the framework achieved an accuracy of 76.1% (for distances of less than 25 msec to the actual segmentation point) without over-segmenting the speech signal. In wideband additive noise environments, the total accuracy of the segmentation framework dropped by 4% on average.
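
The accuracy figure above is defined by a distance tolerance around each actual segmentation point. The scoring procedure itself is not given on this page, so the following is only a minimal sketch of one common way to compute such a figure, under stated assumptions: boundaries are given in integer milliseconds, a reference boundary counts as detected if any detected boundary lies strictly less than 25 ms away, and over-segmentation is taken as the relative surplus of detected boundaries. The function names and the sample values are hypothetical.

```python
from bisect import bisect_left


def boundary_accuracy(reference_ms, detected_ms, tolerance_ms=25):
    """Fraction of reference boundaries that have a detected boundary
    strictly closer than tolerance_ms (assumed scoring rule)."""
    if not reference_ms:
        return 0.0
    detected = sorted(detected_ms)
    hits = 0
    for ref in reference_ms:
        # Only the nearest detected boundary on each side can be the closest.
        i = bisect_left(detected, ref)
        neighbours = detected[max(i - 1, 0):i + 1]
        if any(abs(d - ref) < tolerance_ms for d in neighbours):
            hits += 1
    return hits / len(reference_ms)


def over_segmentation(reference_ms, detected_ms):
    """Relative surplus of detected boundaries; 0.0 means equal counts."""
    return (len(detected_ms) - len(reference_ms)) / len(reference_ms)


# Made-up boundary positions, for illustration only.
reference_ms = [120, 260, 410, 575]
detected_ms = [110, 290, 400, 580]

print(boundary_accuracy(reference_ms, detected_ms))  # 0.75
print(over_segmentation(reference_ms, detected_ms))  # 0.0
```

This sketch does not enforce a one-to-one match between reference and detected boundaries, so a single detected boundary could satisfy two nearby reference points; a stricter scorer would pair them explicitly.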

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mporas, I., Zervas, P., Fakotakis, N. (2006). Detecting Broad Phonemic Class Boundaries from Greek Speech in Noise Environments. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science, vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_60

  • DOI: https://doi.org/10.1007/11846406_60

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-39090-9

  • Online ISBN: 978-3-540-39091-6

  • eBook Packages: Computer Science, Computer Science (R0)
