Prosodic Features and Formant Contribution for Arabic Speech Recognition in Noisy Environments

Amrous, Anissa Imen; Debyeche, Mohamed; Amrouche, Abderrahman

doi:10.1007/978-3-642-19644-7_49

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 87)

Abstract

This paper investigates the contribution of formants and prosodic features such as pitch and energy to Arabic speech recognition under real-life conditions. Our speech recognition system, based on Hidden Markov Models (HMMs), is implemented using the HTK toolkit. The front end of the system combines conventional Mel-Frequency Cepstral Coefficients (MFCCs) with prosodic information and formants. The results show that the resulting multivariate feature vectors significantly improve recognition performance in noisy environments compared with the cepstral system alone.
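As a rough illustration of what "combining" cepstral and prosodic features can mean at the frame level, the sketch below appends a log-energy value and an autocorrelation-based pitch estimate to each cepstral vector. It is not the authors' HTK front end: the function names, the pitch search range (60 to 400 Hz), and the choice of an autocorrelation pitch estimator are assumptions made for this example, the cepstral vectors are assumed to be computed elsewhere, and formant extraction is left out for brevity.

```python
import math

def frame_signal(signal, frame_len, hop):
    """Split a list of samples into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def log_energy(frame):
    """Log of the frame's summed squared amplitude (floored to avoid log(0))."""
    return math.log(max(sum(x * x for x in frame), 1e-10))

def autocorr_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate pitch in Hz as the lag with the largest autocorrelation.

    The search range fmin..fmax is an assumed default, not taken from the paper.
    """
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    # 0.0 marks a frame where no positive autocorrelation peak was found
    return sample_rate / best_lag if best_lag else 0.0

def augment(cepstra, signal, sample_rate, frame_len, hop):
    """Append [log-energy, pitch] to each cepstral vector, frame by frame."""
    frames = frame_signal(signal, frame_len, hop)
    return [list(c) + [log_energy(f), autocorr_pitch(f, sample_rate)]
            for c, f in zip(cepstra, frames)]

# Usage: a synthetic 200 Hz tone sampled at 8 kHz, with placeholder
# 12-dimensional cepstral vectors standing in for real MFCCs.
signal = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(1600)]
cepstra = [[0.0] * 12 for _ in range(7)]
features = augment(cepstra, signal, 8000, frame_len=400, hop=200)
```

Each resulting vector has 14 dimensions: the 12 placeholder cepstral values followed by log-energy and pitch. A real system would also need to handle unvoiced frames and to normalise the different feature scales before HMM training.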

Author information

Authors and Affiliations

Speech Communication and Signal Processing Laboratory (LPCTS), Faculty of Electronics and Computer Sciences, USTHB, P.O. Box 32, Bab Ezzouar, Algiers, Algeria
Anissa Imen Amrous, Mohamed Debyeche & Abderrahman Amrouche

Editor information

Editors and Affiliations

Universidad de Salamanca, Plaza de la Merced S/N, 37008, Salamanca, Spain
Emilio Corchado
VŠB-TU Ostrava, 17. listopadu 15, 70833, Ostrava, Czech Republic
Václav Snášel
University of Burgos, Avenida Cantaria S/N, 09006, Burgos, Spain
Javier Sedano
Cairo University, 5 Ahmed Zewal St., Orman, Cairo, Egypt
Aboul Ella Hassanien
University of La Coruña, Avda. 19 de Febrero, S/N, A Coruña, 15403, Ferrol, Spain
José Luis Calvo
Infobright, 47 Colborne Street, Suite 403, M5E1P8, Toronto, Ontario, Canada
Dominik Ślȩzak

Cite this paper

Amrous, A.I., Debyeche, M., Amrouche, A. (2011). Prosodic Features and Formant Contribution for Arabic Speech Recognition in Noisy Environments. In: Corchado, E., Snášel, V., Sedano, J., Hassanien, A.E., Calvo, J.L., Ślȩzak, D. (eds) Soft Computing Models in Industrial and Environmental Applications, 6th International Conference SOCO 2011. Advances in Intelligent and Soft Computing, vol 87. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19644-7_49

DOI: https://doi.org/10.1007/978-3-642-19644-7_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19643-0
Online ISBN: 978-3-642-19644-7
eBook Packages: Engineering, Engineering (R0)
