A noise-robust front-end for distributed speech recognition in mobile communications

Addou, Djamel; Selouani, Sid-Ahmed; Kifaya, Kaoukeb; Boudraa, Malika; Boudraa, Bachir

doi:10.1007/s10772-009-9025-9

Published: 26 March 2009

Volume 10, pages 167–173 (2007)

International Journal of Speech Technology

Djamel Addou¹, Sid-Ahmed Selouani², Kaoukeb Kifaya², Malika Boudraa¹ & Bachir Boudraa¹

Abstract

This paper investigates a new front-end processing scheme that aims to improve speech recognition performance in noisy mobile environments. The approach combines conventional Mel-frequency cepstral coefficients (MFCCs), line spectral frequencies (LSFs), and formant-like (FL) features into robust multivariate feature vectors. The resulting front-end is an alternative to the DSR-XAFE (XAFE: eXtended Audio Front-End) available in GSM mobile communications. Our results show that, for highly noisy speech, combining these spectral cues leads to a significant improvement in recognition accuracy on the Aurora 2 task.
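
To make the idea of a multivariate feature vector concrete, the sketch below joins three per-frame feature streams into one vector per frame. It is only an illustration: the abstract does not say how the MFCC, LSF, and FL features are combined, so plain frame-wise concatenation is an assumption, and the function name combine_streams, the stream dimensions, and the toy values are all hypothetical.

```python
from typing import List, Sequence

Frame = List[float]


def combine_streams(*streams: Sequence[Sequence[float]]) -> List[Frame]:
    """Concatenate per-frame vectors from several feature streams.

    Each stream is a sequence of frames, and all streams must contain the
    same number of frames. Frame-wise concatenation is an assumption made
    for illustration; the abstract does not specify the combination rule.
    """
    if not streams:
        raise ValueError("at least one feature stream is required")
    n_frames = len(streams[0])
    if any(len(stream) != n_frames for stream in streams):
        raise ValueError("all streams must have the same number of frames")
    # For each frame index t, append the values of every stream in order.
    return [
        [value for stream in streams for value in stream[t]]
        for t in range(n_frames)
    ]


# Hypothetical toy data: 2 frames, 3 MFCCs, 2 LSFs, 1 formant-like feature.
mfcc = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
lsf = [[0.1, 0.2], [0.3, 0.4]]
fl = [[500.0], [550.0]]

combined = combine_streams(mfcc, lsf, fl)
```

Each row of combined holds the MFCC values first, then the LSF values, then the FL value for that frame. A multi-stream recognizer could instead keep the streams separate and weight them individually, which this sketch does not attempt.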

Author information

Authors and Affiliations

Speech and Signal Processing Lab., USTHB University of Science and Technology, Algiers, Algeria
Djamel Addou, Malika Boudraa & Bachir Boudraa
LARIHS Lab., Université de Moncton, Shippagan campus, New Brunswick, Canada
Sid-Ahmed Selouani & Kaoukeb Kifaya

Corresponding author

Correspondence to Sid-Ahmed Selouani.

About this article

Cite this article

Addou, D., Selouani, SA., Kifaya, K. et al. A noise-robust front-end for distributed speech recognition in mobile communications. Int J Speech Technol 10, 167–173 (2007). https://doi.org/10.1007/s10772-009-9025-9

Received: 08 March 2009
Accepted: 10 March 2009
Published: 26 March 2009
Issue Date: December 2007
DOI: https://doi.org/10.1007/s10772-009-9025-9
