On the Use of Asymmetric Windows for Robust Speech Recognition

Circuits, Systems, and Signal Processing

Abstract

This paper addresses the problem of finding a suitable window for robust speech recognition in noisy conditions. It proposes a set of asymmetric windows, denoted DDR_{c,w}, that are controlled by two parameters: center c and width w. These windows are derived from the DDR window used in the higher-lag autocorrelation spectrum estimation (HASE) method and are applied to the one-sided autocorrelation (OSA) sequence in order to perform spectral estimation. The two parameters, c and w, control how much weight is given to the first (noisy) autocorrelation coefficients and allow the important coefficients to be emphasized. Finally, the best window of the proposed set is shown to be DDR_{62,200}. This window is centered around the average pitch of human speech and yields higher speech recognition performance on the Aurora-2 and Aurora-3 databases than previously proposed windows.
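The idea of weighting the one-sided autocorrelation with a window of center c and width w before spectral estimation can be illustrated with a minimal sketch. The abstract does not give the formula of the DDR_{c,w} window, so everything below is an assumption made for illustration: the bell shape (a Hamming window convolved with itself), the interpretation of c and w as lag samples, and the function names `one_sided_autocorrelation`, `lag_window`, and `windowed_spectrum` are all hypothetical and are not taken from the paper.

```python
import cmath
import math


def one_sided_autocorrelation(frame, max_lag):
    """Biased autocorrelation estimate r[k] for lags k = 0 .. max_lag."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + k] for i in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


def lag_window(c, w, n_lags):
    """Hypothetical stand-in for a window with center c and width w.

    ASSUMPTION: the shape is a Hamming window convolved with itself,
    spanning w lags with its peak (normalized to 1) at lag c. The part
    that would fall at negative lags is discarded, which makes the
    window asymmetric whenever c < w / 2.
    """
    half = w // 2
    hamming = [0.54 - 0.46 * math.cos(2 * math.pi * i / half)
               for i in range(half + 1)]
    bell = [
        sum(hamming[j] * hamming[i - j]
            for j in range(max(0, i - half), min(i, half) + 1))
        for i in range(2 * half + 1)
    ]
    peak = bell[half]
    window = [0.0] * n_lags
    for i, value in enumerate(bell):
        lag = c - half + i
        if 0 <= lag < n_lags:
            window[lag] = value / peak
    return window


def windowed_spectrum(frame, c, w, n_lags, n_bins):
    """Magnitude of the DFT of the windowed one-sided autocorrelation.

    Bin b corresponds to the normalized frequency b / (2 * n_bins)
    cycles per sample, so the bins cover 0 up to just below 0.5.
    """
    r = one_sided_autocorrelation(frame, n_lags - 1)
    win = lag_window(c, w, n_lags)
    weighted = [rk * wk for rk, wk in zip(r, win)]
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * b * k / (2 * n_bins))
                for k, v in enumerate(weighted)))
        for b in range(n_bins)
    ]


# Usage: a synthetic frame with a period of 16 samples (illustrative values).
frame = [math.cos(2 * math.pi * i / 16) for i in range(256)]
spectrum = windowed_spectrum(frame, c=32, w=64, n_lags=128, n_bins=64)
print(max(range(len(spectrum)), key=spectrum.__getitem__))
```

With c = 62 and w = 200, this stand-in window gives only partial weight to the lowest lags and full weight at lag 62, which mirrors the behavior described above in qualitative terms only. The actual DDR_{c,w} definition should be taken from the full article.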

Author information

Corresponding author

Correspondence to Victoria Sánchez.

About this article

Cite this article

Morales-Cordovilla, J.A., Sánchez, V., Gómez, A.M. et al. On the Use of Asymmetric Windows for Robust Speech Recognition. Circuits Syst Signal Process 31, 727–736 (2012). https://doi.org/10.1007/s00034-011-9349-y

  • DOI: https://doi.org/10.1007/s00034-011-9349-y
