Abstract
This paper addresses the problem of finding a suitable window for robust speech recognition in noisy conditions. A set of asymmetric windows, denoted DDR c,w, is proposed, each controlled by two parameters: center c and width w. These windows are derived from the DDR window used in the higher-lag autocorrelation spectrum estimation (HASE) method and are applied to the one-sided autocorrelation (OSA) sequence in order to perform spectral estimation. The two parameters, c and w, control how much weight is given to the first, noise-affected autocorrelation coefficients and allow the more important coefficients to be emphasized. Finally, it is shown that the best window of the proposed set is DDR 62,200. This window is centered around the average pitch of human speech and yields higher speech recognition performance on the Aurora-2 and Aurora-3 databases than previously proposed windows.
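The idea of weighting the OSA with a window of center c and width w can be illustrated with a minimal sketch. The abstract does not give the DDR c,w definition, so the code below substitutes a plain Hann shape placed at lag c and truncated at lag 0. This stand-in, the function names, the frame length, and the FFT-magnitude spectral estimate are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def one_sided_autocorrelation(frame, num_lags):
    """Biased autocorrelation estimate for lags 0 .. num_lags - 1."""
    n = len(frame)
    full = np.correlate(frame, frame, mode="full")  # lags -(n-1) .. (n-1)
    return full[n - 1 : n - 1 + num_lags] / n


def centered_lag_window(num_lags, center, width):
    """Hypothetical stand-in for a DDR c,w window (NOT the paper's definition).

    A Hann shape of `width` lags is centered at lag `center`; the part that
    would fall at negative lags is discarded, which makes the window
    asymmetric over the one-sided lag axis.
    """
    window = np.zeros(num_lags)
    hann = np.hanning(width)
    start = center - width // 2          # may be negative
    idx = np.arange(num_lags) - start    # position of each lag inside the Hann
    valid = (idx >= 0) & (idx < width)
    window[valid] = hann[idx[valid]]
    return window


def windowed_osa_spectrum(frame, center, width, n_fft=512):
    """Magnitude spectrum of the windowed OSA (assumed spectral estimate)."""
    osa = one_sided_autocorrelation(frame, len(frame))
    window = centered_lag_window(len(osa), center, width)
    return np.abs(np.fft.rfft(osa * window, n_fft))


# Synthetic frame: a tone with a period of 62 samples plus white noise.
rng = np.random.default_rng(0)
t = np.arange(256)
frame = np.sin(2 * np.pi * t / 62) + 0.5 * rng.standard_normal(256)
spectrum = windowed_osa_spectrum(frame, center=62, width=200)
```

With c = 62 and w = 200, this stand-in gives lag 0 a smaller weight than the lags around 62, which mirrors the stated goal of de-emphasizing the first autocorrelation coefficients. If the signal were sampled at 8 kHz (an assumption; the sampling rate is not stated above), lag 62 would correspond to a period of 7.75 ms, or roughly 129 Hz.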
Cite this article
Morales-Cordovilla, J.A., Sánchez, V., Gómez, A.M. et al. On the Use of Asymmetric Windows for Robust Speech Recognition. Circuits Syst Signal Process 31, 727–736 (2012). https://doi.org/10.1007/s00034-011-9349-y
DOI: https://doi.org/10.1007/s00034-011-9349-y