Mitigate the reverberation effect on the speaker verification performance using different methods

International Journal of Speech Technology

Abstract

Speech recorded in the far field, with the receiver distant from the talker, typically contains additive noise and reverberation. Both degrade the reliability and intelligibility of the speech signal and the accuracy of speaker recognition systems, with severe consequences in a wide range of real applications. Channel equalization, i.e. removing or reducing channel effects, mitigates the mismatch problem to some extent, but at the cost of added distortion to the vulnerable speech signal itself, so its effectiveness is limited. Recent research indicates that a newer speaker feature, gammatone frequency cepstral coefficients (GFCC), is more robust to noise and reverberation than other features. This paper proposes two methods to combat the effect of reverberation on speaker verification performance. The first uses GFCC as a robust feature to alleviate the effect of reverberation on system performance. The second uses multi-condition training to combat the reverberation effect. Speaker verification experiments in artificial and real reverberant conditions show the effectiveness of the proposed methods in terms of a lower equal error rate (EER) and improved detection error trade-off (DET) curves.
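As a rough illustration of two ideas named in the abstract, the Python sketch below builds reverberant copies of a clean signal, as one might for training on several conditions if "multi training" is read that way, and computes an equal error rate from verification scores. It is not the paper's implementation. The function names, the exponentially decaying noise used as a room impulse response, and the RT60 values are all assumptions made for the example; a measured or properly simulated impulse response would replace the synthetic one in practice.

```python
import numpy as np

def synthetic_rir(rt60, fs=16000, seed=0):
    """Hypothetical stand-in for a room impulse response:
    white noise with an exponential decay set by rt60 (seconds)."""
    rng = np.random.default_rng(seed)
    n = int(rt60 * fs)
    t = np.arange(n) / fs
    # Energy falls by 60 dB over rt60 seconds, so amplitude falls by 10**(-3 * t / rt60).
    envelope = 10.0 ** (-3.0 * t / rt60)
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # direct path
    return rir / np.max(np.abs(rir))

def reverberate(clean, rir):
    """Convolve clean speech with the impulse response, keep the original length,
    and peak-normalise the result."""
    wet = np.convolve(clean, rir)[: len(clean)]
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet

def multi_condition_set(clean, rt60_values, fs=16000):
    """Return the clean signal (key 0.0) plus one reverberant copy per RT60 value."""
    conditions = {0.0: clean}
    for i, rt60 in enumerate(rt60_values):
        conditions[rt60] = reverberate(clean, synthetic_rir(rt60, fs, seed=i))
    return conditions

def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER: the operating point where the false acceptance rate
    and the false rejection rate are closest, averaged at that point."""
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, impostor]))
    # False rejection: target trials scoring below the threshold.
    frr = np.array([(target < th).mean() for th in thresholds])
    # False acceptance: impostor trials scoring at or above the threshold.
    far = np.array([(impostor >= th).mean() for th in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0
```

In such a setup, features would be extracted from every entry returned by `multi_condition_set` and pooled when training the speaker models, and `equal_error_rate` would be applied to the target and impostor trial scores of each test condition. The threshold sweep only visits observed scores, so the value is an approximation on small trial lists.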

Author information

Corresponding author

Correspondence to Khamis A. Al-Karawi.

About this article

Cite this article

Al-Karawi, K.A. Mitigate the reverberation effect on the speaker verification performance using different methods. Int J Speech Technol 24, 143–153 (2021). https://doi.org/10.1007/s10772-020-09780-1

  • DOI: https://doi.org/10.1007/s10772-020-09780-1
