Skip to main content
Log in

Speaker recognition based on pre-processing approaches

  • Published:
International Journal of Speech Technology Aims and scope Submit manuscript

Abstract

This paper presents two pre-processing methods that can be implemented for noise reduction in speaker recognition systems. These methods are adaptive noise canceller (ANC) and Savitzky-Golay (SG) filter. Also, discrete cosine transform (DCT), discrete wavelet transform (DWT) and discrete sine transform (DST) are considered for consistent feature extraction from noisy speech signals. A neural network with only one hidden layer is used as a classifier. The performances of the proposed noise reduction methods are compared with those of a hybrid method that comprises empirical mode decomposition (EMD) and spectral subtraction and also with spectral subtraction method only. Recognition rate is taken as a performance metric to evaluate the behavior of the system with these enhancement strategies. Simulation results prove that the DCT is the optimum transform with the suggested methods, while the DWT is the best one with the hybrid method and the spectral subtraction method.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11

Similar content being viewed by others

References

  • Abd El-Samie, F.E. (2011). El-Samie ‘information security for automatic speaker identification. Springer Briefs in Electrical and Computer Engineering.

  • Alotaiby, T., Alshebeili, S.A., Alshawi, T., Ahmad, I., Abd El-Samie, F.E. (2014). EEG seizure detection and prediction algorithms: A survey. EURASIP Journal on Advances in Signal Processing.

  • Anand, V., Shah, S., Kumar, S. (2013). Intelligent adaptive filtering for noise cancellation. International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering 2.

  • Awal, Md A, Mostafa, S. S., & Ahmad, M. (2011). Performance analysis of savitzky-golay smoothing filter using ECG signal. IJCIT,01, 90–95.

    Google Scholar 

  • Campbell, J.P. (1997). Speaker recognition: A tutorial, Senior Member, IEEE. In Proceedings of IEEE, Vol. 85.

  • Das, A., Jena, M. R., & Barik, K. K. (2014). Mel-frequency cepstral coefficient (MFCC)—A novel method for speaker recognition. Digital Technologies,1(1), 1–3.

    Google Scholar 

  • Dhubkarya, D. C., Katara, A., & Thenua, R. K. (2012). Simulation of adaptive noise canceller for an ECG signal analysis. International Journal on Signal & Image Processing,03(01), 1–12.

    Google Scholar 

  • Dreyfus, G. (2005). Neural networks: Methodology and applications (pp. 1–83). Berlin: Springer.

    Book  Google Scholar 

  • El-Moneim, S. A., Dessouky, M. I., Nassar, M. A., El-Naby, M. A., & Abd El-Samie, F. E. (2015). Hybrid speech enhancement with empirical mode decomposition and spectral subtraction for efficient speaker identification. International Journal of Speech Technology,2015(18), 555–564. https://doi.org/10.1007/s10772-015-9293-5.

    Article  Google Scholar 

  • Evans, N.W.D., Mason, J.S., Liu, W.M. and Fauve, B. (2006). An assessment on the fundamental limitations of spectral subtraction. In IEEE (pp. 145–148).

  • Fisusi, A.A., Yesufu, T.K. (2007). Speaker recognition systems: A tutorial. African Journal of Information and Communication Technology, 3(2).

  • Furui, S. (1981). Cepstral analysis technique for automatic speaker verification. IEEE Transactions of Acoustics, and Signal Processing,29(2), 254–272.

    Article  Google Scholar 

  • Galushkin, A. I. (2007). Neural network theory. Berlin: Springer.

    MATH  Google Scholar 

  • Guiñón, J. L. et al. (2007). Moving average and savitzki-golay smoothing filters using mathcad. In Proceedings of the International Conference on Engineering Education—ICEE.

  • Hossain, M.M., Ahmed, B. and Asrafi, M. (2007). A real time speaker identification using artificial neural network, conference: Computer and information technology, IEEE.

  • Jafari, M.G., and Chambers, J.A. (2003). Adaptive noise cancellation and blind source separation. In 4th International symposium on independent component analysis and blind signal separation (pp. 627–632).

  • Kaladharan, N. (2014). Speech enhancement by spectral subtraction method. International Journal of Computer Applications,96(13), 45–48.

    Article  Google Scholar 

  • Karam, M., Khazaal, H. F., Aglan, H., & Cole, C. (2014). Noise removal in speech processing using spectral subtraction. Journal of Signal and Information Processing,5, 32–41.

    Article  Google Scholar 

  • Kim, D., & Oh, H.-S. (2009). EMD: A package for empirical mode decomposition and hilbert spectrum. The R Journal,1(1), 40–46.

    Article  Google Scholar 

  • Love, B.J., Vining, J., Sun, X. (2004). Automatic speaker recognition using neural network, intro (pp. 1–25). Neural Networks Electrical and Computer Engineering Department, The University of Texas at Austin, Spring.

  • Muda, L., Begam, M., & Elamvazuthi, I. (2010). Voice recognition algorithms using mel frequency cepstral coefficient (MFCC) and dynamic time warping (DTW) techniques. Journal of Computing,2, 138–143.

    Google Scholar 

  • Nasr, Marwa A., et al. (2018). Speaker identification based on normalized pitch frequency and Mel frequency cepstral coefficients. International Journal of Speech Technology. https://doi.org/10.1007/s10772-018-9524-7.

    Article  Google Scholar 

  • Pawar, A.P., Choudhari, K.B. (2013). Enhancement of speech in noisy conditions. The International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, 2(7).

  • Reynolds, D.A. (2002). An overview of automatic speaker recognition technology. IEEE, pp. 4072–4075.

  • Rilling, G., Flandrin, P. and Goncalves, P. (2003). On empirical mode decomposition and its algorithms. In IEEE-EURASIP workshop on nonlinear signal and image processing.

  • Schafer, R.W. (2011). What is a Savitzky-Golay filter. IEEE Signal Processing Magazine.

  • Shajeesh, K. U., Sachin Kumar, S., & Soman, K. P. (2012). A two stage algorithm for denoising of speech signal. IOSR. Journal of Computer Engineering (IOSRJCE),8, 48–53.

    Article  Google Scholar 

  • Shanmugam, A., et al. (2013a). Adaptive noise cancellation for speech processing in real time environment. International Journal of Engineering Research and Applications (IJERA),3(2), 1102–1106.

    Google Scholar 

  • Shanmugam, et al. (2013b). Adaptive noise cancellation for speech processing in real time environment. International Journal of Engineering Research and Applications (IJERA),3, 1102–1106.

    Google Scholar 

  • Sharma, A., Singh, S. P., & Kumar, V. (2005). Text-independent speaker identification using backpropagation MLP network classifier for a closed set of speakers (pp. 665–669). IEEE International Symposium on Signal Processing and Information Technology: INDIA.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Fathi E. Abd El-Samie.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

El-Moneim, S.A., EL-Rabaie, ES.M., Nassar, M.A. et al. Speaker recognition based on pre-processing approaches. Int J Speech Technol 23, 435–442 (2020). https://doi.org/10.1007/s10772-019-09659-w

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10772-019-09659-w

Keywords

Navigation