Hybrid speech enhancement with empirical mode decomposition and spectral subtraction for efficient speaker identification

International Journal of Speech Technology

Abstract

Speech enhancement is an important pre-processing step in speech processing applications such as speech recognition, speaker identification, speech coding, and speech synthesis. This paper focuses on speech enhancement prior to speaker identification, because degraded speech signals can be difficult to hear and understand and can hinder speaker recognition. It presents a hybrid speech enhancement method that combines empirical mode decomposition with spectral subtraction to improve the quality of speech signals before speaker identification. Simulation results show that using the proposed method as a pre-processing step improves speaker recognition rates.
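
The spectral subtraction stage mentioned in the abstract can be illustrated with a minimal sketch. The full text is not part of this preview, so the code below is a generic textbook form of magnitude spectral subtraction and not necessarily the authors' variant; the empirical mode decomposition stage is not shown. The function name `spectral_subtract`, its parameters, and the assumption that the first few frames contain noise only are all illustrative choices.

```python
import numpy as np

def spectral_subtract(noisy, frame_len=256, hop=128, noise_frames=5, floor=0.01):
    """Generic magnitude spectral subtraction with overlap-add resynthesis.

    Assumption (not from the paper): the first `noise_frames` frames
    contain noise only and are used to estimate the noise magnitude.
    """
    noisy = np.asarray(noisy, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Split into overlapping windowed frames and take the short-time spectrum.
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spectra), np.angle(spectra)

    # Average magnitude of the leading frames serves as the noise estimate.
    noise_mag = mag[:noise_frames].mean(axis=0)

    # Subtract the noise estimate; a small spectral floor avoids negative values.
    clean_mag = np.maximum(mag - noise_mag, floor * noise_mag)

    # Resynthesize with the noisy phase and overlap-add the frames.
    clean_frames = np.fft.irfft(clean_mag * np.exp(1j * phase),
                                n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i in range(n_frames):
        start = i * hop
        out[start:start + frame_len] += clean_frames[i] * window
        norm[start:start + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)

# Usage on a synthetic signal: a 440 Hz tone that starts after 0.25 s of
# silence, corrupted by white noise (hypothetical test data).
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = np.where(t > 0.25, np.sin(2 * np.pi * 440 * t), 0.0)
noisy = clean + 0.1 * rng.standard_normal(fs)
enhanced = spectral_subtract(noisy)
```

In a hybrid scheme of the kind the abstract describes, a routine like this would be combined with an empirical mode decomposition step; how the two stages are arranged is specified in the full paper and is not assumed here.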

Author information

Corresponding author

Correspondence to Fathi E. Abd El-Samie.

About this article

Cite this article

Abd El-Moneim, S., Dessouky, M.I., Abd El-Samie, F.E. et al. Hybrid speech enhancement with empirical mode decomposition and spectral subtraction for efficient speaker identification. Int J Speech Technol 18, 555–564 (2015). https://doi.org/10.1007/s10772-015-9293-5

  • DOI: https://doi.org/10.1007/s10772-015-9293-5
