Abstract
Speech enhancement is an important pre-processing step in speech processing applications such as speech recognition, speaker identification, speech coding, and speech synthesis. This paper focuses on speech enhancement prior to speaker identification, because degradation of the speech signal can hinder hearing, understanding, and speaker recognition. It presents a hybrid speech enhancement method that combines empirical mode decomposition with spectral subtraction to improve the quality of speech signals before speaker identification. Simulation results show that using the proposed method as a pre-processing step improves speaker recognition rates.
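As an illustration of the spectral subtraction stage only, the following is a minimal single-frame sketch and not the authors' implementation. The abstract does not say how the method combines empirical mode decomposition with spectral subtraction, so the decomposition stage is omitted here. The function names (`dft`, `idft`, `spectral_subtract`), the `floor` parameter, and the use of a naive DFT are all assumptions made for this example.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for a short frame."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def spectral_subtract(noisy_frame, noise_mag, floor=0.01):
    """Magnitude spectral subtraction on one frame (hypothetical helper).

    noise_mag is an assumed per-bin estimate of the noise magnitude, for
    example averaged over speech-free frames. The noisy phase is kept, and
    each bin is limited to floor * original magnitude so that it never
    becomes negative.
    """
    cleaned = []
    for value, noise in zip(dft(noisy_frame), noise_mag):
        magnitude = abs(value)
        new_magnitude = max(magnitude - noise, floor * magnitude)
        cleaned.append(cmath.rect(new_magnitude, cmath.phase(value)))
    return idft(cleaned)


# Example: a 32-sample sine frame with a flat, assumed noise estimate.
frame = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
enhanced = spectral_subtract(frame, [0.5] * 32)
```

In practice the signal would be split into overlapping windowed frames, each processed this way and recombined by overlap-add; the sketch leaves that out for brevity.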
Cite this article
Abd El-Moneim, S., Dessouky, M.I., Abd El-Samie, F.E. et al. Hybrid speech enhancement with empirical mode decomposition and spectral subtraction for efficient speaker identification. Int J Speech Technol 18, 555–564 (2015). https://doi.org/10.1007/s10772-015-9293-5
DOI: https://doi.org/10.1007/s10772-015-9293-5