Abstract
Co-channel speech is the combination of speech utterances from several speakers over a single communication channel. The traditional approach to co-channel speech processing is to attempt to extract the speech of the speaker of interest (the target speech) from the other (interfering) speech. Usable speech criteria have been proposed to extract minimally corrupted speech for speaker identification in co-channel speech. In this paper, we present a usable speech extraction method based on pitch information obtained from linear multi-scale decomposition by the dyadic wavelet transform and from nonlinear multi-scale decomposition by empirical mode decomposition. Detected usable speech segments are organized into speaker streams and applied to a speaker identification system. The proposed methods are evaluated and compared across various Target-to-Interferer Ratios (TIR) for the speaker identification system.
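To make the evaluation condition concrete, the following minimal sketch shows one way a co-channel mixture at a prescribed TIR could be constructed. It assumes the common definition of TIR as the ratio of target power to interferer power in decibels, computed over the whole utterance; the abstract does not state how the mixtures were actually generated. The function names `mix_at_tir` and `measure_tir_db` are hypothetical, and the two sinusoids only stand in for voiced speech.

```python
import numpy as np


def measure_tir_db(target, interferer):
    """Return the TIR in dB, assuming TIR = 10*log10(P_target / P_interferer)."""
    p_target = np.mean(np.asarray(target, dtype=float) ** 2)
    p_interferer = np.mean(np.asarray(interferer, dtype=float) ** 2)
    return 10.0 * np.log10(p_target / p_interferer)


def mix_at_tir(target, interferer, tir_db):
    """Scale the interferer so the mixture has the requested TIR, then add it.

    Both signals are truncated to the shorter length. Returns the co-channel
    mixture and the scaled interferer (illustrative helper, not the authors' code).
    """
    target = np.asarray(target, dtype=float)
    interferer = np.asarray(interferer, dtype=float)
    n = min(len(target), len(interferer))
    target, interferer = target[:n], interferer[:n]

    p_target = np.mean(target ** 2)
    p_interferer = np.mean(interferer ** 2)
    if p_target == 0.0 or p_interferer == 0.0:
        raise ValueError("both signals must have non-zero power")

    # Gain g such that P_target / (g^2 * P_interferer) = 10^(tir_db / 10).
    gain = np.sqrt(p_target / (p_interferer * 10.0 ** (tir_db / 10.0)))
    scaled = gain * interferer
    return target + scaled, scaled


# Synthetic stand-ins for two voiced talkers (assumed pitches of 120 Hz and 200 Hz).
fs = 8000
t = np.arange(fs) / fs
talker_a = np.sin(2 * np.pi * 120 * t)
talker_b = 0.3 * np.sin(2 * np.pi * 200 * t)
mixture, scaled_b = mix_at_tir(talker_a, talker_b, tir_db=10.0)
```

Sweeping `tir_db` over a range of values would yield the set of mixtures on which a usable speech detector and the downstream speaker identification system could be compared.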
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Ghezaiel, W., Ben Slimane, A., Ben Braiek, E. (2016). Linear Versus Nonlinear Multi-scale Decomposition for Co-channel Speaker Identification System. In: Esposito, A., et al. Recent Advances in Nonlinear Speech Processing. Smart Innovation, Systems and Technologies, vol 48. Springer, Cham. https://doi.org/10.1007/978-3-319-28109-4_17
DOI: https://doi.org/10.1007/978-3-319-28109-4_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28107-0
Online ISBN: 978-3-319-28109-4
eBook Packages: Engineering, Engineering (R0)