
Linear Versus Nonlinear Multi-scale Decomposition for Co-channel Speaker Identification System

Recent Advances in Nonlinear Speech Processing

Part of the book series: Smart Innovation, Systems and Technologies ((SIST,volume 48))


Abstract

Co-channel speech is a combination of speech utterances from more than one speaker over a single communication channel. The traditional approach to co-channel speech processing attempts to extract the speech of the speaker of interest (target speech) from the other (interfering) speech. Usable speech criteria have been proposed to extract minimally corrupted speech for speaker identification in co-channel speech. In this paper, we present a usable speech extraction method based on pitch information obtained from a linear multi-scale decomposition by the dyadic wavelet transform and a nonlinear multi-scale decomposition by empirical mode decomposition. The detected usable speech segments are organized into speaker streams and applied to a speaker identification system. The proposed methods are evaluated and compared across various target-to-interferer ratios (TIR) for the speaker identification system.
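As an illustration of the evaluation condition named in the abstract, the following is a minimal sketch of how a co-channel mixture at a chosen TIR might be constructed. It assumes the common energy-ratio definition, TIR = 10 log10(E_target / E_interferer) in dB, which the abstract does not state explicitly. The function names, the sinusoidal toy signals, and the 10 dB setting are hypothetical and are not taken from the chapter.

```python
import math

def energy(signal):
    """Sum of squared samples."""
    return sum(x * x for x in signal)

def tir_db(target, interferer):
    """TIR in dB under the assumed definition: 10*log10(E_target / E_interferer)."""
    return 10.0 * math.log10(energy(target) / energy(interferer))

def mix_at_tir(target, interferer, desired_tir_db):
    """Scale the interferer so that the mixture has the desired TIR, then add."""
    current = tir_db(target, interferer)
    # Multiplying the interferer amplitude by g lowers the TIR by 20*log10(g).
    gain = 10.0 ** ((current - desired_tir_db) / 20.0)
    scaled = [gain * x for x in interferer]
    mixture = [t + s for t, s in zip(target, scaled)]
    return mixture, scaled

# Toy stand-ins for two voiced speakers: sinusoids at different pitch
# frequencies (hypothetical values, not real speech).
fs = 8000   # sampling rate in Hz
n = 800     # number of samples (100 ms)
target = [math.sin(2 * math.pi * 120 * k / fs) for k in range(n)]
interferer = [0.3 * math.sin(2 * math.pi * 210 * k / fs) for k in range(n)]

mixture, scaled = mix_at_tir(target, interferer, desired_tir_db=10.0)
achieved = tir_db(target, scaled)
```

Sweeping `desired_tir_db` over a range of values would give one mixture per TIR condition, each of which could then be passed to a usable speech detector and a speaker identification system.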

Corresponding author

Correspondence to Wajdi Ghezaiel.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Ghezaiel, W., Ben Slimane, A., Ben Braiek, E. (2016). Linear Versus Nonlinear Multi-scale Decomposition for Co-channel Speaker Identification System. In: Esposito, A., et al. Recent Advances in Nonlinear Speech Processing. Smart Innovation, Systems and Technologies, vol 48. Springer, Cham. https://doi.org/10.1007/978-3-319-28109-4_17

  • DOI: https://doi.org/10.1007/978-3-319-28109-4_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28107-0

  • Online ISBN: 978-3-319-28109-4

  • eBook Packages: Engineering, Engineering (R0)
