Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation

  • Conference paper
Latent Variable Analysis and Signal Separation (LVA/ICA 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6365)

Abstract

This paper presents a new statistical model for speech signals. The model consists of a time-invariant dictionary comprising a set of power spectral densities of excitation signals and a set of all-pole filters, where the gain of each pair of excitation and filter elements is allowed to vary over time. We use this model to develop a combined blind separation and dereverberation method for speech. The method achieved reasonably good separation under highly reverberant conditions.
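To make the structure of the model concrete, the following is a minimal sketch, not the authors' implementation. The abstract gives no formulas, so the sketch assumes that the power spectral density of each frame is a gain-weighted sum, over all excitation/filter pairs, of the excitation spectral density multiplied by the power response of the all-pole filter. All function and variable names (`all_pole_power_response`, `composite_ar_psd`, `gains`) are hypothetical.

```python
import cmath
import math


def all_pole_power_response(ar_coeffs, num_bins):
    """Return 1 / |A(e^{jw})|^2 at num_bins frequencies in [0, pi).

    Assumes A(z) = 1 + a_1 z^-1 + ... + a_P z^-P,
    with ar_coeffs = [a_1, ..., a_P].
    """
    response = []
    for k in range(num_bins):
        omega = math.pi * k / num_bins
        a = 1.0 + sum(c * cmath.exp(-1j * omega * (p + 1))
                      for p, c in enumerate(ar_coeffs))
        response.append(1.0 / abs(a) ** 2)
    return response


def composite_ar_psd(excitation_psds, filters, gains, num_bins):
    """Assumed model PSD per frame: sum over pairs (i, j) of
    gains[t][i][j] * excitation_psds[i][k] * filter_response[j][k].

    excitation_psds: I lists of length num_bins (time-invariant dictionary).
    filters: J lists of AR coefficients (time-invariant dictionary).
    gains: gains[t][i][j] >= 0, the time-varying gain of pair (i, j).
    """
    filter_responses = [all_pole_power_response(a, num_bins) for a in filters]
    frames = []
    for frame_gains in gains:
        psd = [0.0] * num_bins
        for i, excitation in enumerate(excitation_psds):
            for j, response in enumerate(filter_responses):
                g = frame_gains[i][j]
                for k in range(num_bins):
                    psd[k] += g * excitation[k] * response[k]
        frames.append(psd)
    return frames


# Toy dictionary: one flat (noise-like) excitation and one first-order filter.
excitation_psds = [[1.0, 1.0, 1.0, 1.0]]
filters = [[-0.5]]                      # A(z) = 1 - 0.5 z^-1
gains = [[[1.0]], [[0.0]]]              # pair active in frame 0, silent in frame 1
frames = composite_ar_psd(excitation_psds, filters, gains, num_bins=4)
```

In this toy setup the dictionary stays fixed across frames and only the gains change, which is the property the abstract highlights. A real system would estimate the dictionary and the gains from data; that estimation step is not shown here.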



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kameoka, H., Yoshioka, T., Hamamura, M., Le Roux, J., Kashino, K. (2010). Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation. In: Vigneron, V., Zarzoso, V., Moreau, E., Gribonval, R., Vincent, E. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2010. Lecture Notes in Computer Science, vol 6365. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15995-4_31

  • DOI: https://doi.org/10.1007/978-3-642-15995-4_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15994-7

  • Online ISBN: 978-3-642-15995-4

  • eBook Packages: Computer Science (R0)
