Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation

Kameoka, Hirokazu; Yoshioka, Takuya; Hamamura, Mariko; Le Roux, Jonathan; Kashino, Kunio

doi:10.1007/978-3-642-15995-4_31


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6365)

Included in the following conference series:

International Conference on Latent Variable Analysis and Signal Separation

Abstract

This paper presents a new statistical model for speech signals. The model consists of a time-invariant dictionary that holds a set of power spectral densities of excitation signals and a set of all-pole filters, where the gain of each pair of excitation and filter elements is allowed to vary over time. We use this model to develop a combined blind source separation and dereverberation method for speech. Reasonably good separation was obtained under highly reverberant conditions.
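One way to read the model described in the abstract is sketched below: the power spectral density of a speech frame is a sum, over every excitation and filter pair in the dictionary, of a time-varying gain times the excitation spectrum times the power response of the all-pole filter. This is an illustrative reading of the abstract only, not the authors' formulation. The product form, the function names (`allpole_power_response`, `composite_ar_psd`), the array layouts, and the frequency grid on [0, π] are all assumptions.

```python
import cmath
import math


def allpole_power_response(ar_coeffs, n_freqs):
    """Return |1 / A(e^{jw})|^2 at n_freqs points on [0, pi].

    A(z) = 1 + sum_p a_p z^{-p} is assumed as the all-pole filter
    denominator; n_freqs must be at least 2.
    """
    response = []
    for k in range(n_freqs):
        omega = math.pi * k / (n_freqs - 1)
        a_of_omega = 1.0 + sum(
            a * cmath.exp(-1j * omega * p)
            for p, a in enumerate(ar_coeffs, start=1)
        )
        response.append(1.0 / abs(a_of_omega) ** 2)
    return response


def composite_ar_psd(excitation_psds, ar_filters, gains):
    """Combine a dictionary of excitations and filters into a spectrogram.

    excitation_psds[i][f]: time-invariant excitation spectrum i.
    ar_filters[j]:         coefficients a_1..a_P of all-pole filter j.
    gains[i][j][t]:        nonnegative gain of pair (i, j) at frame t.
    Returns psd[f][t] = sum_{i,j} gains[i][j][t]
                        * excitation_psds[i][f] * |1/A_j|^2 at frequency f.
    """
    n_freqs = len(excitation_psds[0])
    n_frames = len(gains[0][0])
    filter_power = [allpole_power_response(a, n_freqs) for a in ar_filters]
    psd = [[0.0] * n_frames for _ in range(n_freqs)]
    for i, excitation in enumerate(excitation_psds):
        for j, power in enumerate(filter_power):
            for t in range(n_frames):
                g = gains[i][j][t]
                for f in range(n_freqs):
                    psd[f][t] += g * excitation[f] * power[f]
    return psd


# Toy dictionary (hypothetical values): one flat excitation,
# one first-order all-pole filter, gains for two frames.
excitations = [[1.0, 1.0, 1.0]]
filters = [[-0.5]]
gains = [[[1.0, 2.0]]]
psd = composite_ar_psd(excitations, filters, gains)
```

In this toy case only the gains change between the two frames, so the second column of `psd` is the first column scaled by 2; the dictionary itself stays fixed over time, as the abstract describes.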

Author information

Authors and Affiliations

NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato Wakamiya, Atsugi, Kanagawa, 243-0198, Japan
Hirokazu Kameoka, Takuya Yoshioka, Mariko Hamamura, Jonathan Le Roux & Kunio Kashino

Editor information

Editors and Affiliations

Dept. of Electrical Engineering, Université d’Evry Val d’Essonne, 40 rue du Pelvoux, 91020, Courcouronnes, France
Vincent Vigneron
Laboratoire I3S, Les Algorithmes - Euclide-B, BP 121, Université de Nice-Sophia Antipolis, 2000 Route des Lucioles, 06903, Sophia Antipolis Cedex, France
Vicente Zarzoso
School of Engineering, Dept. of Telecommunications, ISITV, Université de Toulon, Avenue George Pompidou, BP 56, La Valette du Var, Cedex, 83162, France
Eric Moreau
INRIA France, Equipe-projet METISS, Centre de Recherche INRIA Rennes-Bretagne Atlantique, Campus de Beaulieu, 35042, Rennes cedex, France
Rémi Gribonval
INRIA France, Equipe-projet METISS, Centre de Recherche INRIA Rennes-Bretagne Atlantique, Campus de Beaulieu, 35042, Rennes Cedex, France
Emmanuel Vincent

About this paper

Cite this paper

Kameoka, H., Yoshioka, T., Hamamura, M., Le Roux, J., Kashino, K. (2010). Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation. In: Vigneron, V., Zarzoso, V., Moreau, E., Gribonval, R., Vincent, E. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2010. Lecture Notes in Computer Science, vol 6365. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15995-4_31

DOI: https://doi.org/10.1007/978-3-642-15995-4_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15994-7
Online ISBN: 978-3-642-15995-4
eBook Packages: Computer Science, Computer Science (R0)
