Abstract
This paper introduces the audio part of the 2010 community-based Signal Separation Evaluation Campaign (SiSEC2010). In addition to the SiSEC2008 datasets, seven speech and music datasets were contributed, including datasets recorded in noisy or dynamic environments. The source separation problems were split into five tasks, and the results for each task were evaluated using different objective performance criteria. We provide an overview of the audio datasets, tasks and criteria. We also report the results achieved with the submitted systems, and discuss organization strategies for future campaigns.
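For readers unfamiliar with such objective performance criteria, one widely used measure in audio source separation evaluation is the signal-to-distortion ratio (SDR). The formula below is a minimal sketch of its standard definition, assuming the estimated source has been decomposed into a target component and interference, noise, and artifact error terms; it is given for illustration only and is not reproduced from the paper itself.

\[
\mathrm{SDR} = 10 \log_{10} \frac{\lVert s_{\mathrm{target}} \rVert^{2}}{\lVert e_{\mathrm{interf}} + e_{\mathrm{noise}} + e_{\mathrm{artif}} \rVert^{2}}
\]

A higher SDR indicates that the estimated signal is closer to the reference source; companion criteria such as the signal-to-interference ratio (SIR) and signal-to-artifacts ratio (SAR) isolate the individual error terms in the same way.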
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Araki, S. et al. (2010). The 2010 Signal Separation Evaluation Campaign (SiSEC2010): Audio Source Separation. In: Vigneron, V., Zarzoso, V., Moreau, E., Gribonval, R., Vincent, E. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2010. Lecture Notes in Computer Science, vol 6365. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15995-4_15
DOI: https://doi.org/10.1007/978-3-642-15995-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15994-7
Online ISBN: 978-3-642-15995-4
eBook Packages: Computer Science (R0)