Abstract
This paper introduces the audio part of the 2010 community-based Signal Separation Evaluation Campaign (SiSEC2010). In addition to the SiSEC2008 datasets, seven speech and music datasets were contributed, including datasets recorded in noisy or dynamic environments. The source separation problems were split into five tasks, and the results for each task were evaluated using different objective performance criteria. We provide an overview of the audio datasets, tasks and criteria. We also report the results achieved with the submitted systems, and discuss organization strategies for future campaigns.
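For readers unfamiliar with such objective performance criteria, one widely used measure in audio source separation evaluation is the signal-to-distortion ratio (SDR). The formula below is a minimal sketch of its standard definition, assuming the estimated source has been decomposed into a target component and interference, noise, and artifact error terms; it is given for illustration only and is not reproduced from the paper itself.

\[
\mathrm{SDR} = 10 \log_{10} \frac{\lVert s_{\mathrm{target}} \rVert^{2}}{\lVert e_{\mathrm{interf}} + e_{\mathrm{noise}} + e_{\mathrm{artif}} \rVert^{2}}
\]

A higher SDR indicates that the estimated signal is closer to the reference source; companion criteria such as the signal-to-interference ratio (SIR) and signal-to-artifacts ratio (SAR) isolate the individual error terms in the same way.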
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Araki, S. et al. (2010). The 2010 Signal Separation Evaluation Campaign (SiSEC2010): Audio Source Separation. In: Vigneron, V., Zarzoso, V., Moreau, E., Gribonval, R., Vincent, E. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2010. Lecture Notes in Computer Science, vol 6365. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15995-4_15
DOI: https://doi.org/10.1007/978-3-642-15995-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15994-7
Online ISBN: 978-3-642-15995-4
eBook Packages: Computer Science (R0)