On Training Speech Separation Models With Various Numbers of Speakers


Abstract:

Many monaural speech separation models assume that the exact number of speakers is known in advance, an assumption that does not hold in many real-world scenarios. To deal with an unknown number of speakers, previous approaches either iteratively separate one speech signal at a time or adopt the more relaxed assumption that the maximum number of speakers is known a priori and set the number of outputs accordingly. In the latter case, when the mixture contains fewer speakers than the model has outputs, the extra outputs that are not mapped onto signals in the input mixture are trained to produce predefined target signals such as silence or the input mixture. In this letter, for separation models with a fixed number of output channels, we propose to ignore the extra outputs in training instead of evaluating the cost against a predefined target. We also introduce a method to select valid output signals. Experimental results showed that assigning any type of predefined target degraded separation performance compared with ignoring the extra outputs.
Published in: IEEE Signal Processing Letters (Volume: 30)
Page(s): 1202 - 1206
Date of Publication: 31 August 2023
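
The training idea described in the abstract, ignoring the extra outputs rather than scoring them against a predefined target, can be illustrated with a small permutation-invariant loss. The sketch below is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the cost function or the model, so negative SI-SNR is assumed here as the per-speaker cost, and the function names `si_snr` and `pit_loss_ignoring_extras` are hypothetical. The method for selecting valid output signals is not described in the abstract and is not covered.

```python
import itertools

import numpy as np


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals (assumed cost)."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the target component.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return 10.0 * np.log10(
        (np.dot(target, target) + eps) / (np.dot(noise, noise) + eps)
    )


def pit_loss_ignoring_extras(outputs, references):
    """Permutation-invariant loss for K references and N >= K outputs.

    outputs:    array of shape (N, T), one row per model output channel.
    references: array of shape (K, T), one row per speaker in the mixture.

    Only the K outputs assigned to references contribute to the loss;
    the remaining N - K outputs are ignored (no silence or mixture target).
    Returns (loss, assignment), where assignment[k] is the output index
    matched to reference k.
    """
    n_out, n_ref = outputs.shape[0], references.shape[0]
    if n_ref > n_out:
        raise ValueError("more reference speakers than output channels")

    best_loss, best_assignment = np.inf, None
    # Enumerate every ordered choice of K distinct outputs out of N.
    for assignment in itertools.permutations(range(n_out), n_ref):
        loss = -np.mean(
            [si_snr(outputs[o], references[k]) for k, o in enumerate(assignment)]
        )
        if loss < best_loss:
            best_loss, best_assignment = loss, assignment
    return best_loss, best_assignment
```

With a three-output model and a two-speaker mixture, the returned loss depends only on the two matched channels, so whatever the third channel emits does not affect the cost. The exhaustive search over assignments is practical only for small numbers of outputs; it is used here for clarity.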
