Self-Supervised Speaker Verification with Adaptive Threshold and Hierarchical Training


Abstract:

In self-supervised speaker verification, the quality of the generated pseudo labels is a bottleneck for performance. This work introduces a dynamic threshold within the iterative DIstillation with NO labels (DINO) framework. We employ a Gaussian Mixture Model (GMM) to model the loss distribution of the training data. The GMM has two components: one represents samples with reliable labels, and the other represents samples with unreliable ones. These components help us determine a threshold for retaining samples with reliable labels. Furthermore, to take advantage of the different sensitivity of network layers to label noise, we introduce hierarchical training to reduce the negative impact of unreliable labels. Compared to the baseline with a fixed threshold, our two strategies result in an 8.9% relative improvement on the Vox-O trial of the VoxCeleb1 evaluation dataset.
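
The idea of fitting a two-component GMM to per-sample losses and deriving a threshold from it can be illustrated with a minimal sketch. The abstract does not state the exact thresholding rule, so everything below is an assumption made for illustration: the lower-mean component is treated as the "reliable" one, the threshold is taken as the loss value between the two means where both components are equally likely, and the function names (`fit_two_component_gmm`, `estimate_threshold`) and the synthetic losses are hypothetical. It is not the authors' implementation.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given loss value x."""
    dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    if total == 0.0:  # both densities underflowed; treat the sample as undecided
        return [0.5, 0.5]
    return [d / total for d in dens]


def fit_two_component_gmm(losses, n_iter=100):
    """Fit a 1-D, two-component GMM with plain EM (illustrative, stdlib only)."""
    lo, hi = min(losses), max(losses)
    means = [lo, hi]  # start the components at the two extremes of the losses
    var0 = max((hi - lo) ** 2 / 16.0, 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each loss value.
        resp = [posteriors(x, weights, means, variances) for x in losses]
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(losses)
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            sq = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses))
            variances[k] = max(sq / nk, 1e-6)
    return weights, means, variances


def estimate_threshold(losses, n_bisect=60):
    """Loss value between the two means where the posteriors cross 0.5."""
    weights, means, variances = fit_two_component_gmm(losses)
    # Assumption: the lower-mean component models samples with reliable labels.
    reliable = 0 if means[0] <= means[1] else 1
    lo, hi = min(means), max(means)
    # The reliable posterior decreases monotonically between the two means,
    # so bisection finds the crossing point if one exists in that interval.
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if posteriors(mid, weights, means, variances)[reliable] > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic losses (made up for this sketch): 800 "clean" and 200 "noisy" samples.
random.seed(0)
losses = [random.gauss(1.0, 0.3) for _ in range(800)]
losses += [random.gauss(4.0, 0.5) for _ in range(200)]

threshold = estimate_threshold(losses)
kept = [x for x in losses if x < threshold]
print(f"threshold={threshold:.3f}, kept {len(kept)} of {len(losses)} samples")
```

Because the threshold is re-derived from the fitted mixture each time, it moves with the loss distribution instead of staying fixed, which is the contrast with a fixed-threshold baseline that the abstract draws.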
Date of Conference: 14-19 April 2024
Date Added to IEEE Xplore: 18 March 2024
Conference Location: Seoul, Korea, Republic of
