Leveraging Non-Causal Knowledge via Cross-Network Knowledge Distillation for Real-Time Speech Enhancement


Abstract:

To improve real-time speech enhancement (SE) while maintaining efficiency, researchers have adopted knowledge distillation (KD). However, when a teacher model of the same network type as the real-time SE student model is used, the teacher's performance can be unsatisfactory, which limits the effectiveness of KD. To overcome this limitation, we propose cross-network non-causal knowledge distillation (CNNC-Distill). CNNC-Distill enables knowledge transfer between networks of different types, allowing the use of a teacher model whose network type differs from that of the real-time SE student model. To maximize the KD effect, a non-real-time SE model unconstrained by causality conditions is adopted as the teacher model. CNNC-Distill transfers the non-causal knowledge of the non-real-time SE teacher model to the real-time SE student model through feature and output distillation. We also introduce RT-SENet, a time-domain network used as the real-time SE student model. Results on the Valentini dataset show the efficiency of RT-SENet and the significant performance improvement achieved by CNNC-Distill.
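The snippet below is a minimal sketch of the kind of combined feature-and-output distillation objective the abstract describes, written in PyTorch. The specific loss functions, weights, and any projection layers needed to align features across the two different network types are not given in this page and are assumptions for illustration only.

```python
# Hedged sketch of a cross-network KD objective: an SE loss against the clean
# target plus output- and feature-level distillation from a non-causal teacher.
# Loss choices (L1/MSE) and weights alpha/beta are assumptions, not the paper's.
import torch
import torch.nn.functional as F

def cnnc_distill_loss(student_out, teacher_out, student_feats, teacher_feats,
                      clean, alpha=1.0, beta=1.0):
    """student_out / teacher_out: enhanced waveforms, shape (B, T).
    student_feats / teacher_feats: lists of intermediate features assumed to be
    already projected to matching shapes (projection layers are hypothetical).
    clean: clean reference waveform, shape (B, T)."""
    # Ordinary SE reconstruction loss against the clean target.
    se_loss = F.l1_loss(student_out, clean)

    # Output distillation: pull the causal student's output toward the
    # non-causal teacher's enhanced signal.
    out_kd = F.l1_loss(student_out, teacher_out.detach())

    # Feature distillation: match intermediate representations pairwise.
    feat_kd = sum(F.mse_loss(s, t.detach()) for s, t in zip(student_feats, teacher_feats))

    return se_loss + alpha * out_kd + beta * feat_kd
```

Detaching the teacher tensors keeps gradients from flowing into the (frozen) non-causal teacher, so only the real-time student is updated.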
Published in: IEEE Signal Processing Letters (Volume: 31)
Page(s): 1129 - 1133
Date of Publication: 16 April 2024

