ISCA Archive Interspeech 2022

Impairment Representation Learning for Speech Quality Assessment

Lianwu Chen, Xinlei Ren, Xu Zhang, Xiguang Zheng, Chen Zhang, Liang Guo, Bing Yu

Non-intrusive speech quality assessment is a crucial task in speech processing. In recent years, methods based on deep neural networks have achieved state-of-the-art performance for non-intrusive speech quality assessment. However, the scarcity of annotated data is usually the main challenge in training robust speech quality assessment networks. In this paper, we propose an impairment representation learning approach to pre-train the network on a large amount of simulated data without mean opinion score (MOS) annotation. We then fine-tune the pre-trained model for the MOS prediction task on annotated data. The experimental results show that the proposed pre-training method significantly improves speech quality assessment performance, especially when annotated training data is limited. In addition, the proposed method significantly outperforms the baseline system of the ConferencingSpeech 2022 Challenge.


doi: 10.21437/Interspeech.2022-11295

Cite as: Chen, L., Ren, X., Zhang, X., Zheng, X., Zhang, C., Guo, L., Yu, B. (2022) Impairment Representation Learning for Speech Quality Assessment. Proc. Interspeech 2022, 3323-3327, doi: 10.21437/Interspeech.2022-11295

@inproceedings{chen22t_interspeech,
  author={Lianwu Chen and Xinlei Ren and Xu Zhang and Xiguang Zheng and Chen Zhang and Liang Guo and Bing Yu},
  title={{Impairment Representation Learning for Speech Quality Assessment}},
  year=2022,
  booktitle={Proc. Interspeech 2022},
  pages={3323--3327},
  doi={10.21437/Interspeech.2022-11295}
}