Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures

Lin, Ju; Xie, Yanlu; Zhang, Jinsong

doi:10.21437/Interspeech.2016-1162

Automatic evaluation of tone production plays an important role in Computer-Assisted Pronunciation Training (CAPT) systems for tonal languages. In this paper, we propose an automatic evaluation method for non-native Mandarin tones. The method uses multi-level confidence measures generated from a Deep Neural Network (DNN): Log Posterior Ratios (LPR), Average Frame-level Log Posteriors (AFLP), and Segment-level Log Posteriors (SLP). The LPR is calculated between the correct tone model and competing tone models. The AFLP and LPR are obtained from frame-level scores, whereas the SLP is derived directly from segment-level scores. The multi-level confidence measures are modeled with a support vector machine (SVM) classifier. For comparison, three experiments were conducted with different feature sets: AFLP+LPR, SLP only, and AFLP+LPR+SLP. The system using all three multi-level confidence measures performed best, achieving an FRR of 5.63% and a DA of 82.45%, which demonstrates the effectiveness of the proposed method.
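As an illustration of how such confidence measures might be assembled, the following is a minimal sketch rather than the authors' implementation. The abstract does not give the exact formulas, so the definitions here are assumptions: AFLP is taken as the mean log posterior of the target tone over the frames of a segment, LPR as the difference between the target tone's AFLP and each competing tone's AFLP, and SLP as the log of a segment-level posterior. The function names, the four-class tone inventory, and the toy posterior values are all hypothetical.

```python
import math

def aflp(frame_posteriors, tone):
    """Assumed AFLP: mean log posterior of `tone` across the segment's frames."""
    logs = [math.log(frame[tone]) for frame in frame_posteriors]
    return sum(logs) / len(logs)

def lpr(frame_posteriors, target, competitor):
    """Assumed LPR: target-tone AFLP minus competing-tone AFLP."""
    return aflp(frame_posteriors, target) - aflp(frame_posteriors, competitor)

def confidence_features(frame_posteriors, segment_posteriors, target):
    """Build one feature vector: [AFLP, LPR per competing tone..., SLP]."""
    n_tones = len(segment_posteriors)
    feats = [aflp(frame_posteriors, target)]
    feats += [lpr(frame_posteriors, target, k)
              for k in range(n_tones) if k != target]
    # Assumed SLP: log of the segment-level posterior of the target tone.
    feats.append(math.log(segment_posteriors[target]))
    return feats

# Toy posteriors (made up) for a three-frame segment over four tone classes.
frames = [
    [0.70, 0.10, 0.10, 0.10],
    [0.60, 0.20, 0.10, 0.10],
    [0.80, 0.10, 0.05, 0.05],
]
segment = [0.75, 0.15, 0.05, 0.05]

features = confidence_features(frames, segment, target=0)
print(features)
```

In a setup like the one described, feature vectors of this kind, one per tone segment, would then be passed to an SVM classifier that decides whether the tone was produced correctly; that step is omitted here.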

Cite as: Lin, J., Xie, Y., Zhang, J. (2016) Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures. Proc. Interspeech 2016, 2666-2670, doi: 10.21437/Interspeech.2016-1162

@inproceedings{lin16_interspeech,
  author={Ju Lin and Yanlu Xie and Jinsong Zhang},
  title={{Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={2666--2670},
  doi={10.21437/Interspeech.2016-1162}
}