Investigating Objective Intelligibility in Real-Time EMG-to-Speech Conversion

Diener, Lorenz; Schultz, Tanja

doi:10.21437/Interspeech.2018-2080

This paper analyzes how various system parameters influence the output quality of our neural-network-based real-time EMG-to-Speech conversion system. The system converts facial surface electromyographic (EMG) signals directly into audible speech in real time, which enables a closed-loop setup in which users receive direct audio feedback. Such a setup opens new avenues for research and applications through co-adaptation approaches. We evaluate the influence of several parameters on the output quality: time context, EMG-audio delay, network size, training data size, and Mel spectrogram size. Output quality is assessed with the objective intelligibility measure STOI (short-time objective intelligibility).
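As an illustration of two of the parameters named in the abstract, time context and EMG-audio delay, the following is a minimal sketch and not the authors' implementation. It assumes that "time context" means stacking each EMG feature frame with a number of preceding frames, and that "delay" means pairing EMG frame t with audio frame t + delay. The function names `stack_context` and `align_with_delay` are hypothetical.

```python
from typing import List, Tuple

Frame = List[float]


def stack_context(frames: List[Frame], context: int) -> List[Frame]:
    """Concatenate each frame with its `context` preceding frames.

    Assumption: past-only context; the first frame is repeated as
    padding where no earlier frame exists.
    """
    if context < 0:
        raise ValueError("context must be non-negative")
    stacked: List[Frame] = []
    for t in range(len(frames)):
        window: Frame = []
        for offset in range(-context, 1):
            idx = max(t + offset, 0)  # clamp to the first frame
            window.extend(frames[idx])
        stacked.append(window)
    return stacked


def align_with_delay(
    emg: List[Frame], audio: List[Frame], delay: int
) -> List[Tuple[Frame, Frame]]:
    """Pair EMG frame t with audio frame t + delay.

    Assumption: the EMG signal leads the audio by `delay` frames;
    frames without a partner are dropped.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    usable = max(min(len(emg), len(audio) - delay), 0)
    return [(emg[t], audio[t + delay]) for t in range(usable)]


if __name__ == "__main__":
    emg_frames = [[1.0], [2.0], [3.0]]
    audio_frames = [[10.0], [20.0], [30.0]]
    print(stack_context(emg_frames, 1))
    print(align_with_delay(emg_frames, audio_frames, 1))
```

Under these assumptions, a larger context gives the network more past signal per input vector, while the delay controls which audio frame each EMG input is trained to predict.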

Cite as: Diener, L., Schultz, T. (2018) Investigating Objective Intelligibility in Real-Time EMG-to-Speech Conversion. Proc. Interspeech 2018, 3162-3166, doi: 10.21437/Interspeech.2018-2080

@inproceedings{diener18_interspeech,
  author={Lorenz Diener and Tanja Schultz},
  title={{Investigating Objective Intelligibility in Real-Time EMG-to-Speech Conversion}},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3162--3166},
  doi={10.21437/Interspeech.2018-2080}
}