This paper presents a methodology for articulatory synthesis of running speech in American English, driven by mid-sagittal vocal-tract data from real-time magnetic resonance imaging (rtMRI). At the core of the methodology is a time-domain simulation of sound propagation in the vocal tract developed previously by Maeda. The first step of the methodology is the automatic derivation of air-tissue boundaries from the rtMRI data. These articulatory outlines are then systematically modified to render consonantal vocal-tract constrictions with additional precision. Other elements of the methodology include a previously reported set of empirical rules for setting the time-varying characteristics of the glottis and the velopharyngeal port, and a revised sagittal-to-area conversion. Results are promising for the development of a full-fledged text-to-speech synthesis system that leverages directly observed vocal-tract dynamics.
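The sagittal-to-area step mentioned above is conventionally modeled in the articulatory-synthesis literature as a power law mapping mid-sagittal width to cross-sectional area. The sketch below illustrates that general idea only; the coefficients `alpha` and `beta` are hypothetical placeholders (in practice they are fit per vocal-tract region), and this is not the paper's revised conversion.

```python
# Illustrative sagittal-to-area conversion using the classic power law
# A = alpha * d**beta. The coefficients below are hypothetical; real
# models fit alpha and beta separately for regions such as the lips,
# oral cavity, and pharynx.

def sagittal_to_area(d_cm, alpha=1.8, beta=1.5):
    """Map a mid-sagittal width d (cm) to a cross-sectional area (cm^2)."""
    return alpha * d_cm ** beta

# Widths sampled along the tract yield the area function that would
# drive a time-domain acoustic simulation.
widths = [0.2, 0.8, 1.5]
area_function = [sagittal_to_area(d) for d in widths]
```

Applied along a grid of cross-sections from glottis to lips, such a mapping turns the 2-D mid-sagittal outlines into the 1-D area function the acoustic simulation requires.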
Cite as: Toutios, A., Sorensen, T., Somandepalli, K., Alexander, R., Narayanan, S.S. (2016) Articulatory Synthesis Based on Real-Time Magnetic Resonance Imaging Data. Proc. Interspeech 2016, 1492-1496, doi: 10.21437/Interspeech.2016-596
@inproceedings{toutios16_interspeech,
  author={Asterios Toutios and Tanner Sorensen and Krishna Somandepalli and Rachel Alexander and Shrikanth S. Narayanan},
  title={{Articulatory Synthesis Based on Real-Time Magnetic Resonance Imaging Data}},
  year={2016},
  booktitle={Proc. Interspeech 2016},
  pages={1492--1496},
  doi={10.21437/Interspeech.2016-596}
}