Improving Naturalness and Controllability of Sequence-to-Sequence Speech Synthesis by Learning Local Prosody Representations | IEEE Conference Publication | IEEE Xplore