Automatic Prosody Labeling Using Multiple Models for Japanese

Ryuki TACHIBANA
Tohru NAGANO
Gakuto KURATA
Masafumi NISHIMURA
Noboru BABAGUCHI

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E90-D, No.11, pp.1805-1812
Publication Date: 2007/11/01
Online ISSN: 1745-1361
DOI: 10.1093/ietisy/e90-d.11.1805
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keyword: 
prosody recognition, mora accent, prosodic phrase boundary, text-to-speech synthesis

Summary: 
Automatic prosody labeling is the task of automatically annotating speech corpora with prosodic labels such as syllable stresses or break indices. Prosody-labeled corpora are important for speech synthesis and automatic speech understanding. However, the subtlety of the relevant physical features makes accurate labeling difficult. Since errors in the prosodic labels can lead to incorrect prosody estimation and unnatural synthetic speech, the accuracy of the labels is a key factor for text-to-speech (TTS) systems. In particular, mora accent labels, which relate to pitch, are very important for Japanese, because Japanese is a pitch-accent language and Japanese listeners are particularly sensitive to pitch accents. However, determining the mora accents of Japanese is in some respects more difficult than detecting English stress. This is because the context of a word changes the mora accents within that word, whereas English stress normally falls on the lexical primary stress of a word. In this paper, we propose a method that can accurately determine the prosodic labels of Japanese using both acoustic and linguistic models. A speaker-independent linguistic model provides mora-level knowledge about the possible correct accentuations in Japanese and helps reduce the size of the speaker-dependent speech corpus required for training the other stochastic models. Our experiments show the effectiveness of combining the models.
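
The summary does not state how the acoustic and linguistic models are combined, so the following is only an illustrative sketch of one common possibility: a weighted sum of log scores over candidate high/low (H/L) mora accent patterns. The function name best_accent_labels, the weight parameter, and all numbers are hypothetical toy values chosen for illustration; they are not taken from the paper.

```python
import math

def best_accent_labels(acoustic_logp, linguistic_logp, weight=1.0):
    """Return the (pattern, score) pair with the highest combined log score.

    acoustic_logp: one dict per mora, mapping "H"/"L" to a log-probability
        (hypothetical output of a speaker-dependent acoustic model).
    linguistic_logp: dict mapping a whole pattern such as "LHL" to a
        log-probability (hypothetical speaker-independent linguistic model
        that lists only the accentuations it considers possible).
    weight: assumed scaling factor applied to the linguistic score.
    """
    best_seq, best_score = None, -math.inf
    for seq, ling in linguistic_logp.items():
        if len(seq) != len(acoustic_logp):
            continue  # skip patterns that do not match the mora count
        acoustic = sum(mora[label] for mora, label in zip(acoustic_logp, seq))
        score = acoustic + weight * ling
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq, best_score

# Toy scores for a three-mora word (invented numbers, not real data).
toy_acoustic = [
    {"L": math.log(0.60), "H": math.log(0.40)},
    {"L": math.log(0.30), "H": math.log(0.70)},
    {"L": math.log(0.45), "H": math.log(0.55)},
]
toy_linguistic = {
    "HLL": math.log(0.5),
    "LHL": math.log(0.3),
    "LHH": math.log(0.2),
}

combined_pattern, _ = best_accent_labels(toy_acoustic, toy_linguistic)
acoustic_only_pattern, _ = best_accent_labels(toy_acoustic, toy_linguistic, weight=0.0)
```

With these toy numbers the acoustic scores alone favor a different pattern than the combined score does, which illustrates in miniature how linguistic knowledge could correct an ambiguous acoustic decision.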

