ISCA Archive SSW 2016

Automatic, model-based detection of pause-less phrase boundaries from fundamental frequency and duration features

Mahsa Sadat Elyasi Langarani, Jan van Santen

Prosodic phrase boundaries (PBs) are a key aspect of spoken communication. Automatic PB detection commonly uses local acoustic features, textual features, or a combination of both. Most approaches, regardless of the features used, succeed in detecting major PBs (break score “4” in ToBI annotation, typically involving a pause), while detection of intermediate PBs (break score “3” in ToBI annotation) remains challenging. In this study we investigate the detection of intermediate, “pause-less” PBs with prosodic models, using a new corpus characterized by strong prosodic dynamics and an existing (CMU) corpus. We show that duration and fundamental frequency modeling can improve detection of these PBs, as measured by the F1 score, compared to Festival, which uses only textual features to detect PBs. We believe that this study contributes to our understanding of the prosody of phrase breaks.


doi: 10.21437/SSW.2016-1

Cite as: Elyasi Langarani, M.S., van Santen, J. (2016) Automatic, model-based detection of pause-less phrase boundaries from fundamental frequency and duration features. Proc. 9th ISCA Workshop on Speech Synthesis Workshop (SSW 9), 1-6, doi: 10.21437/SSW.2016-1

@inproceedings{elyasilangarani16_ssw,
  author={Mahsa Sadat {Elyasi Langarani} and Jan {van Santen}},
  title={{Automatic, model-based detection of pause-less phrase boundaries from fundamental frequency and duration features}},
  year=2016,
  booktitle={Proc. 9th ISCA Workshop on Speech Synthesis Workshop (SSW 9)},
  pages={1--6},
  doi={10.21437/SSW.2016-1}
}