Alignment of Lyrics With Accompanied Singing Audio Based on Acoustic-Phonetic Vowel Likelihood Modeling | IEEE Journals & Magazine | IEEE Xplore

Abstract:

This study addresses the task of aligning lyrics with accompanied singing recordings. With a vowel-only representation of lyric syllables, our approach evaluates likelihood scores of vowel types with glottal pulse shapes and formant frequencies extracted from a small set of singing examples. The proposed vowel likelihood model is used in conjunction with a prior model of frame-wise syllable sequence in determining an optimal evolution of syllabic position. In lyrics alignment experiments, we optimized numerical parameters on two independent development sets and then tested the optimized system on two other datasets. New objective performance measures are introduced in the evaluation to provide further insight into the quality of alignment. Use of glottal pulse shapes and formant frequencies is shown by a controlled experiment to account for a 0.07 difference in average normalized alignment error. Another controlled experiment demonstrates that, with a difference of 0.03, F0-invariant glottal pulse shape gives a lower average normalized alignment error than does F0-invariant spectrum envelope, the latter being assumed by MFCC-based timbre models.
Page(s): 1998 - 2008
Date of Publication: 27 July 2016
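
The abstract describes combining frame-wise vowel likelihoods with a prior over the frame-wise syllable sequence to find an optimal evolution of syllabic position, but it does not state the form of the prior or the decoding procedure. As a rough illustration only, the sketch below assumes a simple stay-or-advance prior and Viterbi-style dynamic programming. The function name `align_syllables`, the parameters `stay_logp` and `move_logp`, and their default values are hypothetical and are not taken from the paper.

```python
import math


def align_syllables(loglik, stay_logp=math.log(0.9), move_logp=math.log(0.1)):
    """Assign one syllable index to each audio frame (illustrative sketch).

    loglik[t][s] is the log-likelihood of frame t under the vowel of
    syllable s. The path starts at syllable 0, ends at the last syllable,
    and at each frame either stays or advances by one syllable.
    The stay/advance log-probabilities are assumed, not from the paper.
    """
    n_frames = len(loglik)
    n_syll = len(loglik[0])
    if n_frames < n_syll:
        raise ValueError("need at least one frame per syllable")

    neg_inf = float("-inf")
    # score[t][s]: best log-score of any path ending at syllable s in frame t
    score = [[neg_inf] * n_syll for _ in range(n_frames)]
    # back[t][s]: syllable occupied in frame t-1 on that best path
    back = [[0] * n_syll for _ in range(n_frames)]
    score[0][0] = loglik[0][0]

    for t in range(1, n_frames):
        for s in range(n_syll):
            stay = score[t - 1][s] + stay_logp
            move = score[t - 1][s - 1] + move_logp if s > 0 else neg_inf
            if move > stay:
                score[t][s] = move + loglik[t][s]
                back[t][s] = s - 1
            else:
                score[t][s] = stay + loglik[t][s]
                back[t][s] = s

    # Trace back from the last syllable in the last frame.
    path = [n_syll - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path
```

Called with a frames-by-syllables table of log-likelihoods, the function returns a non-decreasing list of syllable indices, one per frame. Syllable onset times would then follow from the frames at which the index changes.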
