Abstract
This paper presents the results obtained in the EVALITA 2011 Forced Alignment on Spontaneous Speech task, together with some aspects explored during lexicon generation. A classical system was used for the alignment, and several tests were performed to determine the impact of frame shift size and of speaker adaptation on alignment accuracy. Good segmentation results were obtained, with the proposed system outperforming the systems of the other teams. Furthermore, phonetic change rules were derived from the training set and applied in the alignment process, significantly improving the performance of the system.
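As an illustration of how alignment accuracy and frame shift interact, the following minimal sketch scores a hypothesised segmentation against a reference one and shows how a coarser frame shift limits boundary resolution. It is not the system or the scoring procedure used in the paper: the function names, the one-to-one pairing of boundaries, and the 20 ms default tolerance are all assumptions made for the example.

```python
from typing import List, Sequence


def boundary_accuracy(reference: Sequence[float],
                      hypothesis: Sequence[float],
                      tolerance: float = 0.020) -> float:
    """Fraction of hypothesis boundaries lying within `tolerance` seconds
    of the reference boundary at the same position.

    Assumes both segmentations contain the same phone sequence, so the
    boundaries can be paired one-to-one (hypothetical simplification).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must have the same number of boundaries")
    if not reference:
        return 0.0
    hits = sum(1 for r, h in zip(reference, hypothesis) if abs(r - h) <= tolerance)
    return hits / len(reference)


def quantize(times: Sequence[float], frame_shift: float) -> List[float]:
    """Snap boundary times (seconds) to the nearest multiple of the frame
    shift, modelling the time resolution of a frame-based aligner."""
    return [round(t / frame_shift) * frame_shift for t in times]


if __name__ == "__main__":
    # Made-up boundary times in seconds, for illustration only.
    ref = [0.10, 0.25, 0.40, 0.62]
    hyp = [0.11, 0.29, 0.40, 0.61]
    print(boundary_accuracy(ref, hyp))   # share of boundaries within 20 ms
    print(quantize([0.137], 0.010))      # boundary as seen with a 10 ms shift
    print(quantize([0.137], 0.005))      # boundary as seen with a 5 ms shift
```

With a smaller frame shift the aligner can place a boundary closer to its true position, which is one reason frame shift size may affect the measured accuracy.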
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ludusan, B. (2013). UNINA System for the EVALITA 2011 Forced Alignment Task. In: Magnini, B., Cutugno, F., Falcone, M., Pianta, E. (eds) Evaluation of Natural Language and Speech Tools for Italian. EVALITA 2012. Lecture Notes in Computer Science, vol 7689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35828-9_36
DOI: https://doi.org/10.1007/978-3-642-35828-9_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35827-2
Online ISBN: 978-3-642-35828-9
eBook Packages: Computer Science (R0)