Mining Intonation Corpora Using Knowledge Driven Sequential Clustering

Escudero-Mancebo, David; Cardeñoso-Payo, Valentín

doi:10.1007/11874850_40


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4140)


Abstract

This work presents a mining methodology designed to cope with the data scarcity problems that are usual in intonation corpora and that arise from the high variability of prosodic information. The methodology adapts a basic agglomerative clustering technique and is guided by a set of domain constraints. The peculiarities of the text-to-speech intonation modelling problem are taken into account to fix the initial configuration of the cluster and the criteria for merging classes and for stopping their splitting. The scarcity problem makes it necessary to apply a sequential selection mechanism of prosodic features in order to obtain the initial set of classes in the cluster. A search strategy for selecting the best class among a set of alternatives is proposed, which provides useful prediction models for accurate synthetic intonation. Visualization of the final classes by means of a modified decision tree provides graphical cues about contrastable prosodic information in the intonation corpus.
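The idea of agglomerative clustering guided by domain constraints can be illustrated with a minimal sketch. It is not the authors' algorithm: the centroid-based Euclidean distance, the `may_merge` predicate, the `max_distance` stopping threshold, the class labels and the toy three-point contours are all assumptions made for illustration only.

```python
from itertools import combinations
from statistics import fmean


def centroid(contours):
    """Point-wise mean of equal-length contours (lists of floats)."""
    return [fmean(values) for values in zip(*contours)]


def distance(a, b):
    """Euclidean distance between two centroids."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def constrained_agglomerative(classes, may_merge, max_distance):
    """Repeatedly merge the closest pair of classes that the constraint allows.

    classes: dict mapping a frozenset of labels to a list of contours.
    may_merge: hypothetical domain constraint, (labels_a, labels_b) -> bool.
    max_distance: hypothetical stopping threshold on centroid distance.
    """
    classes = dict(classes)  # work on a copy
    while len(classes) > 1:
        best = None
        for a, b in combinations(classes, 2):
            if not may_merge(a, b):
                continue  # the domain constraint forbids this merge
            d = distance(centroid(classes[a]), centroid(classes[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        if best is None or best[0] > max_distance:
            break  # no allowed pair is close enough
        _, a, b = best
        classes[a | b] = classes.pop(a) + classes.pop(b)
    return classes


def same_sentence_type(a, b):
    """Toy constraint: merge only classes that share a sentence type."""
    return len({label.split("/")[0] for label in a | b}) == 1


# Toy initial classes: "sentence type/position" -> three-point contours.
toy_classes = {
    frozenset({"declarative/initial"}): [[1.0, 2.0, 1.0], [1.2, 2.2, 1.0]],
    frozenset({"declarative/medial"}): [[1.1, 2.0, 1.1]],
    frozenset({"interrogative/initial"}): [[1.0, 2.0, 1.2]],
    frozenset({"interrogative/final"}): [[1.0, 1.5, 3.0]],
}

result = constrained_agglomerative(toy_classes, same_sentence_type, 0.5)
for labels, contours in result.items():
    print(sorted(labels), len(contours))
```

In this toy run the two declarative classes are merged because their centroids are close. The interrogative initial class has a similar shape but is kept apart from the declarative classes by the constraint, and it is too far from the interrogative final class to be merged under the chosen threshold.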

This work has been partially sponsored by the Spanish Government (MCYT project TIC2003-08382-C05-03) and by Consejería de Educación (JCYL project VA053A05).



Author information

Authors and Affiliations

Department of Computer Science, University of Valladolid, Valladolid, 47071, Spain
David Escudero-Mancebo & Valentín Cardeñoso-Payo


Editor information

Editors and Affiliations

Laboratório de Técnicas Inteligentes (LTI), Escola Politécnica (EP), Universidade de São Paulo (USP)
Jaime Simão Sichman
Dep. de Informática, Universidade de Lisboa, Campo Grande, 1749-016, Lisboa, Portugal
Helder Coelho
Institute of Mathematics and Computer Science, Department of Computer Science, University of São Paulo, Av. Trabalhador Sao-Carlense, 400, Centro, CP: 668, 13560-970, São Carlos, SP, Brazil
Solange Oliveira Rezende


Cite this paper

Escudero-Mancebo, D., Cardeñoso-Payo, V. (2006). Mining Intonation Corpora Using Knowledge Driven Sequential Clustering. In: Sichman, J.S., Coelho, H., Rezende, S.O. (eds) Advances in Artificial Intelligence - IBERAMIA-SBIA 2006. Lecture Notes in Computer Science, vol 4140. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11874850_40


DOI: https://doi.org/10.1007/11874850_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45462-5
Online ISBN: 978-3-540-45464-9
eBook Packages: Computer Science, Computer Science (R0)
