Cross-Lingual English Spanish Tonal Accent Labeling Using Decision Trees and Neural Networks

  • Conference paper
Advances in Nonlinear Speech Processing (NOLISP 2011)

Abstract

In this paper we present an experimental study of how corpus-based automatic labeling of prosodic information can be transferred from a source language to a different target language. The Spanish ESMA corpus is used to train models that identify prominent words. The models are then used to identify the accented words of the English Boston University Radio News Corpus (BURNC). The inverse process (training the models with English data and testing on the Spanish corpus) is also evaluated, and both are contrasted with the results obtained in the conventional scenario: training and testing on the same corpus. We obtained correct annotation rates of up to 82.7% in cross-lingual experiments, which differ only slightly from the accuracy obtained in monolingual single-speaker scenarios (86.6% for Spanish and 80.5% for English). Speaker-independent monolingual recognition experiments were also performed with the BURNC corpus, yielding cross-speaker recognition rates ranging from 69.3% to 84.2%. As these results are comparable to those obtained in the cross-lingual scenario, we conclude that the approach we propose faces challenges similar to those of speaker-independent scenarios.
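The cross-corpus protocol described in the abstract (train on one corpus, test on the other, then swap the roles) can be sketched in a few lines of Python. This is only a toy illustration, not the authors' setup: the single feature, the two tiny data sets, and the names `corpus_a`, `corpus_b`, `train_stump`, and `accuracy` are all hypothetical, and a one-level decision tree (a threshold "stump") stands in for the decision trees and neural networks used in the paper.

```python
from typing import List, Tuple

# Each sample is (feature value, label); label 1 = accented word, 0 = unaccented.
# The feature is a hypothetical normalized pitch-range value, and all numbers
# below are synthetic. They are NOT taken from ESMA or BURNC.
Sample = Tuple[float, int]


def train_stump(samples: List[Sample]) -> float:
    """Return the threshold that maximizes training accuracy.

    A word is predicted as accented when its feature value is >= threshold.
    Ties are resolved in favor of the smallest candidate threshold.
    """
    candidates = sorted({x for x, _ in samples})
    best_threshold, best_correct = candidates[0], -1
    for t in candidates:
        correct = sum(1 for x, y in samples if (x >= t) == (y == 1))
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


def accuracy(threshold: float, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the reference label."""
    correct = sum(1 for x, y in samples if (x >= threshold) == (y == 1))
    return correct / len(samples)


# Two synthetic "corpora" standing in for the source and target languages.
corpus_a = [(0.1, 0), (0.2, 0), (0.3, 0), (0.6, 1), (0.7, 1), (0.9, 1)]
corpus_b = [(0.2, 0), (0.4, 0), (0.5, 1), (0.55, 0), (0.8, 1), (1.0, 1)]

# Cross-corpus evaluation in both directions.
threshold_a = train_stump(corpus_a)
threshold_b = train_stump(corpus_b)
acc_a_to_b = accuracy(threshold_a, corpus_b)  # train on A, test on B
acc_b_to_a = accuracy(threshold_b, corpus_a)  # train on B, test on A

print(f"train A -> test B: {acc_a_to_b:.3f}")
print(f"train B -> test A: {acc_b_to_a:.3f}")
```

In a real experiment each corpus would also be split into training and held-out parts, so that the cross-corpus rates can be compared against a same-corpus baseline, as the abstract does.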

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Escudero-Mancebo, D., Aguilar, L., González Ferreras, C., Vivaracho Pascual, C., Cardeñoso-Payo, V. (2011). Cross-Lingual English Spanish Tonal Accent Labeling Using Decision Trees and Neural Networks. In: Travieso-González, C.M., Alonso-Hernández, J.B. (eds) Advances in Nonlinear Speech Processing. NOLISP 2011. Lecture Notes in Computer Science, vol 7015. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25020-0_9

  • DOI: https://doi.org/10.1007/978-3-642-25020-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25019-4

  • Online ISBN: 978-3-642-25020-0

  • eBook Packages: Computer Science, Computer Science (R0)
