Cross-Language Acoustic Modeling for Macedonian Speech Technology Applications

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 207)

Abstract

This paper presents a cross-language development method for speech recognition and synthesis applications for the Macedonian language. A unified system for speech recognition and synthesis, trained on German language data, was used for acoustic model bootstrapping and adaptation. Both knowledge-based and data-driven approaches to mapping source and target language phonemes were used for the initial transcription and labeling of a small amount of recorded speech. Recognition experiments that applied the source language acoustic model to the target language dataset showed a significant degradation in recognition performance. Acceptable performance was achieved after maximum a posteriori (MAP) model adaptation with a limited amount of target language data, making the system suitable for small to medium vocabulary speech recognition applications. The same unified system was then used to train a new, separate acoustic model for HMM-based synthesis. Despite the low quality of the available recordings and the sub-optimal phoneme mapping, qualitative analysis showed that HMM synthesis produces perceptually good and intelligible synthetic speech.
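
The abstract does not give the details of the MAP adaptation step, so the following is only a minimal sketch of the commonly used MAP update for Gaussian mean vectors, in which each source-language mean is pulled toward the target-language data in proportion to how many frames are aligned to it. The function name `map_adapt_means`, the relevance factor `tau`, and all numeric values are assumptions made for illustration; they are not taken from the paper or from the system it describes.

```python
def map_adapt_means(prior_means, frames, posteriors, tau=10.0):
    """Illustrative MAP update of Gaussian means (hypothetical helper).

    prior_means: list of K mean vectors from the source-language model.
    frames:      list of T feature vectors from the target language.
    posteriors:  list of T rows, each holding K occupation probabilities.
    tau:         relevance factor (assumed value); larger values keep the
                 adapted means closer to the prior means.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    dim = len(prior_means[0])
    adapted = []
    for k, prior in enumerate(prior_means):
        # Soft count of target-language frames assigned to component k.
        occupancy = sum(row[k] for row in posteriors)
        # Posterior-weighted sum of the frames, per feature dimension.
        weighted_sum = [
            sum(row[k] * frame[d] for row, frame in zip(posteriors, frames))
            for d in range(dim)
        ]
        # Interpolate between the prior mean and the data statistics.
        adapted.append([
            (tau * prior[d] + weighted_sum[d]) / (tau + occupancy)
            for d in range(dim)
        ])
    return adapted


# Toy data: two components, ten target frames all aligned to component 0.
source_means = [[0.0, 0.0], [10.0, 10.0]]
target_frames = [[2.0, 2.0]] * 10
alignment = [[1.0, 0.0]] * 10
adapted_means = map_adapt_means(source_means, target_frames, alignment, tau=10.0)
print(adapted_means)  # [[1.0, 1.0], [10.0, 10.0]]
```

In this toy case the first mean moves halfway toward the adaptation data because its occupancy equals `tau`, while the second mean, which receives no target-language frames, stays at its source-language value. This is why such an update can be useful when only a limited amount of target-language data is available.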

Correspondence to Ivan Kraljevski.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kraljevski, I., Strecha, G., Wolff, M., Jokisch, O., Chungurski, S., Hoffmann, R. (2013). Cross-Language Acoustic Modeling for Macedonian Speech Technology Applications. In: Markovski, S., Gusev, M. (eds) ICT Innovations 2012. Advances in Intelligent Systems and Computing, vol 207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37169-1_4

  • DOI: https://doi.org/10.1007/978-3-642-37169-1_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37168-4

  • Online ISBN: 978-3-642-37169-1

  • eBook Packages: Engineering, Engineering (R0)
