Statistical Translation of Text and Speech: First Results with the RWTH System

Machine Translation

Abstract

In this paper, we describe a first version of a system for statistical translation and present experimental results. The statistical translation approach uses two types of information: a translation model and a language model. The language model used is a standard bigram model. The translation model is decomposed into lexical and alignment models. After presenting the details of the alignment model, we describe the search problem and present a dynamic programming-based solution for the special case of monotone alignments. So far, the system has been tested on two limited-domain tasks for which a bilingual corpus is available: the EuTrans traveller task (Spanish–English, 500-word vocabulary) and the Verbmobil task (German–English, 3000-word vocabulary). We present experimental results on these tasks. In addition to the translation of text input, we also address the problem of speech translation and suitable integration of the acoustic recognition process and the translation process.
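
As an illustration of the kind of dynamic programming search the abstract mentions for monotone alignments, the following is a minimal sketch and not the RWTH system's implementation. It assumes a strong simplification: each source word is translated by exactly one target word, in left-to-right order. The names `monotone_translate`, `lexicon`, and `bigram`, and all probability values, are hypothetical toy choices made for this example.

```python
import math

def monotone_translate(source, lexicon, bigram, start="<s>"):
    """Viterbi-style search under a one-to-one monotone alignment (assumption).

    lexicon[e][f] is a toy lexical probability p(f | e).
    bigram[(prev, e)] is a toy bigram language model probability p(e | prev).
    Missing entries fall back to a small floor probability.
    """
    floor = 1e-6
    targets = list(lexicon)
    # best maps the last target word to (log score, target words so far).
    best = {start: (0.0, [])}
    for f in source:
        new_best = {}
        for e in targets:
            lex = math.log(lexicon[e].get(f, floor))
            # Extend every surviving hypothesis by target word e and keep the best.
            candidates = [
                (score + math.log(bigram.get((prev, e), floor)) + lex, path + [e])
                for prev, (score, path) in best.items()
            ]
            new_best[e] = max(candidates, key=lambda c: c[0])
        best = new_best
    return max(best.values(), key=lambda c: c[0])[1]

# Hypothetical toy models (Spanish source, English target).
lexicon = {
    "the": {"la": 0.9, "casa": 0.05},
    "house": {"casa": 0.9, "la": 0.05},
}
bigram = {
    ("<s>", "the"): 0.8, ("<s>", "house"): 0.2,
    ("the", "house"): 0.9, ("house", "the"): 0.1,
    ("the", "the"): 0.05, ("house", "house"): 0.05,
}

translation = monotone_translate(["la", "casa"], lexicon, bigram)
```

Working in log space avoids numerical underflow on longer sentences. The search keeps only the best hypothesis per final target word, which is sufficient because a bigram model only looks at the immediately preceding word.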

Cite this article

Tillmann, C., Vogel, S., Ney, H. et al. Statistical Translation of Text and Speech: First Results with the RWTH System. Machine Translation 15, 43–73 (2000). https://doi.org/10.1023/A:1011168216856

  • DOI: https://doi.org/10.1023/A:1011168216856
