Statistical Translation of Text and Speech: First Results with the RWTH System

Tillmann, Christoph; Vogel, Stephan; Ney, Hermann; Sawaf, Hassan

doi:10.1023/A:1011168216856


Published: June 2000

Volume 15, pages 43–73, (2000)

Machine Translation



Abstract

In this paper, we describe a first version of a system for statistical translation and present experimental results. The statistical translation approach uses two types of information: a translation model and a language model. The language model used is a standard bigram model. The translation model is decomposed into lexical and alignment models. After presenting the details of the alignment model, we describe the search problem and present a dynamic programming-based solution for the special case of monotone alignments. So far, the system has been tested on two limited-domain tasks for which a bilingual corpus is available: the EuTrans traveller task (Spanish–English, 500-word vocabulary) and the Verbmobil task (German–English, 3000-word vocabulary). We present experimental results on these tasks. In addition to the translation of text input, we also address the problem of speech translation and suitable integration of the acoustic recognition process and the translation process.
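
To make the search idea in the abstract concrete, the following is a minimal sketch and not the RWTH implementation. It assumes a strictly one-to-one monotone alignment (each source word yields exactly one target word, in the same order), which is narrower than the alignment model the paper describes. The function name, the toy Spanish-English probabilities, and the probability floor are all hypothetical and chosen only for illustration. Each partial hypothesis is scored by the product of a bigram language model probability p(e | e') and a lexical probability p(f | e), and dynamic programming keeps the best predecessor for every target word at every source position.

```python
import math

# Hypothetical toy models (not taken from the paper).
# LEXICON[e][f] = p(f | e): probability that target word e produces source word f.
LEXICON = {
    "the": {"la": 0.90, "llave": 0.05},
    "key": {"llave": 0.90, "la": 0.05},
}
# BIGRAM[(prev, e)] = p(e | prev): bigram language model on the target side.
BIGRAM = {
    ("<s>", "the"): 0.80, ("<s>", "key"): 0.20,
    ("the", "key"): 0.90, ("the", "the"): 0.05,
    ("key", "the"): 0.30, ("key", "key"): 0.05,
}


def monotone_dp_translate(source, lexicon, bigram, floor=1e-10):
    """Viterbi-style search assuming a one-to-one monotone alignment.

    Returns the target word sequence that maximises the product of
    bigram and lexical probabilities over the whole source sentence.
    """
    if not source:
        return []

    def logp(p):
        # Floor unseen events so that log() is always defined.
        return math.log(max(p, floor))

    prev_scores = {"<s>": 0.0}  # log score of the best hypothesis ending in each word
    backpointers = []           # one dict per source position: word -> best predecessor

    for f in source:
        scores, back = {}, {}
        for e in lexicon:
            # Pick the predecessor that maximises previous score plus bigram score.
            best_prev, best_score = max(
                ((p, s + logp(bigram.get((p, e), 0.0))) for p, s in prev_scores.items()),
                key=lambda item: item[1],
            )
            scores[e] = best_score + logp(lexicon[e].get(f, 0.0))
            back[e] = best_prev
        backpointers.append(back)
        prev_scores = scores

    # Trace back from the best final word to recover the full sequence.
    word = max(prev_scores, key=prev_scores.get)
    output = [word]
    for back in reversed(backpointers[1:]):
        word = back[word]
        output.append(word)
    return output[::-1]


print(monotone_dp_translate(["la", "llave"], LEXICON, BIGRAM))
```

Because the alignment is assumed monotone, the search only ever moves left to right through the source sentence, so the cost grows linearly with sentence length and quadratically with the number of candidate target words. Allowing a source word to be skipped or to share a target word would enlarge the state space but leave the recursion structure unchanged.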



Author information

Authors and Affiliations

Lehrstuhl für Informatik VI, Rheinisch-Westfälische Technische Hochschule Aachen, Ahornstraße 55, 52056 Aachen, Germany
Christoph Tillmann, Stephan Vogel, Hermann Ney & Hassan Sawaf



About this article

Cite this article

Tillmann, C., Vogel, S., Ney, H. et al. Statistical Translation of Text and Speech: First Results with the RWTH System. Machine Translation 15, 43–73 (2000). https://doi.org/10.1023/A:1011168216856

