
Nine Issues in Speech Translation

Machine Translation

Abstract

This paper sketches research in nine areas related to spoken language translation: interactive disambiguation (two demonstrations of highly interactive, broad-coverage speech translation are reported); system architecture; data structures; the interface between speech recognition and analysis; the use of natural pauses for segmenting utterances; example-based machine translation; dialogue acts; the tracking of lexical co-occurrences; and the resolution of translation mismatches.

About this article

Cite this article

Seligman, M. Nine Issues in Speech Translation. Machine Translation 15, 149–186 (2000). https://doi.org/10.1023/A:1011180928513

  • DOI: https://doi.org/10.1023/A:1011180928513
