Abstract
This paper sketches research in nine areas related to spoken language translation: interactive disambiguation (two demonstrations of highly interactive, broad-coverage speech translation are reported); system architecture; data structures; the interface between speech recognition and analysis; the use of natural pauses for segmenting utterances; example-based machine translation; dialogue acts; the tracking of lexical co-occurrences; and the resolution of translation mismatches.
Seligman, M. Nine Issues in Speech Translation. Machine Translation 15, 149–186 (2000). https://doi.org/10.1023/A:1011180928513