Abstract
The interlingual approach to machine translation (MT) is used successfully in multilingual translation. It performs the translation task in two independent steps. First, the meanings of the source-language sentences are represented in an intermediate, language-independent (Interlingua) representation. Then, sentences of the target language are generated from those meaning representations. Arabic natural language processing in general is still underdeveloped, and Arabic natural language generation (NLG) is even less developed. In particular, Arabic NLG from Interlinguas has been investigated only with template-based approaches. Moreover, tools used for other languages are not easily adaptable to Arabic because of the language's complexity at both the morphological and syntactic levels. In this paper, we describe a rule-based generation approach for task-oriented, Interlingua-based spoken dialogue that transforms a relatively shallow semantic interlingual representation, called interchange format (IF), into Arabic text that corresponds to the intentions underlying the speaker’s utterances. This approach addresses the problems of Arabic syntactic structure determination and of Arabic morphological and syntactic generation within the interlingual MT approach. The generation approach is developed primarily within the framework of the NESPOLE! (NEgotiating through SPOken Language in E-commerce) multilingual speech-to-speech MT project. The IF-to-Arabic generator is implemented in SICStus Prolog. We conducted evaluation experiments using the input and output of the English analyzer developed by the NESPOLE! team at Carnegie Mellon University. The results of these experiments were promising and confirmed the ability of the rule-based approach to generate Arabic translations from the Interlingua for utterances taken from the travel and tourism domain.
Cite this article
Abdel Monem, A., Shaalan, K., Rafea, A. et al. Generating Arabic text in multilingual speech-to-speech machine translation framework. Machine Translation 22, 205–258 (2008). https://doi.org/10.1007/s10590-009-9054-9
DOI: https://doi.org/10.1007/s10590-009-9054-9