Morphology in Machine Translation Systems: Efficient Integration of Finite State Transducers and Feature Structure Descriptions

Machine Translation

Abstract

We present a finite state morphology system augmented with typed feature structures as weights on transitions. This mechanism allows the use of highly efficient finite state approaches for morphological analysis and generation, while providing the rich linguistic descriptions often used in Machine Translation systems. Using a semiring interpretation, the weight of a morphological analysis result represents the possible linguistic interpretations of an input word, while the resulting character string itself represents the lemma of the input. Long-distance phenomena and infixation can be handled in an easy and elegant manner, simultaneously providing a seamless interface to subsequent linguistic processing modules. Two extensions to the basic model are discussed: the incorporation of lexical knowledge into the finite state transducer and a transformation that renders unification-based finite state models as efficient as those employing other weight structures. The model is applied to morphological operations in a Persian–English Machine Translation system.
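The semiring reading described in the abstract can be illustrated with a small sketch. This is not the system from the article: it assumes flat, untyped feature structures (plain dictionaries) instead of typed feature structures, takes pairwise unification as the semiring product and union of alternatives as the semiring sum, and runs on a hypothetical toy English transducer. The names `unify`, `times`, `plus`, `TRANSITIONS`, `FINAL`, and `analyze` are all invented for illustration.

```python
from itertools import product


def unify(a, b):
    """Unify two flat feature structures (dicts); return None on a clash."""
    out = dict(a)
    for key, val in b.items():
        if key in out and out[key] != val:
            return None
        out[key] = val
    return out


def times(w1, w2):
    """Assumed semiring product: unify every pair, dropping failures."""
    result = []
    for a, b in product(w1, w2):
        u = unify(a, b)
        if u is not None and u not in result:
            result.append(u)
    return result


def plus(w1, w2):
    """Assumed semiring sum: union of alternative interpretations."""
    return w1 + [fs for fs in w2 if fs not in w1]


ONE = [{}]  # identity for times: the empty feature structure
ZERO = []   # identity for plus: no interpretation at all

# Hypothetical toy transducer:
# (state, input symbol) -> list of (next state, output string, weight)
TRANSITIONS = {
    (0, "w"): [(1, "w", ONE)],
    (1, "a"): [(2, "a", ONE)],
    (2, "l"): [(3, "l", ONE)],
    (3, "k"): [(4, "k", [{"cat": "noun"}, {"cat": "verb"}])],
    # The suffix "s" emits nothing, so the output string stays the lemma.
    (4, "s"): [(5, "", [{"cat": "noun", "num": "pl"},
                        {"cat": "verb", "per": "3", "num": "sg"}])],
}
# Final weights: state 4 accepts the bare stem, state 5 the suffixed form.
FINAL = {4: [{"cat": "noun", "num": "sg"}, {"cat": "verb"}], 5: ONE}


def analyze(word):
    """Return {lemma: weight} over all accepting paths for `word`."""
    paths = [(0, "", ONE)]
    for ch in word:
        nxt = []
        for state, out, w in paths:
            for target, o, tw in TRANSITIONS.get((state, ch), []):
                nw = times(w, tw)
                if nw != ZERO:  # prune paths whose unification failed
                    nxt.append((target, out + o, nw))
        paths = nxt
    results = {}
    for state, out, w in paths:
        if state in FINAL:
            fw = times(w, FINAL[state])
            if fw != ZERO:
                results[out] = plus(results.get(out, ZERO), fw)
    return results
```

In this toy setup, `analyze("walks")` yields the lemma `"walk"` with two surviving interpretations, a plural noun and a third-person singular verb. The noun/verb combinations that clash on `cat` are discarded by unification along the path, which is the sense in which the weight carries the linguistic reading while the output string carries the lemma.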

Author information

Corresponding author

Correspondence to Jan W. Amtrup.

About this article

Cite this article

Amtrup, J.W. Morphology in Machine Translation Systems: Efficient Integration of Finite State Transducers and Feature Structure Descriptions. Mach Translat 18, 217–238 (2003). https://doi.org/10.1007/s10590-004-2476-5

  • DOI: https://doi.org/10.1007/s10590-004-2476-5
