Abstract
Minority languages must keep up with and avail of advances in language technology if they are to prosper in the modern world. Finite state technology is mature, stable and robust. It is scalable and has been applied successfully in many areas of linguistic processing, notably phonology, morphology and syntax. In this paper, we describe the design, implementation and evaluation of a morphological analyser and generator for Irish using finite state transducers. In order to produce a high-quality linguistic resource for NLP applications, a complete set of inflectional morphological rules for Irish is handcrafted, as is the initial test lexicon. The lexicon is then further populated semi-automatically using both electronic and printed lexical resources. Currently we achieve coverage of 89% on unrestricted text. Finally, we discuss a number of methodological issues in the design of NLP resources for minority languages.
Cite this article
Uí Dhonnchadha, E., Nic Pháidín, C. & van Genabith, J. Design, Implementation and Evaluation of an Inflectional Morphology Finite State Transducer for Irish. Mach Translat 18, 173–193 (2003). https://doi.org/10.1007/s10590-004-2480-9