Abstract
NooJ is a linguistic development environment for formalizing complex linguistic phenomena such as the generation, processing, and analysis of compound words. We take advantage of the strength of NooJ’s linguistic engine to create a new large-coverage terminological dictionary of compound words for Modern Standard Arabic. Classifying and annotating Arabic compound words would have a major impact on disambiguation in applications that work with Arabic texts. Existing analyzers, which rely on morphology, cannot recognize multiword expressions: they usually split compound expressions into single terms. Recognizing compound words as whole units is therefore essential to preserve the semantics of texts, and it provides a crucial resource for better analysis and understanding of Arabic.
Our work is composed of three parts. First, we review the literature on the categories of Arabic compound expressions in order to draw up a detailed typology. Next, we study the structural variability of multiword expressions in Arabic in order to measure their degree of morphological, lexical and grammatical flexibility. Finally, we discuss the electronic thematic dictionary of Arabic compound expressions and give a detailed description of our methodology and guidelines.
Annex :
NooJ’s syntactic categories:
Syntactic codes | |
---|---|
<ADJ> | Adjective |
<V> | Verb |
<N> | Noun |
<ADV> | Adverb |
<CONJ> | Conjunction |
<PREP> | Preposition |
<PREF> | Prefix |
<PRON> | Pronoun |
<REL> | Relative pronoun |
<PART> | Particle |
<E> | Empty character |
<P> | Punctuation |
Inflectional codes | |
<s> | Singular |
<p> | Plural |
<m> | Masculine |
<f> | Feminine |
Semantic codes | |
<CmpdElem> | Component of a MWE |
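
As a rough illustration of how the codes listed in this annex combine, the sketch below expands a tag into its readable description. The code tables are copied from the annex; the `+`-separated tag layout (for example `N+m+s+CmpdElem`) and the function name `describe` are assumptions made for this example and are not taken from the paper.

```python
# Code tables copied from the annex above.
SYNTACTIC = {
    "ADJ": "Adjective", "V": "Verb", "N": "Noun", "ADV": "Adverb",
    "CONJ": "Conjunction", "PREP": "Preposition", "PREF": "Prefix",
    "PRON": "Pronoun", "REL": "Relative pronoun", "PART": "Particle",
    "E": "Empty character", "P": "Punctuation",
}
INFLECTIONAL = {"s": "Singular", "p": "Plural", "m": "Masculine", "f": "Feminine"}
SEMANTIC = {"CmpdElem": "Component of a MWE"}


def describe(tag: str) -> list[str]:
    """Expand a tag such as 'N+m+s+CmpdElem' into readable descriptions.

    Assumption: the first field is a syntactic code and the remaining
    '+'-separated fields are inflectional or semantic codes.
    """
    category, *features = tag.split("+")
    if category not in SYNTACTIC:
        raise ValueError(f"unknown syntactic code: {category!r}")
    descriptions = [SYNTACTIC[category]]
    for feature in features:
        if feature in INFLECTIONAL:
            descriptions.append(INFLECTIONAL[feature])
        elif feature in SEMANTIC:
            descriptions.append(SEMANTIC[feature])
        else:
            raise ValueError(f"unknown feature code: {feature!r}")
    return descriptions


if __name__ == "__main__":
    print(describe("N+m+s+CmpdElem"))
```

Unknown codes raise an error rather than being skipped, so a misspelled code in a hand-written entry is caught early.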
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Najar, D., Mesfar, S., Ghezela, H.B. (2016). A Large Terminological Dictionary of Arabic Compound Words. In: Okrut, T., Hetsevich, Y., Silberztein, M., Stanislavenka, H. (eds) Automatic Processing of Natural-Language Electronic Texts with NooJ. NooJ 2015. Communications in Computer and Information Science, vol 607. Springer, Cham. https://doi.org/10.1007/978-3-319-42471-2_2
DOI: https://doi.org/10.1007/978-3-319-42471-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42470-5
Online ISBN: 978-3-319-42471-2
eBook Packages: Computer Science, Computer Science (R0)