A Large Terminological Dictionary of Arabic Compound Words

  • Conference paper
Automatic Processing of Natural-Language Electronic Texts with NooJ (NooJ 2015)

Abstract

NooJ is a linguistic development environment for formalizing complex linguistic phenomena, including the generation, processing and analysis of compound words. We take advantage of the strength of NooJ’s linguistic engine to create a new large-coverage terminological dictionary of compound words for Modern Standard Arabic. Classifying and annotating Arabic compound words would have a major impact on disambiguation in applications that work with Arabic texts. Existing analyzers, which rely on morphology, cannot recognize multiword expressions: they usually split compound expressions into single terms. Recognizing entire compound words is therefore essential to preserve the semantics of texts, and it provides a crucial resource for better analysis and understanding of the Arabic language.
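To make the contrast between token-level analysis and compound recognition concrete, here is a minimal Python sketch of longest-match lookup against a compound lexicon. It is not the NooJ engine or the authors' method: the two lexicon entries, the `COMPOUND`/`SIMPLE` labels and the function name `tag_compounds` are all hypothetical, and the matching is on exact surface forms only, so it ignores the inflection and clitic variation that a real Arabic resource has to handle.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: each compound is stored as a tuple of tokens.
# The entries are illustrative and are NOT taken from the paper's dictionary.
COMPOUNDS: Dict[Tuple[str, ...], str] = {
    ("الذكاء", "الاصطناعي"): "artificial intelligence",
    ("قاعدة", "البيانات"): "database",
}

# Longest compound in the lexicon, measured in tokens.
MAX_LEN = max(len(key) for key in COMPOUNDS)


def tag_compounds(tokens: List[str]) -> List[Tuple[str, str]]:
    """Group tokens into known compounds, preferring the longest match.

    Tokens that do not start a known compound are returned one by one,
    which is what a purely token-level analyzer would do for every token.
    """
    result: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest candidate first, down to two tokens.
        for n in range(min(MAX_LEN, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in COMPOUNDS:
                match_len = n
                break
        if match_len:
            result.append((" ".join(tokens[i:i + match_len]), "COMPOUND"))
            i += match_len
        else:
            result.append((tokens[i], "SIMPLE"))
            i += 1
    return result
```

With this lexicon, the two tokens of "الذكاء الاصطناعي" come back as a single unit instead of two unrelated words, which is the behaviour the abstract argues for.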

Our work is composed of three sections. First, we review the literature on the categories of Arabic compound expressions in order to draw up a detailed typology. Next, we study the structural variability of multiword expressions in Arabic in order to measure their degree of morphological, lexical and grammatical flexibility. Finally, we discuss the electronic thematic dictionary of Arabic compound expressions and describe our methodology and guidelines in detail.

Notes

  1. http://www.nooj4nlp.net/

References

  • Nakagawa, H., Mori, T.: A simple but powerful automatic term extraction method. In: COLING-02 on COMPUTERM 2002: Second International Workshop on Computational Terminology, vol. 14, pp. 1–7. Association for Computational Linguistics (2002)

  • Silberztein, M.: Les groupes nominaux productifs et les noms composés lexicalisés. In: Lingvisticae Investigationes XVII: 2. John Benjamins B.V., Amsterdam (1993)

  • Bounhas, I., Slimani, Y.: A hybrid approach for Arabic multi-word term extraction. In: International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2009, pp. 1–8. IEEE (2009)

  • Attia, M.: Handling Arabic morphological and syntactic ambiguity within the LFG framework with a view to machine translation. Doctoral thesis, University of Manchester (2008)

  • Mesfar, S.: Analyse Morpho-syntaxique Automatique et Reconnaissance Des Entités Nommées En Arabe Standard. Thesis, Graduate School―Languages, Space, Time, Societies, Paris, France (2008)

  • Mesfar, S.: Towards a cascade of morpho-syntactic tools for Arabic natural language processing. In: Gelbukh, A. (ed.) CICLing 2010. LNCS, vol. 6008, pp. 150–162. Springer, Heidelberg (2010)

  • Silberztein, M.: NooJ’s dictionaries. In: The Proceedings of the 2nd Language and Technology Conference, Poznan (2005)

  • Mesfar, S.: Analyse morpho-syntaxique et reconnaissance des entités nommées en arabe standard. Thesis, Université de Franche-Comté, France (2008)


Author information

Corresponding author

Correspondence to Slim Mesfar.

Annex:

NooJ’s syntactic categories:

Syntactic codes

  • <ADJ>: Adjective
  • <V>: Verb
  • <N>: Noun
  • <ADV>: Adverb
  • <CONJ>: Conjunction
  • <PREP>: Preposition
  • <PREF>: Prefix
  • <PRON>: Pronoun
  • <REL>: Relative pronoun
  • <PART>: Particle
  • <E>: Empty character
  • <P>: Punctuation

Inflectional codes

  • <s>: Singular
  • <p>: Plural
  • <m>: Masculine
  • <f>: Feminine

Semantic codes

  • <CmpdElem>: Component of a multiword expression (MWE)
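
The codes in this annex can be held in a small lookup table. The Python sketch below shows one way to expand a combined tag into readable labels. The `+`-separated tag format (for example `<N+m+s>`) and the function name `describe` are assumptions made for illustration; the annex lists the individual codes only and does not state how they are combined.

```python
from typing import Dict, List

# Code tables copied from the annex. Keys are case-sensitive:
# "P" (punctuation) and "p" (plural) are different codes.
SYNTACTIC: Dict[str, str] = {
    "ADJ": "Adjective", "V": "Verb", "N": "Noun", "ADV": "Adverb",
    "CONJ": "Conjunction", "PREP": "Preposition", "PREF": "Prefix",
    "PRON": "Pronoun", "REL": "Relative pronoun", "PART": "Particle",
    "E": "Empty character", "P": "Punctuation",
}
INFLECTIONAL: Dict[str, str] = {
    "s": "Singular", "p": "Plural", "m": "Masculine", "f": "Feminine",
}
SEMANTIC: Dict[str, str] = {"CmpdElem": "Component of a MWE"}

ALL_CODES: Dict[str, str] = {**SYNTACTIC, **INFLECTIONAL, **SEMANTIC}


def describe(annotation: str) -> List[str]:
    """Expand a tag such as "<N+m+s>" into a list of readable labels.

    ASSUMPTION: codes are joined with "+" inside angle brackets. Codes
    that are not in the annex are reported rather than silently dropped.
    """
    codes = annotation.strip().strip("<>").split("+")
    return [ALL_CODES.get(code, f"Unknown code: {code}") for code in codes]
```

For example, `describe("<N+m+s>")` returns `['Noun', 'Masculine', 'Singular']`.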

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Najar, D., Mesfar, S., Ghezela, H.B. (2016). A Large Terminological Dictionary of Arabic Compound Words. In: Okrut, T., Hetsevich, Y., Silberztein, M., Stanislavenka, H. (eds) Automatic Processing of Natural-Language Electronic Texts with NooJ. NooJ 2015. Communications in Computer and Information Science, vol 607. Springer, Cham. https://doi.org/10.1007/978-3-319-42471-2_2

  • DOI: https://doi.org/10.1007/978-3-319-42471-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42470-5

  • Online ISBN: 978-3-319-42471-2

  • eBook Packages: Computer Science, Computer Science (R0)
