Arabic word sense disambiguation: a review

Artificial Intelligence Review

Abstract

Word sense disambiguation (WSD) is a task in computational linguistics that aims to automatically identify the correct sense of an ambiguous word from a set of predefined senses. The goal is to tag each ambiguous word in a text with one of the senses known a priori. In Arabic, the main cause of word ambiguity is the lack of diacritics in most digital documents, so the same written form can carry different senses. In this paper, we present the motivation for resolving the ambiguity of Arabic words and describe the task in detail. We survey supervised, unsupervised, semi-supervised and knowledge-based approaches. The evaluation of Arabic WSD systems is also discussed. Finally, we discuss open problems and future directions.

Notes

  1. The Hadiths are religious texts reporting the acts and sayings attributed to the Prophet (Peace Be Upon Him). The Hadith corpus covers more than 2.5 million words and contains more than 90,000 fragments (titles and paragraphs). It comprises the six most reliable books: Sahih Al-Bukhari (صحيح البخاري), Sahih Muslim (صحيح مسلم), Sunan Abi Dawud (سنن أبي داود), Sunan Ettermidhi (سنن الترمذي), Sunan Ibn Majah (سنن ابن ماجة) and Sunan Annasaii (سنن النسائي).

  2. http://www.collins.co.uk/Corpus/CorpusSearch.aspx.

  3. http://193.133.140.102/JustTheWord.

  4. http://www.natcorp.ox.ac.uk.

  5. http://www.sketchengine.co.uk.

  6. Longman Dictionary of Contemporary English.

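  7. Al-Maknaz al-Kabir ("The Great Thesaurus"; Arabic title: المَكْنَزُ الكَبيرُ), Dr. Ahmad Mukhtar Omar et al., Sutour Publishing House, Saudi Arabia, first edition, 2002.

  8. Larousse: Al-Mu'jam al-Arabi al-Asasi ("Larousse: The Basic Arabic Dictionary"; Arabic title: لاروس المعجم العربي الأساسي), Arab League Educational, Cultural and Scientific Organization, 1988.

  9. Al-Munjid fi al-Mutaradifat wa al-Mutajanisat ("Al-Munjid of Synonyms and Homonyms"; Arabic title: المنجد في المترادفات والمتجانسات), Father Rafael Nakhla al-Yasu'i, Dar al-Mashriq, third edition, 1989.

  10. Al-Huqul al-Dilaliyya al-Sarfiyya lil-Af'al al-Arabiyya ("The Morphological Semantic Fields of Arabic Verbs"; Arabic title: الحُـقولُ الدِّلَاليَّةُ الصَّرْفيَّةُ لِلأَفْعالِ العَرَبيَّةِ), Sulayman Fayyad, Dar al-Mirrikh, Riyadh, 1990.

  11. Al-Maknaz al-Arabi al-Mu'asir ("The Contemporary Arabic Thesaurus"; Arabic title: المكنز العربي المعاصر), Dr. Mahmoud Ismail Sini et al., Maktabat Lubnan (Librairie du Liban), Beirut, first edition, 1993.

  12. Mu'jam al-Mutaradifat al-Arabiyya al-Asghar ("The Smaller Dictionary of Arabic Synonyms"; Arabic title: معجم المترادفات العربية الأصغر), Wajdi Rizq Ghali, Maktabat Lubnan (Librairie du Liban), Beirut, first edition, 1996.

  13. Kanz al-Lugha al-Arabiyya ("The Treasure of the Arabic Language"; Arabic title: كنز اللغة العربية), Dr. Hanna Ghalib, Lubnan Nashirun (Librairie du Liban Publishers), 2003.

  14. Nuj'at al-Ra'id fi al-Mutaradif wa al-Mutawarid (a thesaurus of synonyms and related expressions; Arabic title: نجعة الرائد في المترادف والمتوارد), Ibrahim al-Yaziji, Maktabat Lubnan (Librairie du Liban), Beirut.

  15. "Arabic lexical-semantic linking databases: towards an extended ontology for the Arabic language" (Arabic title: قواعد بيانات الربط الدلالي المعجمي العربي: نحو انطولوجيا موسعة للغة العربية), Amr Hamdi al-Jundi, 2011.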

  16. Universal Networking Language.

  17. A rule-based programming language (http://libraries.unl.edu/).

  18. http://nlp.ldeo.columbia.edu/madamira/.

  19. http://www.islamweb.net/hadith/index.php.

  20. The Termhood measure weighs a candidate term according to its structural context. Given a lemma of a composite NP or a simple noun that appears in a given node (n), a query (Q) is composed of all the terms that appear on the path linking n to the root. Weights are assigned to these query terms according to the difference in level between the corresponding nodes (Bounhas et al. 2011a).

  21. The Unithood measure is used to assess NPs by calculating the degree of dependency among their constituents. This measure considers that the two constituents are linked if each of them is relevant for the other (Bounhas et al. 2011a).

  22. SALAAM: Sense Annotations Leveraging Alignments and Multilinguality.

  23. The semantic tree T = (N, E, R, RC, LC, L) corresponding to a given sentence is defined as follows: N is the set of nodes, where every node corresponds to a concept; E is the set of edges representing the relations between nodes; R is the root of the tree, which is the ambiguous word; RC (resp. LC) is the set of right (resp. left) children, which are the words occurring to the right (resp. left) of the ambiguous word; and L is a function assigning a level to each node, which corresponds to the node's position relative to the ambiguous word (Merhbene et al. 2013a).

  24. http://babelnet.org.

  25. http://www.miracl.rnu.tn.

  26. Ben Mukarram M., Al-ifriqi, AL-Misri, Ibn Manzour. Lebanese printed edition of Lisàn al-'arab Ibn Manzûr, “Lisàn al-'arab”, 15 volumes, Beyrouth, 1290.

  27. These summaries were generated from: http://www.mturk.com.

  28. http://www.akhbarelyom.org.eg/.

  29. http://www.algomhuria.net.eg/.

  30. http://www.ahram.org.eg/.

  31. SemEval 2007 Task 18: Semantic Processing of Arabic.

  32. TOPSIS: Technique for Order Preference by Similarity to Ideal Solutions.

  33. Alkhalil contains about 7000 roots obtained from Sarf (sarf 2007, an open source Arabic morphology system http://sourceforge.net/projects/sarf/) and the NEMLAR (Network for Euro-Mediterranean LAnguage Resources: www.nemlar.org) corpus, which was produced within the NEMLAR project. The NEMLAR corpus consists of 500,000 lexical units grouped into 13 different categories, with the aim of providing a balanced corpus that represents the variety of syntactic, semantic and pragmatic features of the modern Arabic language.

  34. The automatic selection of the senses that are most appropriate to a target domain.

  35. The Open Mind Word Expert sense tagged corpora are freely available at: http://www.teach-computers.org/download.
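The semantic tree T = (N, E, R, RC, LC, L) defined in note 23 can be illustrated with a small data structure. The following is only a sketch under stated assumptions, not the implementation of Merhbene et al. (2013a): nodes are represented as (position, word) pairs rather than concepts, each context word is assumed to be linked to its neighbour on the side of the ambiguous word, and the level is taken to be the distance in tokens from the ambiguous word. The names `SemanticTree` and `build_semantic_tree` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SemanticTree:
    nodes: list           # N: one (position, word) pair per token
    edges: list           # E: (parent, child) pairs
    root: tuple           # R: the ambiguous word
    right_children: list  # RC: words to the right of R, nearest first
    left_children: list   # LC: words to the left of R, nearest first
    level: dict           # L: node -> distance from R (assumed definition)


def build_semantic_tree(tokens, target_index):
    """Build a tree rooted at tokens[target_index] (illustrative only)."""
    if not 0 <= target_index < len(tokens):
        raise IndexError("target_index out of range")

    nodes = list(enumerate(tokens))
    root = nodes[target_index]
    left = nodes[:target_index][::-1]   # reversed so the nearest word comes first
    right = nodes[target_index + 1:]

    level = {root: 0}
    edges = []
    for side in (left, right):
        parent = root
        for depth, node in enumerate(side, start=1):
            level[node] = depth
            # Assumption: each word hangs from its neighbour closer to the root.
            edges.append((parent, node))
            parent = node

    return SemanticTree(nodes, edges, root, right, left, level)


# Usage: the second token is treated as the ambiguous word.
tree = build_semantic_tree(["w1", "w2", "w3", "w4"], 1)
```

Keeping the token position in each node lets the same word appear more than once in a sentence without collapsing into a single node.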

References

  • Abdelali A, Cowie JR, Farwell D, Ogden W (2004) UCLIR: a multilingual information retrieval tool. Inteligencia Artificial, Revista Iberoamericana de Inteligencia Artif 8(22):103–110. http://nlp.uned.es/ia-mlia/iberamia2002/papers/mlia10.pdf

  • Abualhaija S, Zimmermann K-H (2016) D-bees: a novel method inspired by bee colony optimization for solving word sense disambiguation. Swarm Evol Comput 27:188–195

  • Agirre E, Stevenson M (2006) Knowledge sources for WSD. In: Agirre E, Edmonds P (eds) Word sense disambiguation: algorithms and applications. Springer, New York, pp 217–251

  • Ahmed AF (1999) Developing an Arabic parser in a multilingual machine translation system. Ph.D. Thesis, Cairo University, Egypt

  • Albared M, Omar N, Ab Aziz MJ (2009) Classifiers combination to Arabic morphosyntactic disambiguation. In: Proceedings of the international conference on electrical engineering and informatics, Selangor, Malaysia, IEEE Xplore, pp 163–171. https://doi.org/10.1109/iceei.2009.5254797

  • Albared M, Omar N, Ab Aziz MJ, Nazri MZA (2010) Automatic part of speech tagging for Arabic: an experiment using bigram hidden Markov model. In: Proceedings of the 5th international conference on rough set and knowledge technology, Beijing, China, pp 361–370

  • Al-Daoud E, Basata A (2009) A framework to automate the parsing of Arabic language sentences. Int Arab J Inf Technol 6(2):191–195

  • Al-Echikh A (1998) Encyclopedia of the six major citation collections. Dar-esselem, Riyadh

  • Alkuhlani S, Habash N, Roth R (2013) Automatic morphological enrichment of a morphologically underspecified treebank. In: Clemmer A, Post M (eds) Proceedings of the 2013 conference of the North American chapter of the association for computational linguistics: human language technologies, HLT-NAACL, Omnipress of Madison, Wisconsin, USA, pp 460–470

  • Al-Maghasbeh MKA, Bin Hamzah MP (2015) Extract the semantic meaning of prepositions at Arabic texts: an exploratory study. Int J Comput Trends Technol 30(3):116–120

  • Aloulou C, Belguith LH, Kacem AH, Ben Hamadou A (2004) Conception et développement du système MASPAR d’analyse de l’arabe selon une approche agent. In: 14ème congrès francophone AFRIF-AFIA de reconnaissance des formes et d’intelligence artificielle, Toulouse, France, 28–30 janvier 2004

  • Alpaydin E (2004) Introduction to machine learning. MIT Press, Cambridge

  • Alqudsi A, Omar N, Shaker K (2014) Arabic machine translation: a survey. Artif Intell Rev 42(4):549–572. https://doi.org/10.1007/s10462-012-9351-1

  • Alrahabi M (2010) EXCOM-2: plateforme d’annotation automatique de catégories sémantiques. Applications à la catégorisation des citations en Français et en Arabe. Thèse de doctorat, Université Paris-Sorbonne, Paris, France

  • Alsaeedan W, Menai MEB (2016) A novel genetic algorithm for the word sense disambiguation problem. In: Proceedings of the 29th Canadian conference on artificial intelligence (Canadian AI 2016). Springer, LNCS 9673, pp 162–167

  • Al-Serhan H, Al-Shalabi R, Kanaan G (2003) New approach for extracting Arabic roots. In: Proceedings of the international Arab conference on information technology (ACIT’2003), Alexandria, Egypt, pp 42–59

  • Al-Sulaiti L, Atwell E (2006) The design of a corpus of contemporary Arabic. Int J Corpus Linguist 11:135–171

  • Apidianaki M (2009) Data-driven semantic analysis for multilingual WSD and lexical selection in translation. In: Proceedings of the 12th conference of the European chapter of the Association for Computational Linguistics. The Association for Computational Linguistics, Athens, Greece, pp 77–85

  • Apidianaki M, Gong L (2015) LIMSI: translations as source of indirect supervision for multilingual all-words sense disambiguation and entity linking. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015), pp 298–302

  • Ashraf J, Iqbal N, Sarfraz Khattak N, Mohsin Zaidi A (2010) Speaker independent Urdu speech recognition using HMM. In: 15th international conference on applications of Natural Language to Information Systems (NLDB) Cardiff, UK, pp 140–148

  • Atkins S (1993) Tools for computer-aided corpus lexicography: the Hector project. Acta Linguist Hung 41:5–72

  • Attia M (2008) Handling Arabic morphological and syntactic ambiguity within the LFG framework with a view to machine translation. Ph.D. Thesis, University of Manchester, UK

  • Attia M, Rashwan M, Ragheb A, Al-Badrashiny M, Al-Basoumy H (2008) A compact Arabic lexical semantics language resource based on the theory of semantic fields. In: Proceedings of the 6th international conference on language resources and evaluation, Marrakech, Morocco, pp 13–18

  • Attia M, Samih Y, Shaalan K, Genabith J (2012) The floating Arabic dictionary: an automatic method for updating a lexical database through the detection and lemmatization of the unknown words. In: Proceedings of the international conference on computational linguistics (COLING), Mumbai, India, pp 83–96. http://www.aclweb.org/anthology/C12-1006

  • Atwell E, Al-Sulaiti L, Al-Osaimi S, Abu Shawar B (2004) A review of Arabic corpus analysis tools. In: Proceedings of the 11th conference on natural language processing (TALN 2004). http://www.comp.leeds.ac.uk/eric/atwell04talnArabic.pdf

  • Ayed R, Bounhas I, Elayeb B, Evrard F, Bellamine Ben Saoud N (2012a) Arabic morphological analysis and disambiguation using a possibilistic classifier. In: Proceedings of the 8th international conference on intelligent computing (ICIC2012), LNAI 7390, Springer-Verlag, Berlin, Germany, pp 274–279. http://dx.doi.org/10.1007/978-3-642-31576-3_36

  • Ayed R, Bounhas I, Elayeb B, Evrard F, Bellamine Ben Saoud N (2012b) A possibilistic approach for the automatic morphological disambiguation of Arabic texts. In: Proceedings of the 13th international conference on software engineering, artificial intelligence, networking and parallel/distributed computing (SNPD2012). IEEE Computer Society, Kyoto, Japan, pp 187–194. https://doi.org/10.1109/snpd.2012.21

  • Ayed R, Bounhas I, Elayeb B, Bellamine Ben Saoud N, Evrard F (2014a) Improving Arabic texts morphological disambiguation using possibilistic classifier. In: Proceedings of the 19th international conference on application of natural language to information systems (NLDB). LNCS 8455, Springer International Publishing, Switzerland, pp 138–147. https://doi.org/10.1007/978-3-319-07983-7_18

  • Ayed R, Bounhas I, Elayeb B, Bellamine Ben Saoud N, Evrard F (2014b) Evaluation d’une approche possibiliste pour la désambiguïsation des textes arabes. In: Actes de Traitement Automatique des Langue Naturelles (TALN’2014). ATALA, pp 316–327. http://www.aclweb.org/anthology/F/F14/F14-1028.pdf

  • Azmi AM, Al-Thanyyan S (2012) A text summarizer for Arabic. Comput Speech Lang 26(4):260–273. https://doi.org/10.1016/j.csl.2012.01.002

  • Banerjee S, Pedersen T (2003) Extended gloss overlaps as a measure of semantic relatedness. In: Proceedings of the 18th international joint conference on artificial intelligence, San Francisco, CA, USA, pp 805–810

  • Bar-Hillel Y (1960) Automatic translation of languages, advances in computers. Academic Press, New York

  • Ben Amar BF, Gargouri B, Ben Hamadou A (2011) Domain ontology generation using LMF standardized dictionary structure. In: The 6th international conference on software and data technologies (ICSOFT2011), Seville, Spain, pp 396–401

  • Ben Amar BF, Gargouri B, Ben Hamadou A (2013) Domain ontology enrichment based on the semantic component of LMF-standardized dictionaries. In: Proceedings of knowledge science, engineering and management—6th international conference, KSEM 2013, LNAI 8041, Springer-Verlag Berlin Heidelberg, pp 404–419. https://doi.org/10.1007/978-3-642-39787-5_33

  • Ben Khiroun O, Elayeb B, Bounhas I, Evrard F, Bellamine Ben Saoud N (2012) A possibilistic approach for automatic word sense disambiguation. In: Proceedings of the 24th conference on computational linguistics and speech processing (ROCLING 2012), Chung-Li, Taiwan, China, pp 261–27. http://aclweb.org/anthology/O/O12/O12-1025.pdf

  • Ben Khiroun O, Ayed R, Elayeb B, Bounhas I, Bellamine Ben Saoud N, Evrard F (2014a) Towards a new standard Arabic test collection for mono- and cross-language information retrieval. In: Proceedings of the 19th international conference on application of natural language to information systems (NLDB). LNCS 8455, Springer International Publishing, Switzerland, pp 168–171. https://doi.org/10.1007/978-3-319-07983-7_23

  • Ben Khiroun O, Elayeb B, Bounhas I, Evrard F, Bellamine Ben Saoud N (2014b) Improving query expansion by automatic query disambiguation in intelligent information retrieval. In: Proceedings of the sixth international conference on agents and artificial intelligence (ICAART). ESEO, Angers, Loire Valley, France, pp 153–160

  • Ben Khiroun O, Elayeb B, Bellamine Ben Saoud N (2018) Towards a query translation disambiguation approach using possibility theory. In: Proceedings of the 10th international conference on agents and artificial intelligence (ICAART). Funchal, Madeira, Portugal, pp 606–613

  • Ben Romdhane W, Elayeb B, Bounhas I, Evrard F, Bellamine Ben Saoud N (2013) A possibilistic query translation approach for cross-language information retrieval. In: Proceedings of the 9th international conference on intelligent computing (ICIC2013), LNCS 7996, Springer-Verlag, Berlin, Germany, pp 73–82. http://dx.doi.org/10.1007/978-3-642-39482-9_9

  • Ben Romdhane W, Elayeb B, Bellamine Ben Saoud N (2017) A discriminative possibilistic approach for query translation disambiguation. In: Proceedings of the 22nd international conference on applications of natural language to information systems (NLDB2017), LNCS 10260, Springer, pp 366–379. https://doi.org/10.1007/978-3-319-59569-6_45

  • Bernard JRL (ed) (1986) Macquarie Thesaurus. Macquarie, Sydney

  • Beseiso M, Ahmad AR, Ismail R (2010) A survey of Arabic language support in semantic web. Int J Comput Appl 9(1):35–40

  • Black W, Elkateb S, Rodriguez H, Alkhalifa M, Vossen P, Pease A, Fellbaum C (2006) Introducing the Arabic WordNet project. In: Proceedings of the third international WordNet conference (GWC 2006), South Jeju Island, Korea, pp 295–299. http://semanticweb.kaist.ac.kr/conference/gwc/pdf2006/74.pdf

  • Blansché A (2006) Classification non supervisée avec pondération d’attributs par des méthodes évolutionnaires. Ph.D. thesis, Louis Pasteur University, France

  • Boudabous MM, Belguith Hadrich L, Sadat F (2013) Exploiting the Arabic Wikipedia for semi-automatic construction of a lexical ontology. Int J Metadata Semant Ontol 8(3):245–253

  • Boudchiche M, Mazroui A, Ould Abdallahi Ould Bebah M, Lakhouaja A, Boudlal A (2016) AlKhalil Morpho Sys 2: a robust Arabic morpho-syntactic analyzer. J King Saud Univ Comput Inf Sci. Available online 6 June 2016. http://dx.doi.org/10.1016/j.jksuci.2016.05.002

  • Bouhriz N, Benabbou F, Ben Lahmar E (2016) Word sense disambiguation approach for Arabic text. Int J Adv Comput Sci Appl 7(4):381–385

  • Bounhas I (2012) Construction et intégration d’ontologies pour la cartographie socio-sémantique de fonds documentaires arabes guidée par la fiabilité de l’information. Ph.D. Thesis, University of Tunis-ElManar, Tunisia

  • Bounhas I, Elayeb B, Evrard F, Slimani Y (2010) Towards a computer study of the reliability of Arabic stories. J Am Soc Inf Sci Technol 61(8):1686–1705. https://doi.org/10.1002/asi.21356

  • Bounhas I, Elayeb B, Evrard F, Slimani Y (2011a) Organizing contextual knowledge for Arabic text disambiguation and terminology extraction. Knowl Org J 38(6):473–490

  • Bounhas I, Elayeb B, Evrard F, Slimani Y (2011b) ArabOnto: experimenting a new distributional approach for Building Arabic Ontological Resources. Int J Metadata, Semant Ontol 6(2):81–95. https://doi.org/10.1504/IJMSO.2011.046578

  • Bounhas I, Lahbib W, Elayeb B (2014a) Arabic domain terminology extraction: a literature review. In: Proceedings of the 13th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE). LNCS 8841, Springer-Verlag Berlin Heidelberg, Amantea, Italy, pp 792–799. https://doi.org/10.1007/978-3-662-45563-0_51

  • Bounhas I, Lahbib W, Elayeb B (2014b) Extraction de terminologies en langue Arabe: un état de l’art. In: Proceedings of Cinquième Journées Francophones sur les Ontologies (JFO). Hammamet, Tunisia, pp 271–282

  • Bounhas I, Ayed R, Elayeb B, Ben Saoud NB (2015a) A hybrid possibilistic approach for Arabic full morphological disambiguation. Data Knowl Eng 100(Part B):240–254. https://doi.org/10.1016/j.datak.2015.06.008

  • Bounhas I, Ayed R, Elayeb B, Evrard F, Ben Saoud NB (2015b) Experimenting a discriminative possibilistic classifier with reweighting model for Arabic morphological disambiguation. Comput Speech Lang 33(1):67–87. https://doi.org/10.1016/j.csl.2014.12.005

  • Bounhas I, Elayeb B, Evrard F, Slimani Y (2015c) Information reliability evaluation: from Arabic storytelling to computer sciences. ACM J Comput Cult Herit. Article 14. http://dx.doi.org/10.1145/2693847

  • Bousmaha KZ, Abdoun SC, Belguith LH, Rahmouni MK (2013) Une Approche de désambiguïsation morpholexicale évaluée sur l’analyseur morphologique Alkhalil. Revue RIST 20(2):32–46

  • Bouzoubaa K (2011) Extending AWN with nouns and verbs and realizing a web prototype. In: Proceedings of the experts meeting on Arabic ontologies and semantic networks, Tunis, July 26–28, 2011

  • Boyd-Graber J, Fellbaum C, Osherson D, Schapire R (2006) Adding sense, weighted connections to WordNet. In: Proceedings of the 3rd international WordNet conference, South Jeju Island, Korea, pp 29–35. http://semanticweb.kaist.ac.kr/conference/gwc/pdf2006/53.pdf

  • Brill E (1993) A corpus-based approach to language learning. Unpublished thesis, University of Pennsylvania, Department of Computer and Information Science, Pennsylvania

  • Brown PF, Pietra SAD, Pietra VJD, Mercer RL (1991) Word-sense disambiguation using statistical methods. In: Proceedings of the 29th annual meeting of the association for computational linguistics. The Association for Computational Linguistics, Berkeley, California, USA, pp 264–270

  • Bruce R, Wiebe J (1994) Word-sense disambiguation using decomposable models. In: Proceedings of the 32nd annual meeting of the association for computational linguistics (ACL, Las Cruces, NM), pp 139–145

  • Buckwalter T (2004) Buckwalter Arabic morphological analyzer version 2.0. Online publication, linguistic data consortium (LDC) catalog number LDC2002L49. ISBN: 1-58563-257-0, from http://www.nongnu.org/aramorph/

  • Cai JF, Lee WS, Teh YW (2007) NUS-ML: Improving word sense disambiguation using topic features. In: Proceedings of the 4th international workshop on semantic evaluations (SemEval2007), pp 249–252

  • Calzolari N, Choukri K, Declerck T, Loftsson H, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S (eds) (2014) Proceedings of LREC 2014, ninth international conference on language resources and evaluation, Reykjavik, Iceland

  • Chan YS, Ng HT (2005) Scaling up word sense disambiguation via parallel texts. In: Proceedings of the 20th national conference on artificial intelligence and the seventeenth innovative applications of artificial intelligence conference. AAAI Press/The MIT Press, Pittsburgh, Pennsylvania, USA, pp 1037–1042

  • Charniak E, Blaheta D, Ge N, Hall K, Hale J, Johnson M (2000) Bllip 1987–89 WSJ corpus release 1. Tech. rep. LDC2000T43. Linguistic Data Consortium (Philadelphia, PA)

  • Cheragui MA (2012) Une analyse morphologique de la langue Arabe basée sur l’aide multicritère de la decision. In: CTIC 2012, Université d’Adrar, Algérie. http://ceur-ws.org/Vol-942/paper_13.pdf

  • Chklovski T, Mihalcea R (2002) Building a sense tagged corpus with Open Mind Word Expert. In: Proceedings of the SIGLEX/SENSEVAL workshop on word sense disambiguation: recent successes and future directions, Philadelphia, pp 116–122. http://www.aclweb.org/anthology/W02-0817

  • Chklovski T, Mihalcea R, Pedersen T, Purandare A (2004) The Senseval-3 multilingual English–Hindi lexical sample task. In: Proceedings of the third international workshop on the evaluation of systems for the semantic analysis of text (Senseval-3), Barcelona, Spain, pp 5–8. http://www.d.umn.edu/~tpederse/Pubs/senseval3-3.pdf

  • Clear J (1993) The British national corpus. In: Delany P, Landow GP (eds) The digital word: text-based computing in the humanities. MIT Press, Cambridge, pp 163–187

  • Clive H (2004) Modern Arabic: structures, functions, and varieties. Georgetown University Press. ISBN: 1-58901-022-1

  • Cohen J (1968) Weighted kappa: nominal scale agreement provision for scaled disagreement or partial credit. Psychol Bull 70(4):213–220

  • Cohn T (2003) Performance metrics for word sense disambiguation. In: Proceedings of the Australasian language technology workshop. The Australasian Language Technology Association, Melbourne, Australia, pp 86–93

  • Croft W (1983) Experiments with representation in a document retrieval system. Res Dev 2(1):1–21

  • Cuadros M, Rigau G (2006) Quality assessment of large scale knowledge resources. In: Proceedings of the 2006 conference on empirical methods in natural language processing (EMNLP), Sydney, Australia, pp 534–541

  • Dagan I, Itai A (1994) Word sense disambiguation using a second language monolingual corpus. Comput Linguist 20(4):563–596

  • Daimi K (2001) Identifying syntactic ambiguities in single-parse Arabic sentence. Comput Humanit 35:333–349

  • Daoud D (2009) Synchronized morphological and syntactic disambiguation for Arabic. Res Comput Sci Spec Issue Adv Comput Linguist 41:73–86

  • Daoud M, Daoud D (2009) Arabic disambiguation using dependency grammar. Presented at the poster sessions of Traitement Automatique des Langues Naturelles, Senlis, Oise, France

  • Darwish K, Oard DW (2002) Term selection for searching printed Arabic. In: Proceedings of the 25th ACM SIGIR conference on research and development in information retrieval. ACM, New York, USA, pp 261–268. http://doi.acm.org/10.1145/564376.564423

  • Debili F, Achour H, Souissi E (2002) La langue Arabe et l’ordinateur: de l’étiquetage grammatical à la voyellation automatique. In: Correspondances de l’IRMC, No 71, Tunis, Tunisia

  • Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

  • Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41:391–407

  • Desclés JP (2006) Contextual exploration processing for discourse and automatic annotations of texts. In: Proceedings of the 19th international FLAIRS conference. AAAI Press, Florida, USA, pp 281–284. https://www.aaai.org/Papers/FLAIRS/2006/Flairs06-055.pdf

  • Di Eugenio B (2000) On the usage of kappa to evaluate agreement on coding tasks. In: Proceedings of LREC, Athens, Greece, pp 441–444

  • Diab MT (2004a) An unsupervised approach for bootstrapping Arabic sense tagging. In: Proceedings of workshop on computational approaches to Arabic script-based languages, Semitic’04, ACL, Stroudsburg, PA, USA, pp 43–50

  • Diab M (2004b) Word sense disambiguation within a multilingual framework. Ph.D. Thesis, University of Maryland, USA

  • Diab M, Hacioglu K, Jurafsky D (2004) Automatic tagging of Arabic text: from raw text to base phrase chunks. Short paper. In: Proceedings of the conference of the North American chapter of the association for computational linguistics: human language technologies (HLT-NAACL), Stroudsburg, PA, USA: Association for Computational Linguistics, pp 149–152

  • Diab M, Alkhalifa M, Elkateb S (2007a) Semeval 2007 task 18: Arabic semantic labeling. In: Proceedings of the 4th international workshop on semantic evaluations (SemEval-2007), pp 93–98. http://www.aclweb.org/anthology/S07-1017

  • Diab M, Hacioglu K, Jurafsky D (2007b) Automated methods for processing Arabic text: from tokenization to base phrase chunking. In: van den Bosch A, Soudi A (eds) Arabic computational morphology: knowledge-based and empirical methods. Kluwer/Springer Publications

  • Diehl F, Gales MJF, Tomalin M, Woodland PC (2012) Morphological decomposition in Arabic ASR systems. Comput Speech Lang 26(4):229–243. https://doi.org/10.1016/j.csl.2011.12.001

  • Dukes K (2013) Statistical parsing by machine learning from a classical Arabic treebank, Ph.D. Thesis, The University of Leeds School of Computing

  • Edmonds P, Hirst G (2002) Near-synonymy and lexical choice. Comput Linguist 28(2):105–144

  • Eid SM, Al-Said AB, Wanas NM, Rashwan MA, Hegazy NH (2010) A comparative study of Rocchio classifier applied to supervised WSD using Arabic lexical samples. In: Proceedings of the tenth conference of language engineering (SEOLEe’2010), Cairo, Egypt, December 15–16, 2010

  • Elayeb B, Bounhas I (2016) Arabic cross-language information retrieval: a review. ACM Trans Asian Low-Resour Lang Inf Process. Article 18. http://dx.doi.org/10.1145/2789210

  • Elayeb B, Evrard F, Zaghdoud M, Ben Ahmed M (2009) Towards an intelligent possibilistic web information retrieval using multiagent system. Interact Technol Smart Educ Spec Issue: New Learn Support Syst 6(1):40–59. https://doi.org/10.1108/17415650910965191

  • Elayeb B, Bounhas I, Ben Khiroun O, Evrard F, Bellamine Ben Saoud N (2011) Towards a possibilistic information retrieval system using semantic query expansion. Int J Intell Inf Technol 7(4):1–25. https://doi.org/10.4018/jiit.2011100101

  • Elayeb B, Bounhas I, Ben Khiroun O, Evrard F, Bellamine Ben Saoud N (2015a) A comparative study between possibilistic and probabilistic approaches for monolingual word sense disambiguation. Knowl Inf Syst 44(1):91–126. https://doi.org/10.1007/s10115-014-0753-z

  • Elayeb B, Bounhas I, Ben Khiroun O, Bellamine Ben Saoud N (2015b) Combining semantic query disambiguation and expansion to improve intelligent information retrieval. In: Duval B, van den Herik J, Loiseau S, Filipe J (eds) ICAART2014 revised selected papers, LNAI 8946, pp 280–295. https://doi.org/10.1007/978-3-319-25210-0_17

  • Elayeb B, Ben Romdhane W, Bellamine Ben Saoud N (2018) Towards a new possibilistic query translation tool for cross-language information retrieval. Multim Tools Appl 77(2):2423–2465. https://doi.org/10.1007/s11042-017-4398-2

  • Elghamry K (2006) Sense and homograph disambiguation in Arabic using coordination-based semantic similarity. In: Proceedings of AUC-OXFORD conference on language and linguistics, Cairo, Egypt, March 2006

  • El-Imam YA (2004) Phonetization of Arabic: rules and algorithms. Comput Speech Lang 18(4):339–373. https://doi.org/10.1016/S0885-2308(03)00035-4

  • Elmougy S, Taher H, Noaman H (2008) Naïve Bayes classifier for Arabic word sense disambiguation. In: Proceedings of the 6th international conference on informatics and systems, pp 16–21

  • Faidi K, Ayed R, Bounhas I, Elayeb B (2014) Comparing Arabic NLP tools for hadith classification. In: Proceedings of the second international conference on Islamic applications in computer science and technologies (IMAN), Amman, Jordan

  • Farag A, Nürnberger A (2008) Arabic/English word translation disambiguation using parallel corpora and matching schemes. In: Proceedings of 12th annual conference of the European association for machine translation, pp 6–11

  • Fellbaum C (ed) (1998) WordNet: an electronic lexical database. MIT Press, Cambridge

  • Francopoulo G (2013) LMF lexical markup framework. Wiley-ISTE, Hoboken

  • Gal Y (2002) An HMM approach to vowel restoration in Arabic and Hebrew. In: Proceedings of the SIGLEX/SENSEVAL workshop on word sense disambiguation: recent successes and future directions, Philadelphia, USA. http://aclweb.org/anthology/W/W02/W02-0504.pdf

  • Gale WA, Church KW (1993) A program for aligning sentences in bilingual corpora. Comput Linguist 19(1):75–102

  • Girju R, Badulescu A, Moldovan D (2003) Learning semantic constraints for the automatic discovery of part-whole relations. In: Proceedings of the conference of the North American chapter of the association for computational linguistics on human language technology, Edmonton, Alta, Canada, pp 1–8

  • Graff D (2003) English gigaword. Tech. rep. LDC2003T05. Linguistic Data Consortium, Philadelphia, PA

  • Gruber TR (1993) Toward principles for the design of ontologies used for knowledge sharing. In: Proceedings of the international workshop on formal ontology, Padova, Italy

  • Habash N, Rambow O (2005) Arabic Tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop. In: Proceedings of the 43rd annual meeting on association for computational linguistics, Stroudsburg, PA, USA, pp 573–580

  • Habash N, Rambow O (2007) Arabic diacritization through full morphological tagging. Human language technologies: the conference of the North American chapter of the association for computational linguistics. Stroudsburg, PA, USA, pp 53–56

  • Habash N, Faraj R, Roth R (2009a) Syntactic annotation in the Columbia Arabic Treebank. In: Proceedings of MEDAR second international conference on Arabic language resources and tools, pp 125–132

  • Habash N, Roth R, Rambow O, Eskander R, Tomeh N (2013) Morphological analysis and disambiguation for dialectal Arabic. In: Proceedings of the 2013 conference of the North American Chapter of the association for computational linguistics: human language technologies (NAACL-HLT), Atlanta, GA, USA, pp 426–432. http://www.aclweb.org/anthology/N13-1044

  • Hadni M, Ouatik S, Lachkar A (2013) Hybrid part-of-speech tagger for non-vocalized Arabic text. Int J Nat Lang Comput 2(6):1–15

  • Hadni M, El Alaoui S, Lachkar A (2016) Word sense disambiguation for Arabic text categorization. Int Arab J Inf Technol 13(1):215–222

  • Hajic J (2000) Morphological tagging: data vs. dictionaries. In: Proceedings of the 1st North American chapter of the association for computational linguistics conference, Stroudsburg, PA, USA, pp 94–101

  • Harabagiu S, Miller G, Moldovan D (1999) WordNet 2—a morphologically and semantically enhanced resource. In: Proceedings of the ACL SIGLEX workshop: standardizing lexical resources, pp 1–8

  • Harman D (1986) An experimental study of factors important in document ranking. In: Proceedings of the 9th annual international ACM SIGIR conference on research and development in information retrieval, pp 186–193

  • Hoceini Y, Abbas M (2009a) Morphosyntactical disambiguation model of Arabic based on a multi-criteria approach. In: Arabnia HR, David de la Fuente and Jose AO (eds) International conference on artificial intelligence, Las Vegas, Nevada, USA, vol 2, pp 756–762

  • Hoceini Y, Abbas M (2009b) Une analyse multicritère de l’arabe. In: Journées d’étude du FSP France-Maghreb : Pratiques langagière au Maghreb : corpus et applications, Paris, France

  • Hoceini Y, Abbas M (2009c) Méthodologie Multicritère de Désambiguïsation Morphosyntaxique de la langue Arabe. In: Proceedings of the 3rd international conference on Arabic language processing (CITALA’09), Rabat, Morocco, pp 89–95

  • Hoceini Y, Cheragui MA, Abbas M (2011) Towards a new approach for disambiguation in NLP by multiple criterian decision-aid. Prague Bull Math Linguist 95:19–32. https://doi.org/10.2478/v10108-011-0002-5

  • Ide N, Suderman K (2006) Integrating linguistic resources: the American National corpus model. In: Proceedings of the 5th language resources and evaluation conference (LREC, Genoa, Italy)

  • Ide N, Veronis J (1998) Word sense disambiguation: the state of the art. Comput Linguist 24(1):1–40

  • Jalabert F, Lafourcade M (2004) Nommage de sens à l’aide des vecteurs conceptuels. RFIA’04: Reconnaissance des Formes et Intelligence Artificielle. Toulouse, France, pp 539–547

  • Jarrar M (2011) Building a Formal Arabic ontology methodology and progress. In: Proceedings of the experts meeting on Arabic ontologies and semantic networks. Alecso Arab League, April 26–28, Tunis, Tunisia

  • Jelinek F (1976) Continuous speech recognition by statistical methods. In: Proceedings of the IEEE, pp 532–556

  • Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In: Proceedings on international conference on research in computational linguistics, Taiwan, pp 1–15. http://arxiv.org/pdf/cmp-lg/9709008.pdf

  • Jurafsky D, Martin JH (2000) Speech and language processing. Prentice Hall, Upper Saddle River

  • Jurafsky D, Martin JH (2009) Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition. Pearson Prentice Hall, Upper Saddle River

  • Khemakhem A, Elleuch I, Gargouri B, Ben Hamadou A (2009) Towards an automatic conversion approach of editorial Arabic dictionaries into LMF-ISO 24613 standardized mode. In: The 2nd international conference on Arabic language resources and tools (MEDAR), April 22–23, Cairo, Egypt

  • Khoja Sh (2001) APT: Arabic part-of-speech tagger. In: Proceedings of student workshop at the second meeting of the North American association for computational linguistics, Carnegie Mellon University, Pennsylvania, USA. http://archimedes.fas.harvard.edu/mdh/arabic/NAACL.pdf

  • Kilgarriff A (1997) I don’t believe in word senses. Comput Humanit 31(2):91–113

  • Kilgarriff A (2006) Word senses. In: Agirre E, Edmonds P (eds) Word sense disambiguation: algorithms and applications. Springer, New York, pp 29–46

  • Kilgarriff A, Yallop C (2000) What’s in a thesaurus? In: Proceedings of the 2nd conference on language resources and evaluation (LREC), Athens, Greece, pp 1371–1379

  • Kirchhoff K, Vergyri D, Bilmes J, Duh K, Stolcke A (2006) Morphology-based language modeling for conversational Arabic speech recognition. Comput Speech Lang 20(4):589–608. https://doi.org/10.1016/j.csl.2005

  • Kucera H, Francis WN (1967) Computational analysis of present-day American English. Brown University Press, Providence

  • Lahbib W, Bounhas I, Elayeb B, Evrard F, Slimani Y (2013) An hybrid approach for Arabic semantic relation extraction. In: Proceedings of the 26th international FLAIRS conference, Florida, USA, pp 315–320

  • Lahbib W, Bounhas I, Elayeb B (2014) Arabic–English domain terminology extraction from aligned corpora. In: Proceedings of the 13th international conference on ontologies, DataBases, and applications of semantics (ODBASE). LNCS 8841, Springer-Verlag, Berlin, Heidelberg, pp 745–759. https://doi.org/10.1007/978-3-662-45563-0_46

  • Lahbib W, Bounhas I, Slimani Y (2015) Arabic terminology extraction and enrichment based on domain-specific text mining. In: Proceedings of the 27th IEEE international conference on tools with artificial intelligence, Vietri sul Mare, Italy, pp 340–347. https://doi.org/10.1109/ictai.2015.59

  • Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33(1):159–174

  • Leacock C, Chodorow M (1998) Combining local context and WordNet sense similarity for word sense identification. WordNet: Electron Lex Database 49(2):265–283

  • Leacock C, Towell G, Voorhees E (1993) Corpus-based statistical sense resolution. In: Proceedings of the ARPA workshop on human language technology (Princeton, NJ), pp 260–265

  • Lefever E, Hoste V (2010) SemEval-2010 task 3: cross-lingual word sense disambiguation. In: Proceedings of the 5th international workshop on semantic evaluation. The Association for Computational Linguistics, Uppsala, Sweden, pp 15–20

  • Lefever E, Hoste V (2013) SemEval-2013 task 10: cross-lingual word sense disambiguation. In: Second joint conference on lexical and computational semantics, volume 2: proceedings of the 7th international workshop on semantic evaluation. The Association for Computational Linguistics, Atlanta, Georgia, USA, pp 158–166

  • Lesk M (1986) Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In: Proceedings of the 5th annual international conference on systems documentation, New York, NY, USA, pp 24–26

  • Levenshtein VI (1966) Binary codes capable of correcting deletions, insertions, and reversals. Cybern Control Theory 70(8):707–710

  • Lin D (1998) An information-theoretic definition of similarity. In: Proceedings of 15th international conference on machine learning, pp 296–304. https://www.cs.swarthmore.edu/~richardw/cs65-f08/litreview/phyo.pdf

  • Litkowski KC (2005) Computational lexicons and dictionaries. In: Brown KR (ed) Encyclopedia of language and linguistics, 2nd edn. Elsevier, Oxford, pp 753–761

  • Luger GF (2004) Artificial intelligence: structures and strategies for complex problem-solving, 5th edn. Addison Wesley, Reading

  • Maamouri M, Bies A (2004) Developing an Arabic Treebank: methods, guidelines, procedures, and tools. In: Proceedings of the workshop on computational approaches to Arabic script-based languages. Association for Computational Linguistics, Stroudsburg, PA, USA, pp 2–9

  • Maamouri M, Bies A, Buckwalter T, Mekki W (2004) The Penn Arabic Treebank: building a large-scale annotated Arabic Corpus. In: Proceedings of NEMLAR conference on Arabic language resources and tools, Cairo, Egypt, pp 102–109

  • Maamouri M, Bies A, Kulick S (2009) Creating a methodology for large-scale correction of Treebank annotation: the case of the Arabic Treebank. In: Proceedings of MEDAR second international conference on Arabic language resources and tools, pp 138–144

  • Maamouri M, Bies A, Kulick S, Tabessi D, Krouna S (2012) Egyptian Arabic Treebank Pilot

  • Magnini B, Cavaglià G (2000) Integrating subject field codes into WordNet. In: Proceedings of the 2nd conference on language resources and evaluation (LREC, Athens, Greece), pp 1413–1418

  • Manion SL (2015) SUDOKU: treating word sense disambiguation & entity linking as a deterministic problem—via an unsupervised & iterative approach. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015), pp 365–369

  • Manning C, Schütze H (1999) Foundations of statistical natural language processing. MIT Press, Cambridge

  • McCray AT, Nelson SJ (1995) The representation of meaning in the UMLS. Meth Inf Med 34:193–201

  • Menai MEB (2014) Word sense disambiguation using evolutionary algorithms-application to Arabic language. Comput Hum Behav 41:92–103. https://doi.org/10.1016/j.chb.2014.06.021

  • Menai MEB, Alsaeedan W (2012) Genetic algorithm for Arabic word sense disambiguation. In: Proceedings of the 13th ACIS international conference on software engineering, artificial intelligence, networking and parallel/distributed computing, pp 195–200

  • Merhbene L, Zouaghi A, Zrigui M (2009) Arabic word sense disambiguation: the results. In: Proceedings of the student research workshop, RANLP 2009, Borovets, Bulgaria, pp 45–52

  • Merhbene L, Zouaghi A, Zrigui M (2010a) Arabic word sense disambiguation. Proc Int Conf Agents Artif Intell, vol 1. Valencia, Spain, pp 652–655

  • Merhbene L, Zouaghi A, Zrigui M (2010b) Ambiguous Arabic word disambiguation. In: Proceedings of the 11th ACIS international conference on software engineering, artificial intelligence, networking and parallel/distributed computing, pp 157–164

  • Merhbene L, Zouaghi A, Zrigui M (2012) Lexical disambiguation of Arabic language: an experimental study. Polibits 46:49–54

  • Merhbene L, Zouaghi A, Zrigui M (2013a) A semi-supervised method for Arabic word sense disambiguation using a weighted directed graph. In: Proceedings of the international joint conference on natural language processing, pp 1027–1031

  • Merhbene L, Zouaghi A, Zrigui M (2013b) An experimental study for some supervised lexical disambiguation methods of Arabic language. In: Proceedings of the fourth international conference on information and communication technology and accessibility, ICTA 2013, Hammamet, Tunisia, October 24–26, 2013

  • Merhbene L, Zouaghi A, Zrigui M (2014) Approche basée sur les arbres sémantiques pour la désambiguïsation lexicale de la langue arabe en utilisant une procédure de vote. In: Proceedings of the 21st conference on natural language processing (TALN 2014), Marseille, France, pp 281–290

  • Merialdo B (1994) Tagging English text with a probabilistic model. Comput Linguist 20(2):155–171

  • Miller GA, Beckwith R, Fellbaum CD, Gross D, Miller K (1990) WordNet: An online lexical database. Int J Lexicogr 3(4):235–244

  • Miller GA, Leacock C, Tengi R, Bunker RT (1993) A semantic concordance. In: Proceedings of the ARPA workshop on human language technology, pp 303–308

  • Moro A, Navigli R (2015) SemEval-2015 task 13: multilingual all-words sense disambiguation and entity linking. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015), pp 288–297

  • Navigli R (2005) Semi-automatic extension of large-scale linguistic knowledge bases. In: Proceedings of the 18th Florida Artificial Intelligence Research Society conference (FLAIRS), Clearwater Beach, FL, USA, pp 548–553

  • Navigli R (2009) Word sense disambiguation: a survey. ACM Comput Surv 41(2):1–69

  • Nelken R, Shieber S (2005) Arabic diacritization using weighted finite-state transducers. In: The ACL workshop on computational approaches to Semitic languages, Ann Arbor, Michigan, USA. https://dash.harvard.edu/bitstream/handle/1/2252610/Shieber_ArabicDiacritization.pdf?sequence=2

  • Ng HT (1997) Getting serious about word sense disambiguation. In: Proceedings of the ACL SIGLEX workshop on tagging text with lexical semantics: why, what, and how? Washington D.C., pp 1–7

  • Ng HT, Lee HB (1996) Integrating multiple knowledge sources to disambiguate word senses: an exemplar-based approach. In: Proceedings of the 34th annual meeting of the association for computational linguistics (Santa Cruz, CA), pp 40–47

  • Ng HT, Wang B, Chan YS (2003) Exploiting parallel texts for word sense disambiguation: an empirical study. In: Proceedings of the 41st annual meeting of the association for computational linguistics. The Association for Computational Linguistics, Sapporo, Japan, pp 455–462

  • Nguyen T, Vogel S (2008) Context-based Arabic morphological analysis for machine translation. In: Proceedings of the twelfth conference on computational natural language learning, association for computational linguistics, Stroudsburg, PA, USA, pp 135–142

  • Othman E, Shaalan K, Rafea A (2004) Towards resolving ambiguity in understanding Arabic sentence. In: Proceedings of the international conference on Arabic language resources and tools, NEMLAR, Cairo, Egypt, pp 118–122

  • Pantel P (2005) Inducing ontological co-occurrence vectors. In: Proceedings of the 43rd annual meeting of the association for computational linguistics (Ann Arbor, MI), pp 125–132

  • Pasha A, Al-Badrashiny M, Diab M, El Kholy A, Eskander R, Habash N, Pooleery M, Rambow O, Roth R (2014) MADAMIRA: a fast, comprehensive tool for morphological analysis and disambiguation of Arabic. In: Proceedings of the 9th international conference on language resources and evaluation (LREC’14), pp 1094–1101. http://www.lrec-conf.org/proceedings/lrec2014/pdf/593_Paper.pdf

  • Pease A, Niles I, Li J (2002) The suggested upper merged ontology: a large ontology for the semantic Web and its applications. In: Proceedings of the AAAI-2002 workshop on ontologies and the semantic web (Edmonton, Alta., Canada)

  • Pennacchiotti M, Pantel P (2006) Ontologizing semantic relations. In: Proceedings of the 44th association for computational linguistics (ACL) conference joint with the 21st conference on computational linguistics (COLING), Sydney, Australia, pp 793–800

  • Philpot A, Hovy E, Pantel P (2005) The omega ontology. In: Proceedings of the IJCNLP workshop on ontologies and lexical resources (OntoLex, Jeju Island, South Korea), pp 59–66

  • Pianta E, Bentivogli L, Girardi C (2002) MultiWordNet: developing an aligned multilingual database. In: Proceedings of the 1st international conference on global WordNet (Mysore, India), pp 21–25

  • Pinto D, Rosso P, Benajiba Y, Ahachad A, Jimenez-Salazar H (2007) Word sense induction in the Arabic language: a self-term expansion based approach. In: Proceedings of ESOLE07. http://users.dsic.upv.es/~prosso/resources/PintoEtAl_ESOLE07.pdf

  • Pradhan S, Loper E, Dligach D, Palmer M (2007) Semeval-2007 task-17: English lexical sample, SRL and all words. In: Proceedings of the 4th international workshop on semantic evaluations (SemEval2007), pp 87–92

  • Proctor P (ed) (1978) Longman dictionary of contemporary English. Longman Group, Harlow

  • Pustejovsky J (1991) The generative lexicon. Comput Linguist 17(4):409–441

  • Pustejovsky J (1995) The generative lexicon. MIT Press, Cambridge

  • Rashwan M, Albadrashiny M, Attia M, Abdou S, Rafea A (2011) A stochastic Arabic diacritizer based on a hybrid of factorized and unfactorized textual features. IEEE Trans Audio Speech Lang Process 19(1):166–175

  • Resnik P (1995) Using information content to evaluate semantic similarity in a taxonomy. In: Proceedings of the 14th international joint conference on artificial intelligence (IJCAI), Montreal, Canada, pp 448–453

  • Resnik P, Smith NA (2003) The Web as a parallel corpus. Comput Linguist 29(3):349–380

  • Resnik P, Yarowsky D (1999) Distinguishing systems and distinguishing senses: new evaluation methods for word sense disambiguation. J Nat Lang Eng 5(2):113–133

  • Roget PM (1911) Roget’s international thesaurus, 1st edn. Crowell, New York

  • Roth R, Rambow O, Habash N, Diab M, Rudin C (2008) Arabic morphological tagging, diacritization, and lemmatization using lexeme models and feature ranking. In: Proceedings of the association for computational linguistics conference (ACL), Columbus, Ohio, USA, pp 117–120

  • Said A, El-Sharqwi M, Chalabi A, Kamal E (2013) A hybrid approach for Arabic diacritization. In: Natural language processing and information systems, 18th international conference on applications of natural language to information systems (NLDB), Salford, UK, pp 53–64

  • Shaalan MA (2012) Handling unknown words in Arabic FST Morphology. In: Proceedings of the 10th international workshop on finite state methods and natural language processing, association for computational linguistics, Donostia–San Sebastian, pp 20–24

  • Shallu C, Gupta V (2013) A survey of word-sense disambiguation effective techniques and methods for Indian languages. J Emerg Technol Web Intell 5(4):354–360

  • Snyder B, Palmer M (2001) The English all-words task. In: Proceedings of SENSEVAL-2: second international workshop on evaluating word sense disambiguation systems, Toulouse, France. https://wiki.eecs.yorku.ca/course_archive/2010-11/F/6390/_media/snyder_1_.pdf

  • Soanes C, Stevenson A (eds) (2003) Oxford dictionary of English. Oxford University Press, Oxford

  • Soudani N, Bounhas I, Elayeb B, Slimani Y (2014a) Toward an Arabic ontology for Arabic word sense disambiguation based on normalized dictionaries. In: Proceedings of the 13th international conference on ontologies, DataBases, and applications of semantics (ODBASE). LNCS 8842. Springer-Verlag, Berlin, Heidelberg, pp 655–658. https://doi.org/10.1007/978-3-662-45550-0_68

  • Soudani N, Bounhas I, Elayeb B, Slimani Y (2014b) An LMF-based normalization approach of Arabic Islamic dictionaries for Arabic word sense disambiguation: application on hadith. In: Proceedings of the second international conference on Islamic applications in computer science and technologies (IMAN), Amman, Jordan

  • Soudani N, Bounhas I, Elayeb B, Slimani Y (2014c) Generic normalization approach of Arabic dictionaries for Arabic word sense disambiguation. In: Proceedings of Cinquième Journées Francophones sur les Ontologies (JFO), Hammamet, Tunisia, pp 309–315

  • Soudani N, Bounhas I, Slimani Y (2016) Semantic information retrieval: a comparative experimental study of NLP tools and language resources for Arabic. In: Proceedings of the 28th international conference on tools with artificial intelligence (ICTAI), November 06–08, San Jose, CA, USA

  • Specia L, Nunes MGV, Stevenson M (2007) Learning expressive models for word sense disambiguation. In: Proceedings of the 45th annual meeting of the association of computational linguistics. The Association of Computational Linguistics, Prague, Czech Republic, pp 41–48

  • Tlili-Guiassa Y, Merouani HF (2006) Désambiguïsation sémantique d’un texte Arabe. In: Proceedings of the 13th conference on natural language processing (TALN 2006), Leuven, Belgium, pp 41–60

  • Tufis D, Ion R, Ide N (2004) Fine-grained word sense disambiguation based on parallel corpora, word alignment, word clustering and aligned WordNets. In: Proceedings of the 20th international conference on computational linguistics. The Association for Computational Linguistics, Geneva, Switzerland, pp 1312–1318

  • Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10(5):988–999. https://doi.org/10.1109/72.788640

  • Véronis J (2003a) Sense tagging: Does it make sense? In: Rayson P, Wilson A, McEnery T et al (eds) Proceedings of the Corpus linguistics 2001 conference. Peter Lang Frankfurt, Lancaster, UK. http://sites.univ-provence.fr/veronis/pdf/2001-lancaster-sense.pdf

  • Vossen P (ed) (1998) EuroWordNet: a multilingual database with lexical semantic networks. Kluwer Academic Publishers, Norwell

  • Wilks Y, Slator B, Guthrie L (eds) (1996) Electric words: dictionaries, computers and meanings. MIT Press, Cambridge

  • Wu Z, Palmer M (1994) Verb semantics and lexical selection. In: Proceedings of the 32nd annual meeting of the association for computational linguistics, pp 133–138

  • Yaseen et al (2006) Building annotated written and spoken Arabic LR’s in NEMLAR project. In: Proceedings of the 5th international conference on language resources and evaluation, Genoa, Italy, pp 533–538

  • Yue Y (2012) A multi-classified method of support vector machine (SVM) based on entropy. Appl Mech Mater 241–244:1629–1632. https://doi.org/10.4028/www.scientific.net/AMM.241-244.1629

  • Zaghouani W (2014) Critical survey of the freely available Arabic corpora. In: Proceedings of LREC 2014, ninth international conference on language resources and evaluation, Reykjavik, Iceland, pp 1–8

  • Zitouni I (ed) (2014) Natural language processing of semitic languages. Springer-Verlag, Berlin, Heidelberg

  • Zitouni I, Sarikaya R (2009) Arabic diacritic restoration approach based on maximum entropy models. Comput Speech Lang 23(3):257–276. https://doi.org/10.1016/j.csl.2008.06.001

  • Zitouni I, Sorensen J, Sarikaya R (2006) Maximum entropy based restoration of Arabic diacritics. In: Proceedings of the 21st international conference on computational linguistics and 44th annual meeting of the association for computational linguistics (COLING-ACL), Sydney, Australia, pp 577–584. http://anthology.aclweb.org/P/P06/P06-1073.pdf

  • Zouaghi A, Merhbene L, Zrigui M (2011) Word sense disambiguation for Arabic language using the variants of the Lesk algorithm. In: Proceedings of the international conference on artificial intelligence (ICAI’11), Las Vegas, USA, pp 561–567

  • Zouaghi A, Merhbene L, Zrigui M (2012a) A hybrid approach for Arabic word sense disambiguation. Int J Comput Process Lang 24(2):133–151. https://doi.org/10.1142/S1793840612400090

  • Zouaghi A, Merhbene L, Zrigui M (2012b) Combination of information retrieval methods with LESK algorithm for Arabic word sense disambiguation. Artif Intell Rev 38(4):257–269. https://doi.org/10.1007/s10462-011-9249-3

  • Zouaghi A, Zrigui M, Antoniadis G, Merhbene L (2012c) Contribution to semantic analysis of Arabic language. Adv Artif Intell. Article ID 620461. https://doi.org/10.1155/2012/620461

Acknowledgements

We thank the anonymous reviewers for their constructive comments, which significantly improved the quality of this manuscript during the review process. The author wishes to thank Catherine Joseph, who revised the paper and improved its English.

Author information

Corresponding author

Correspondence to Bilel Elayeb.

About this article

Cite this article

Elayeb, B. Arabic word sense disambiguation: a review. Artif Intell Rev 52, 2475–2532 (2019). https://doi.org/10.1007/s10462-018-9622-6

  • DOI: https://doi.org/10.1007/s10462-018-9622-6
