research-article

Arabic Word Sense Disambiguation for Information Retrieval

Published: 19 January 2022

Abstract

When semantic resources are used for information retrieval, the relationships and distances between concepts are important for word sense disambiguation. In this article, we experiment with Conceptual Density and graph-based Random Walk methods to enhance the performance of an Arabic information retrieval system. The experiments were run on a medium-sized corpus. The results show that Random Walk improves retrieval performance, with mean improvements of 13%, 16%, and 12% in recall, precision, and F-score, respectively.
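To make the Random Walk idea concrete, the following is a minimal sketch of personalized PageRank over a toy concept graph. It is not the system evaluated in the article: the graph, the node names (such as `ayn#eye` and `ayn#spring` for two senses of the Arabic word ʿayn), the seed set, and the damping value are all assumptions chosen for illustration. The walk restarts at the concepts of the context words, and the candidate sense that accumulates the most probability mass is selected.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration personalized PageRank on an undirected graph.

    edges: iterable of (node, node) pairs (a hypothetical toy concept graph).
    seeds: context concepts that receive all of the restart probability;
           every seed is assumed to appear in at least one edge.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    # Restart distribution: uniform over the seed concepts, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)

    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Mass flowing in from each neighbor, split evenly over its edges.
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            new_rank[n] = (1 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank


def choose_sense(rank, candidate_senses):
    """Return the candidate sense with the highest rank."""
    return max(candidate_senses, key=lambda sense: rank[sense])


# Hypothetical toy graph: two senses of one ambiguous word and a few concepts.
toy_edges = [
    ("ayn#eye", "organ"),
    ("organ", "entity"),
    ("ayn#spring", "water"),
    ("water", "drink"),
    ("water", "entity"),
]
# Assumed context: the surrounding words map to the concepts "water" and "drink".
context_concepts = {"water", "drink"}

rank = personalized_pagerank(toy_edges, context_concepts)
best = choose_sense(rank, ["ayn#eye", "ayn#spring"])
print(best)  # selects "ayn#spring", the sense closest to the context concepts
```

In an information retrieval setting, the selected sense rather than the surface word could then be used as the index or query term, which is one way disambiguation may affect recall and precision.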



          • Published in

            ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 4
            July 2022
            464 pages
            ISSN: 2375-4699
            EISSN: 2375-4702
            DOI: 10.1145/3511099

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 19 January 2022
            • Accepted: 1 October 2021
            • Revised: 1 May 2021
            • Received: 1 January 2020


            Qualifiers

            • research-article
            • Refereed
