An Arabic-Multilingual Database with a Lexicographic Search Engine

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11608)

Abstract

We present a lexicographic search engine built on top of the largest Arabic multilingual database, allowing people to search and retrieve translations, synonyms, definitions, and more. The database currently contains about 150 Arabic multilingual lexicons that we have been digitizing, restructuring, and normalizing over the past 9 years. It comprises most types of lexical resources, such as modern and classical lexicons, thesauri, glossaries, lexicographic datasets, and bilingual and trilingual dictionaries. It also includes the Arabic Ontology, an Arabic WordNet with ontologically cleaned content, which is used to reference and interlink lexical concepts. The search engine was developed with state-of-the-art design features and in accordance with the W3C’s recommendation and best practices for publishing data on the web, as well as the W3C’s Lemon RDF model. The search engine is publicly available at https://ontology.birzeit.edu.


Notes

  1. Developed by LDC, accessible at: https://catalog.ldc.upenn.edu/LDC2010L01.

  2. An open source project, accessible at: https://sourceforge.net/projects/sarf/.

  3. http://ontology.birzeit.edu/concept/293198.


Acknowledgments

The authors are thankful to Mohannad Saidi, Mohammad Dwaikat, and other students and former employees who helped us in the technical development and digitization phases. We would also like to thank John P. McCrae for helping us represent our lexical data in the W3C lemon model. We are also thankful to all lexicon owners, especially the ALECSO team, who provided us with many lexicons and supported us in the digitization process.

Author information

Correspondence to Mustafa Jarrar.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Jarrar, M., Amayreh, H. (2019). An Arabic-Multilingual Database with a Lexicographic Search Engine. In: Métais, E., Meziane, F., Vadera, S., Sugumaran, V., Saraee, M. (eds) Natural Language Processing and Information Systems. NLDB 2019. Lecture Notes in Computer Science, vol 11608. Springer, Cham. https://doi.org/10.1007/978-3-030-23281-8_19

  • DOI: https://doi.org/10.1007/978-3-030-23281-8_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23280-1

  • Online ISBN: 978-3-030-23281-8

  • eBook Packages: Computer Science, Computer Science (R0)
