A Framework for Building an Arabic Multi-disciplinary Ontology from Multiple Resources

Cognitive Computation

Abstract

In recent years, the Internet has become people’s main source of information, with many databases and web pages added and accessed every day. This continued growth in the amount of available information makes it difficult and frustrating to find a specific piece of information. Many techniques are therefore widely used to retrieve useful information and to mine valuable data, making it possible to discover hidden relations and patterns. Most of these techniques, however, have been applied primarily to English text rather than Arabic text. The limited availability of Arabic resources (e.g. datasets, databases, and ontologies) also makes analysing and processing Arabic text difficult. In this paper, we therefore propose a framework for building an Arabic ontology from multiple resources. We first extract and build an Arabic ontology from a publicly available directory and then enhance it with rich data from the Internet. We then use an Arabic online directory to construct a multi-disciplinary ontology that provides a conceptual, hierarchical representation of topics. Following this, we introduce an enhanced technique to enrich these ontologies with sufficient information and proper annotation for each concept. Finally, using common information retrieval evaluation techniques, we confirm the viability of the proposed approach.
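As a rough illustration of the two ideas named in the abstract (a hierarchical representation of topics taken from a directory, followed by enrichment of each concept with annotations), the following minimal Python sketch builds a topic tree from slash-separated category paths and attaches text snippets to a concept. It is not the paper's implementation: the `Concept` class, the `build_hierarchy` and `enrich` functions, the path format, and the sample categories are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One node in the topic hierarchy (hypothetical structure)."""
    name: str
    children: dict = field(default_factory=dict)     # child name -> Concept
    annotations: list = field(default_factory=list)  # enrichment snippets


def build_hierarchy(paths):
    """Build a concept tree from slash-separated directory paths."""
    root = Concept("ROOT")
    for path in paths:
        node = root
        for part in path.strip("/").split("/"):
            # Reuse the child if it already exists, otherwise create it.
            if part not in node.children:
                node.children[part] = Concept(part)
            node = node.children[part]
    return root


def enrich(root, path, snippets):
    """Attach annotation snippets to the concept found at `path`."""
    node = root
    for part in path.strip("/").split("/"):
        node = node.children[part]
    node.annotations.extend(snippets)
    return node


# Made-up category paths (assumed format, not taken from the paper).
paths = [
    "World/Arabic/علوم/فيزياء",   # sciences / physics
    "World/Arabic/علوم/كيمياء",   # sciences / chemistry
    "World/Arabic/رياضة",         # sports
]
ontology = build_hierarchy(paths)
arabic = ontology.children["World"].children["Arabic"]

# Enrichment step: the snippet here is placeholder text, standing in for
# whatever external content an enrichment source would return.
physics = enrich(ontology, "World/Arabic/علوم/فيزياء", ["placeholder snippet"])
```

Storing children in a dictionary keyed by name means that repeated path prefixes (here `World/Arabic`) collapse into a single shared branch, which is what gives the tree its hierarchical shape.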


Notes

  1. http://www.dmoz.org/World/Arabic/

  2. http://developers.google.com/web-search/


Author information

Corresponding author

Correspondence to Ahmad Hawalah.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Cite this article

Hawalah, A. A Framework for Building an Arabic Multi-disciplinary Ontology from Multiple Resources. Cogn Comput 10, 156–164 (2018). https://doi.org/10.1007/s12559-017-9460-x

  • DOI: https://doi.org/10.1007/s12559-017-9460-x
