Abstract
This paper addresses the word sense disambiguation (WSD) problem for the Urdu language. WSD is the task of disambiguating text so that a machine can deduce the correct sense of each given word. WSD is critical for natural language engineering (NLE) tasks such as machine translation and speech processing. It also improves the performance of other tasks such as text retrieval, document classification, and document clustering. WSD research has been conducted to varying extents for the computationally well-developed languages of the world. For Urdu, NLE research in general and WSD research in particular are still in their infancy because of the language's rich morphological structure. In this paper, we use machine learning (ML) approaches such as the Bayes net classifier (BN), support vector machine (SVM), and decision tree (DT) for WSD in native-script Urdu text. The results show that BN achieves a better F-measure than SVM and DT. The maximum F-measure of 0.711, over a raw Urdu corpus of 2.5 million words, was recorded for the Bayes net classifier.
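As a reminder of what the reported F-measure captures, the following minimal sketch computes it for a toy sense-labelling task. It assumes the standard definition (harmonic mean of precision and recall) and a macro average over senses; the abstract does not state which averaging was used, and the labels `s1` and `s2`, the function names, and the data are hypothetical illustrations, not the authors' evaluation code.

```python
def f_measure(gold, predicted, sense):
    """F-measure for one sense: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == sense and p == sense)
    fp = sum(1 for g, p in pairs if g != sense and p == sense)
    fn = sum(1 for g, p in pairs if g == sense and p != sense)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f_measure(gold, predicted):
    """Unweighted mean of per-sense F-measures (an assumed averaging scheme)."""
    senses = sorted(set(gold))
    return sum(f_measure(gold, predicted, s) for s in senses) / len(senses)


# Hypothetical gold and predicted senses for six occurrences of one ambiguous word.
gold = ["s1", "s1", "s1", "s2", "s2", "s2"]
pred = ["s1", "s1", "s2", "s2", "s2", "s1"]
score = macro_f_measure(gold, pred)
print(round(score, 3))
```

Under this definition, a classifier is rewarded only when it is both precise and complete for each sense, which is why the measure suits comparing BN, SVM, and DT on ambiguous words with uneven sense frequencies.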









Cite this article
Abid, M., Habib, A., Ashraf, J. et al. Urdu word sense disambiguation using machine learning approach. Cluster Comput 21, 515–522 (2018). https://doi.org/10.1007/s10586-017-0918-0
DOI: https://doi.org/10.1007/s10586-017-0918-0