Urdu word sense disambiguation using machine learning approach

Cluster Computing

Abstract

This paper addresses word sense disambiguation (WSD) for the Urdu language. WSD is the task of resolving lexical ambiguity in text so that a machine can deduce the correct sense of each given word. WSD is critical for natural language engineering (NLE) tasks such as machine translation and speech processing. It also improves the performance of other tasks such as text retrieval, document classification, and document clustering. WSD has been studied to varying extents for the world's computationally well-developed languages. For Urdu, NLE research in general and WSD research in particular are still in their infancy, owing to the rich morphological structure of the language. In this paper, we apply machine learning (ML) approaches, namely the Bayes net classifier (BN), support vector machine (SVM), and decision tree (DT), to WSD in native-script Urdu text. The results show that BN achieves a better F-measure than SVM and DT. The maximum F-measure, 0.711, was recorded for the Bayes net classifier on a raw Urdu corpus of 2.5 million words.
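
For readers unfamiliar with the F-measure reported above, the following is a minimal Python sketch of how such a score can be computed from gold and predicted sense labels. The abstract does not state which averaging scheme the study used, so the macro-averaging here, the function name `macro_f_measure`, and the toy sense labels are all assumptions made for illustration only.

```python
def macro_f_measure(gold, predicted, beta=1.0):
    """Macro-averaged F-measure over sense labels (illustrative assumption,
    not necessarily the averaging used in the paper)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    labels = sorted(set(gold) | set(predicted))
    pairs = list(zip(gold, predicted))
    scores = []
    for label in labels:
        # Count true positives, false positives, and false negatives per sense.
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)

        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0

        if precision + recall == 0.0:
            scores.append(0.0)
        else:
            b2 = beta ** 2
            # F_beta combines precision and recall; beta=1 gives the harmonic mean.
            scores.append((1 + b2) * precision * recall / (b2 * precision + recall))

    return sum(scores) / len(scores)


# Hypothetical sense labels for four occurrences of one ambiguous word.
gold = ["sense_a", "sense_a", "sense_b", "sense_b"]
predicted = ["sense_a", "sense_b", "sense_b", "sense_b"]
score = macro_f_measure(gold, predicted)
```

With `beta=1.0` each per-sense score is the harmonic mean of precision and recall, so a classifier cannot score well by favouring one at the expense of the other.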

Author information

Corresponding author

Correspondence to Muhammad Abid.

About this article

Cite this article

Abid, M., Habib, A., Ashraf, J. et al. Urdu word sense disambiguation using machine learning approach. Cluster Comput 21, 515–522 (2018). https://doi.org/10.1007/s10586-017-0918-0

  • DOI: https://doi.org/10.1007/s10586-017-0918-0
