Urdu Natural Language Processing Issues and Challenges: A Review Study

  • Conference paper
Intelligent Technologies and Applications (INTAP 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1198)

Abstract

Natural language processing is the technology that helps computers understand human natural language. Teaching a machine to understand how humans communicate, however, is not an easy task. This paper summarizes some speech recognition techniques from the literature as a starting point for new scholars. It also discusses related work and compares recognition efficiency across different natural languages. It then gives a brief overview of the Urdu language and of related work on the issues and challenges of Urdu language processing. Finally, it proposes future work toward efficient processing of the Urdu language, along with some useful techniques.

Author information

Corresponding author

Correspondence to Usman Khan.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Khan, U., Ahmad, M.B., Shafiq, F., Sarim, M. (2020). Urdu Natural Language Processing Issues and Challenges: A Review Study. In: Bajwa, I., Sibalija, T., Jawawi, D. (eds) Intelligent Technologies and Applications. INTAP 2019. Communications in Computer and Information Science, vol 1198. Springer, Singapore. https://doi.org/10.1007/978-981-15-5232-8_39

  • DOI: https://doi.org/10.1007/978-981-15-5232-8_39

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-5231-1

  • Online ISBN: 978-981-15-5232-8

  • eBook Packages: Computer Science, Computer Science (R0)
