Ligature based Urdu Nastaleeq sentence recognition using gated bidirectional long short term memory

Cluster Computing

Abstract

Bidirectional long short term memory (BLSTM), a special case of the recurrent neural network, has been successfully applied to the recognition of Urdu Nastaleeq sentence images based on character information. In such cases, manually labeling the characters in sentences for a large dataset is labor-intensive, because identical characters take different shapes at different positions inside ligatures and words. Labeling a dataset with ligatures, on the other hand, is comparatively easier and more accurate. In this paper, we propose a novel gated BLSTM (GBLSTM) model for recognizing printed Urdu Nastaleeq text based on ligature information. The proposed model uses raw pixel values as features instead of hand-crafted features, because the latter are more error-prone. The model is trained on un-degraded images and tested on unseen, artificially degraded versions of the Urdu printed text images dataset. The proposed GBLSTM model achieves a recognition accuracy of 96.71%, which is higher than that of prevalent Urdu optical character recognition systems.
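
To make the idea of raw pixel features concrete, the following is a minimal sketch of how a grayscale text-line image might be turned into a sequence of frames for a bidirectional recurrent model. It is an illustration under stated assumptions, not the paper's implementation: the function name `line_image_to_frames`, the one-frame-per-pixel-column layout, the scaling to [0, 1], and the `right_to_left` flag are all hypothetical choices that the abstract does not specify.

```python
def line_image_to_frames(image, right_to_left=True):
    """Turn a grayscale text-line image into a sequence of raw-pixel frames.

    image: list of rows, each a list of ints in 0..255 (all rows equal length).
    Returns one frame per pixel column; each frame holds that column's
    pixel values scaled to [0, 1]. No hand-crafted features are computed.

    Assumption (not from the paper): frames are single pixel columns, and
    the default order follows the right-to-left reading direction of Urdu.
    """
    if not image or not image[0]:
        return []
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same width")
    columns = range(width - 1, -1, -1) if right_to_left else range(width)
    # Each frame is one column of the image, read top to bottom.
    return [[row[x] / 255.0 for row in image] for x in columns]


# Toy 2x3 "image": two rows, three columns.
toy = [[0, 255, 0],
       [255, 0, 51]]
frames = line_image_to_frames(toy)
print(len(frames), "frames of height", len(frames[0]))
```

A bidirectional model processes such a sequence in both directions, so the frame order mainly matters for aligning the frames with the ligature label sequence.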

Notes

  1. The UPTI (Urdu Printed Text Images) dataset was provided by faisal.shafait@uwa.edu.au and adnan@cs.uni-kl.de.

Acknowledgements

The authors would like to thank Dr. Ruifan Li, Center for Intelligence of Science and Technology (CIST), Beijing University of Posts and Telecommunications, China. We would also like to thank Mr. Mohammad Saad Khan, Beijing University of Posts and Telecommunications, China, for his guidance and exceptional help.

Author information

Corresponding author

Correspondence to Ibrar Ahmad.

About this article

Cite this article

Ahmad, I., Wang, X., Mao, Y.h. et al. Ligature based Urdu Nastaleeq sentence recognition using gated bidirectional long short term memory. Cluster Comput 21, 703–714 (2018). https://doi.org/10.1007/s10586-017-0990-5

  • DOI: https://doi.org/10.1007/s10586-017-0990-5
