Meta features-based scale invariant OCR decision making using LSTM-RNN

  • S.I. : CMKBO
Computational and Mathematical Organization Theory

Abstract

Urdu optical character recognition (OCR) is a complex problem because the script is cursive. Recognizing characters at different font sizes complicates the problem further. In this research, a long short-term memory recurrent neural network (LSTM-RNN) and a convolutional neural network (CNN) are used to recognize Urdu characters of different font sizes. The LSTM-RNN is trained on previously extracted feature sets designed for scale-invariant recognition of Urdu characters; from these features, the LSTM-RNN extracts meta features. The CNN is trained on raw binary images. Two benchmark datasets are used: Centre for Language Engineering Text Images (CLETI) and Urdu Printed Text Images (UPTI). The LSTM-RNN yields consistent results on both datasets and outperforms the CNN, achieving a maximum accuracy of 99%.

Author information

Corresponding author

Correspondence to Kashif Zafar.

About this article

Cite this article

Naseer, A., Zafar, K. Meta features-based scale invariant OCR decision making using LSTM-RNN. Comput Math Organ Theory 25, 165–183 (2019). https://doi.org/10.1007/s10588-018-9265-9

  • DOI: https://doi.org/10.1007/s10588-018-9265-9