Abstract
Urdu optical character recognition (OCR) is a complex problem because the script is cursive. Recognizing characters at different font sizes complicates the problem further. In this research, a long short-term memory recurrent neural network (LSTM-RNN) and a convolutional neural network (CNN) are used to recognize Urdu characters of different font sizes. The LSTM-RNN is trained on previously extracted feature sets designed for scale-invariant recognition of Urdu characters, and from these features it extracts meta features. The CNN is trained on raw binary images. Two benchmark datasets are used: Centre for Language Engineering Text Images (CLETI) and Urdu Printed Text Images (UPTI). The LSTM-RNN yields consistent results on both datasets and outperforms the CNN, achieving a maximum accuracy of 99%.
Cite this article
Naseer, A., Zafar, K. Meta features-based scale invariant OCR decision making using LSTM-RNN. Comput Math Organ Theory 25, 165–183 (2019). https://doi.org/10.1007/s10588-018-9265-9