Urdu signboard detection and recognition using deep learning

Arafat, Syed Yasser; Ashraf, Nabeel; Iqbal, Muhammad Javed; Ahmad, Iftikhar; Khan, Suleman; Rodrigues, Joel J. P. C.

doi:10.1007/s11042-020-10175-2

1177: Advances in Deep Learning for Multimodal Fusion and Alignment
Published: 06 January 2021

Volume 81, pages 11965–11987, (2022)
Multimedia Tools and Applications

Syed Yasser Arafat¹,²,
Nabeel Ashraf²,
Muhammad Javed Iqbal¹,
Iftikhar Ahmad³ (ORCID: orcid.org/0000-0003-3719-2387),
Suleman Khan⁴ &
…
Joel J. P. C. Rodrigues⁵,⁶

Abstract

Signboard detection and recognition is an important task in automated context-aware marketing. Recently, text in many scripts, such as Latin, Japanese, and Chinese, has been detected effectively by several machine learning algorithms. Compared with these, outdoor Urdu text needs further attention in detection and recognition because of its cursive nature. Urdu detection and recognition are also difficult because of wide variation in illumination, low resolution, and inconsistent font styles, colors, and backgrounds. To address the lack of resources for detecting Urdu text in outdoor environments, we propose a new Urdu-text signboard dataset with 467 ligature categories, containing more than 30K images for recognition and 700 annotated base images for detection. We also propose a three-phase methodology. In the first phase, text regions containing Urdu ligatures are detected in shop-signboard images by a faster region-based convolutional neural network (Faster R-CNN) using pre-trained CNNs such as AlexNet and VGG16. In the second phase, the regions detected in the first phase are clustered to identify the unique ligatures in the dataset. In the third phase, all detected regions are recognized by a trained 18-layer convolutional neural network. The proposed system achieves a precision of 87% and a recall of 96% for detection using the VGG16 model. For the classification of ligatures, a recognition rate of 97.50% is achieved. Ligature recognition was also evaluated using bilingual evaluation understudy (BLEU), achieving an encouraging score of 0.96 on the newly developed Urdu-Signboard dataset.
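
To make the detection metrics concrete, the following is a minimal sketch of how precision and recall can be computed for predicted text regions against annotated boxes. The abstract does not state the matching rule used, so the intersection-over-union (IoU) threshold of 0.5, the greedy one-to-one matching, and the names `iou` and `precision_recall` are assumptions introduced here for illustration, not the authors' evaluation procedure.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    detections: List[Box],
    ground_truth: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the paper
) -> Tuple[float, float]:
    """Greedily match each detection to at most one unmatched annotation."""
    unmatched = list(range(len(ground_truth)))
    true_positives = 0
    for det in detections:
        best_idx, best_iou = None, iou_threshold
        for idx in unmatched:
            score = iou(det, ground_truth[idx])
            if score >= best_iou:
                best_idx, best_iou = idx, score
        if best_idx is not None:
            unmatched.remove(best_idx)
            true_positives += 1
    # precision = TP / all detections, recall = TP / all annotations
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


if __name__ == "__main__":
    annotated = [(0, 0, 10, 10), (20, 20, 30, 30)]
    predicted = [(0, 0, 10, 10), (21, 21, 30, 30), (50, 50, 60, 60)]
    p, r = precision_recall(predicted, annotated)
    print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy example two of the three predicted boxes match an annotation, so precision is lower than recall, the same direction as the reported 87% precision and 96% recall: a detector that finds nearly every ligature region while also producing some spurious boxes.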

Acknowledgments

The authors would like to acknowledge the Higher Education Commission (HEC) for supporting this work under NRPU Project No. 6338. This work was also supported by FCT/MCTES through national funds and, when applicable, co-funded by EU funds under Project UIDB/EEA/50008/2020; and by the Brazilian National Council for Research and Development (CNPq) via Grant No. 309335/2017-5.

Author information

Authors and Affiliations

Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan
Syed Yasser Arafat & Muhammad Javed Iqbal
Department of Computer Science and Information Technology (CS&IT), Mirpur University of Science and Technology (MUST), Mirpur, 10250, AJK, Pakistan
Syed Yasser Arafat & Nabeel Ashraf
Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Iftikhar Ahmad
Department of Computer and Information Sciences, Northumbria University, Newcastle, NE1 8ST, UK
Suleman Khan
Federal University of Piauí (UFPI), Teresina, PI, Brazil
Joel J. P. C. Rodrigues
Instituto de Telecomunicações, Covilhã, Portugal
Joel J. P. C. Rodrigues

Corresponding author

Correspondence to Syed Yasser Arafat.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Arafat, S.Y., Ashraf, N., Iqbal, M.J. et al. Urdu signboard detection and recognition using deep learning. Multimed Tools Appl 81, 11965–11987 (2022). https://doi.org/10.1007/s11042-020-10175-2

Received: 01 May 2020
Revised: 20 August 2020
Accepted: 10 November 2020
Published: 06 January 2021
Issue Date: April 2022
DOI: https://doi.org/10.1007/s11042-020-10175-2
