4 March 2020 Deep optical character recognition: a case of Pashto language
Shizza Zahoor, Saeeda Naz, Naila Habib Khan, Muhammad I. Razzak
Abstract

Over the past decades, text recognition technologies have focused heavily on noncursive, isolated scripts. A text recognition system for the cursive Pashto script would be a valuable contribution, allowing traditional, cultural, and educational Pashto literature to be converted into machine-readable form. We propose the use of deep learning architectures based on transfer learning for the recognition of Pashto ligatures. For recognition analysis and evaluation, the ligature images in the dataset are preprocessed with data augmentation techniques, i.e., negatives, contours, and rotations, to increase the variation of each sample and the size of the original dataset. Rich feature representations are automatically extracted from the Pashto ligature images using the deep convolutional layers of convolutional neural network (CNN) architectures with a fine-tuning approach. Pretrained CNN architectures, namely AlexNet, GoogleNet, and VGG (VGG-16 and VGG-19), are used for classification by feeding the extracted features to a fully connected layer and a softmax layer. The proposed deep transfer learning approach achieves high recognition rates for Pashto ligatures on the benchmark FAST-NU Pashto dataset. Accuracies of 97.24%, 97.46%, and 99.03% are achieved using the AlexNet, GoogleNet, and VGGNet architectures, respectively.
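
The augmentation step named in the abstract (negatives, contours, and rotations) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 2D-list image representation, the ink threshold of 128, the crude 4-neighbour contour rule, and the 90-degree rotation steps are all assumptions made for illustration, since the abstract does not state the actual parameters.

```python
# Illustrative sketch only; the abstract does not give the augmentation code.
# Images are plain 2D lists of 8-bit grayscale values (0 = black ink, 255 = white).

def negative(img):
    """Invert intensities so dark ink on light paper becomes light on dark."""
    return [[255 - p for p in row] for row in img]

def rotate90(img):
    """Rotate the image 90 degrees clockwise (assumed angle, for simplicity)."""
    return [list(row) for row in zip(*img[::-1])]

def contour(img, threshold=128):
    """Crude outline: keep ink pixels that touch a non-ink pixel or the border."""
    h, w = len(img), len(img[0])
    ink = [[p < threshold for p in row] for row in img]
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not ink[y][x]:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            on_edge = any(
                not (0 <= ny < h and 0 <= nx < w) or not ink[ny][nx]
                for ny, nx in neighbours
            )
            if on_edge:
                out[y][x] = 0
    return out

def augment(img):
    """Return the original image plus its negative, contour, and three rotations."""
    variants = [img, negative(img), contour(img)]
    rotated = img
    for _ in range(3):
        rotated = rotate90(rotated)
        variants.append(rotated)
    return variants
```

Under these assumptions, each ligature image yields six training samples, which is one simple way to increase both per-sample variation and dataset size before fine-tuning a pretrained CNN.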

© 2020 SPIE and IS&T 1017-9909/2020/$28.00
Shizza Zahoor, Saeeda Naz, Naila Habib Khan, and Muhammad I. Razzak "Deep optical character recognition: a case of Pashto language," Journal of Electronic Imaging 29(2), 023002 (4 March 2020). https://doi.org/10.1117/1.JEI.29.2.023002
Received: 30 September 2019; Accepted: 14 February 2020; Published: 4 March 2020
CITATIONS
Cited by 7 scholarly publications.
KEYWORDS
Optical character recognition
Data modeling
RGB color model
Convolution
Feature extraction
Analytical research
Network architectures