Targeted Optical Character Recognition: Classification Using Capsule Network

Prajapati, Pratik; Thakkar, Shaival; Shah, Ketul

doi:10.1007/978-981-15-4018-9_42

Conference paper
First Online: 29 March 2020

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1148)

Abstract

Optical Character Recognition (OCR) is the process of converting an image or document containing text into a machine-readable format. In this paper, we target only the numeric content of tables, together with a few special characters. Many firms that deal in financial information want to parse data from scanned tables, and in some cases they do not need to process the row labels, which seldom change. Focusing only on numeric information can also give language independence to firms that handle documents written in a variety of languages: foreign-language experts read only the row labels while the OCR extracts the numeric data, which speeds up data collection. We developed a targeted OCR that saves time by processing only the important characters and that can also overcome erroneous predictions caused by under-segmentation of characters. We propose a novel approach that segments the document into blocks of text (one block per line or word) and classifies each block as numeric or non-numeric using a binary CNN. Character-level segmentation and classification using capsule networks are then applied only to the blocks that the binary CNN classifies as numeric.
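
The control flow described in the abstract can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the abstract does not say how blocks or characters are segmented, so the sketch assumes a binarized page and splits on blank rows and blank columns. The names `segment_lines`, `segment_characters`, `targeted_ocr`, `is_numeric`, and `classify_char` are hypothetical; the last two are stand-in callables for the binary CNN and the capsule network.

```python
from typing import Callable, List

# Assumption: the page is already binarized, 1 = ink, 0 = background.
Image = List[List[int]]


def segment_lines(page: Image) -> List[Image]:
    """Split a page into blocks of consecutive rows that contain ink.

    Assumption: blocks are separated by fully blank rows.
    """
    blocks: List[Image] = []
    current: Image = []
    for row in page:
        if any(row):
            current.append(row)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def segment_characters(block: Image) -> List[Image]:
    """Split a block into characters at fully blank columns (assumption)."""
    width = len(block[0])
    chars: List[Image] = []
    start = None  # first column of the character currently being scanned
    for x in range(width):
        has_ink = any(row[x] for row in block)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            chars.append([row[start:x] for row in block])
            start = None
    if start is not None:
        chars.append([row[start:width] for row in block])
    return chars


def targeted_ocr(
    page: Image,
    is_numeric: Callable[[Image], bool],   # stand-in for the binary CNN
    classify_char: Callable[[Image], str], # stand-in for the capsule network
) -> List[str]:
    """Run character-level recognition only on blocks flagged as numeric."""
    results: List[str] = []
    for block in segment_lines(page):
        if not is_numeric(block):
            continue  # non-numeric blocks are skipped entirely
        chars = segment_characters(block)
        results.append("".join(classify_char(c) for c in chars))
    return results
```

Any pair of callables with these signatures can be passed in, so the routing logic can be exercised with trivial stand-ins before trained models are available. The point of the design is that the expensive character-level stage never runs on blocks the first classifier rejects.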

Author information

Authors and Affiliations

Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT), Gandhinagar, 382007, GJ, India
Pratik Prajapati & Ketul Shah
FactSet Systems India Pvt. Ltd., Hyderabad, 500032, Telangana, India
Shaival Thakkar

Corresponding author

Correspondence to Pratik Prajapati.

Editor information

Editors and Affiliations

Malaviya National Institute of Technology, Jaipur, Rajasthan, India
Neeta Nain
Malaviya National Institute of Technology, Jaipur, Rajasthan, India
Santosh Kumar Vipparthi
Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Balasubramanian Raman

About this paper

Cite this paper

Prajapati, P., Thakkar, S., Shah, K. (2020). Targeted Optical Character Recognition: Classification Using Capsule Network. In: Nain, N., Vipparthi, S., Raman, B. (eds) Computer Vision and Image Processing. CVIP 2019. Communications in Computer and Information Science, vol 1148. Springer, Singapore. https://doi.org/10.1007/978-981-15-4018-9_42

DOI: https://doi.org/10.1007/978-981-15-4018-9_42
Published: 29 March 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-4017-2
Online ISBN: 978-981-15-4018-9
eBook Packages: Computer Science, Computer Science (R0)