Targeted Optical Character Recognition: Classification Using Capsule Network

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1148)

Abstract

Optical Character Recognition (OCR) is the process of converting an image or document containing text into a machine-readable format. In this paper, we target only the numeric content of tables, together with a few special characters. Many firms that handle financial information want to parse data from scanned tables, and in some cases they do not need the row labels, which rarely change. Focusing only on numeric information can also give language independence to firms that handle documents written in a variety of languages: foreign-language experts can read the row labels while the OCR extracts the numeric data, which speeds up data collection. We developed a targeted OCR that saves time by processing only the important characters and that can also overcome erroneous predictions caused by under-segmentation of characters. We propose a novel approach that segments the document into blocks of text (one block per line or word) and classifies each block as numeric or non-numeric using a binary CNN. Character-level segmentation and classification using capsule networks are then applied only to the blocks that the binary CNN classifies as numeric.
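The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how blocks or characters are segmented, so projection profiles are assumed here as a stand-in, and `is_numeric` and `classify_char` are hypothetical callables standing in for the binary CNN and the capsule network.

```python
import numpy as np

def split_runs(profile):
    """Return (start, end) index pairs for consecutive nonzero runs in a 1-D profile."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ends at the first empty index
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def segment_lines(binary):
    """Split a binary page (1 = ink) into line blocks using row sums (assumed method)."""
    return [binary[r0:r1] for r0, r1 in split_runs(binary.sum(axis=1))]

def segment_chars(block):
    """Split one block into character crops using column sums (assumed method)."""
    return [block[:, c0:c1] for c0, c1 in split_runs(block.sum(axis=0))]

def targeted_ocr(binary, is_numeric, classify_char):
    """Run character-level work only on blocks that the block classifier accepts."""
    results = []
    for block in segment_lines(binary):
        if not is_numeric(block):          # hypothetical stand-in for the binary CNN
            continue                       # non-numeric blocks are skipped entirely
        # classify_char is a hypothetical stand-in for the capsule network
        results.append([classify_char(ch) for ch in segment_chars(block)])
    return results

# Toy page: one 2-row line with two glyphs, one 3-row line with three glyphs.
page = np.zeros((10, 12), dtype=int)
page[1:3, 1:3] = 1
page[1:3, 5:7] = 1
page[5:8, 0:2] = 1
page[5:8, 4:6] = 1
page[5:8, 8:10] = 1

# Toy rules only: "numeric" means a 3-row block; the "class" is the ink count.
output = targeted_ocr(page, lambda b: b.shape[0] == 3, lambda ch: int(ch.sum()))
print(output)
```

The point of the structure is that the expensive character-level stage runs only on accepted blocks, which is where the time saving claimed in the abstract would come from; the two lambdas would be replaced by trained models in a real system.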

Author information

Corresponding author

Correspondence to Pratik Prajapati.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Prajapati, P., Thakkar, S., Shah, K. (2020). Targeted Optical Character Recognition: Classification Using Capsule Network. In: Nain, N., Vipparthi, S., Raman, B. (eds) Computer Vision and Image Processing. CVIP 2019. Communications in Computer and Information Science, vol 1148. Springer, Singapore. https://doi.org/10.1007/978-981-15-4018-9_42

  • DOI: https://doi.org/10.1007/978-981-15-4018-9_42

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-4017-2

  • Online ISBN: 978-981-15-4018-9

  • eBook Packages: Computer Science (R0)
