Preparing for OCR of Books Handled by Visually Impaired

  • Conference paper
Ubiquitous Computing and Ambient Intelligence (IWAAL 2016, AmIHEALTH 2016, UCAmI 2016)

Abstract

This work summarizes the difficulties an algorithm must handle when digitizing books for subsequent OCR, such as angle correction, image distortion, and word segmentation, while being operated in real time from a video stream by blind or visually impaired people without further assistance. The developed method appears reliable and provides good OCR results on a page-by-page basis. The results show improvements above 99.3% in OCR performance in some cases, although execution time increased. “The Vocalizer Project” arose from a demand by the Brazilian Ministry of Culture and Education for application in schools and public libraries. It aims to create more inclusive smart cities and to give visually impaired and blind people access to the vast body of existing bibliographic material.

Acknowledgements

We thank FINEP and CNPq for their financial support, and give special acknowledgment to the company Pináculo.

Author information

Corresponding author

Correspondence to César Crovato.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Crovato, C., Torok, D., Heidrich, R., Cerqueira, B., Velho, E. (2016). Preparing for OCR of Books Handled by Visually Impaired. In: García, C., Caballero-Gil, P., Burmester, M., Quesada-Arencibia, A. (eds) Ubiquitous Computing and Ambient Intelligence. IWAAL 2016, AmIHEALTH 2016, UCAmI 2016. Lecture Notes in Computer Science, vol 10070. Springer, Cham. https://doi.org/10.1007/978-3-319-48799-1_46

  • DOI: https://doi.org/10.1007/978-3-319-48799-1_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48798-4

  • Online ISBN: 978-3-319-48799-1

  • eBook Packages: Computer Science, Computer Science (R0)
