Preparing for OCR of Books Handled by Visually Impaired

Crovato, César; Torok, Delfim; Heidrich, Regina; Cerqueira, Bernardo; Velho, Eduardo

doi:10.1007/978-3-319-48799-1_46


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10070)


Abstract

The objective of this work is to synthesize the difficulties an algorithm must handle when digitizing books for subsequent OCR, such as angle correction, image distortion, and word segmentation, while being operated in real time from a video stream by blind or visually impaired people without further assistance. The developed method appears reliable and provides good OCR results on a page-by-page basis. The results show improvements above 99.3% in OCR performance in some cases, although execution time increased. "The Vocalizer Project" emerged from a demand from the Brazilian Ministry of Culture and Education for application in schools and public libraries. It aims to create more inclusive smart cities and to give visually impaired and blind people access to the vast body of existing bibliographic material.
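The abstract reports OCR performance as a percentage but does not state how that figure is computed. As a minimal sketch, assuming performance is scored as character-level similarity between the OCR output and a ground-truth transcription, a before-and-after comparison for one page could look like the following. The function name `char_accuracy` and the sample strings are hypothetical illustrations, not taken from the paper.

```python
from difflib import SequenceMatcher


def char_accuracy(reference: str, ocr_output: str) -> float:
    """Similarity between ground truth and OCR text, in the range [0, 1].

    ASSUMPTION: this metric is illustrative only; the paper's own
    scoring procedure is not described in the abstract.
    """
    # ratio() returns 2 * M / T, where M is the number of matched
    # characters and T is the combined length of both strings.
    matcher = SequenceMatcher(None, reference, ocr_output, autojunk=False)
    return matcher.ratio()


# Hypothetical page text and two OCR outputs for the same page.
truth = "the quick brown fox"
raw_ocr = "tne qu1ck brovvn f0x"          # OCR on the unprocessed frame
preprocessed_ocr = "the quick brown fox"  # OCR after preprocessing

score_before = char_accuracy(truth, raw_ocr)
score_after = char_accuracy(truth, preprocessed_ocr)
print(f"before: {score_before:.3f}, after: {score_after:.3f}")
```

Scoring each page separately, as here, matches the page-by-page evaluation the abstract mentions; a word-level metric would be an equally plausible choice.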



Acknowledgements

We would like to thank FINEP and CNPq for their financial support, with a special acknowledgment to the company Pináculo.

Author information

Authors and Affiliations

Institute of Technology and Exact Sciences at Feevale University, Novo Hamburgo, Brazil
César Crovato & Delfim Torok
Feevale University, Novo Hamburgo, Brazil
Regina Heidrich
Scientific Improvement Researcher, Feevale University, Novo Hamburgo, Brazil
Bernardo Cerqueira & Eduardo Velho


Corresponding author

Correspondence to César Crovato.

Editor information

Editors and Affiliations

University of Las Palmas de Gran Canaria, Las Palmas, Spain
Carmelo R. García
Departamento de Estadistica, Universidad La Laguna, La Laguna, Spain
Pino Caballero-Gil
Florida State University, Tallahassee, Florida, USA
Mike Burmester
University of Las Palmas de Gran Canaria, Las Palmas, Spain
Alexis Quesada-Arencibia

About this paper

Cite this paper

Crovato, C., Torok, D., Heidrich, R., Cerqueira, B., Velho, E. (2016). Preparing for OCR of Books Handled by Visually Impaired. In: García, C., Caballero-Gil, P., Burmester, M., Quesada-Arencibia, A. (eds) Ubiquitous Computing and Ambient Intelligence. IWAAL 2016, AmIHEALTH 2016, UCAmI 2016. Lecture Notes in Computer Science, vol 10070. Springer, Cham. https://doi.org/10.1007/978-3-319-48799-1_46

DOI: https://doi.org/10.1007/978-3-319-48799-1_46
Published: 03 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48798-4
Online ISBN: 978-3-319-48799-1
eBook Packages: Computer Science, Computer Science (R0)
