Abstract
This work synthesizes the difficulties an algorithm must handle when digitizing books for subsequent OCR, such as angle correction, image distortion, and word segmentation, while being operated in real time from a video stream by blind or visually impaired people without further assistance. The developed method appears reliable and provides good OCR results on a page-by-page basis. The results show improvements above 99.3% in OCR performance in some cases, although execution time increased. “The Vocalizer Project” emerged from a demand from the Brazilian Ministry of Culture and Education for application in schools and public libraries. It aims to create more inclusive smart cities and to give visually impaired and blind people access to the vast body of existing bibliographic material.
Acknowledgements
We would like to thank FINEP and CNPq for their financial support, and we extend a special acknowledgment to the company Pináculo.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Crovato, C., Torok, D., Heidrich, R., Cerqueira, B., Velho, E. (2016). Preparing for OCR of Books Handled by Visually Impaired. In: García, C., Caballero-Gil, P., Burmester, M., Quesada-Arencibia, A. (eds) Ubiquitous Computing and Ambient Intelligence. IWAAL AmIHEALTH UCAmI 2016. Lecture Notes in Computer Science, vol 10070. Springer, Cham. https://doi.org/10.1007/978-3-319-48799-1_46
DOI: https://doi.org/10.1007/978-3-319-48799-1_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48798-4
Online ISBN: 978-3-319-48799-1
eBook Packages: Computer Science, Computer Science (R0)