Abstract
A very large number of engineering documents are still born-analog, so the information they contain cannot be processed by a machine or any automatic process. To overcome this, documents of this type must go through a complete digital transformation process. In this paper, we propose to detect and recognize all textual entities present in such documents. These entities can belong to the technical details of a diagram, to a bill of materials or a functional description, or they can be simple tags written in a standardized format. The text is unstructured: it can be located anywhere on the plan and can have any size and orientation. We present a study of text detection and recognition with or without associated semantics (symbolic annotations and dictionary words). As a first step in the digital transformation of industrial plans and P&ID schemes, we propose a solution that couples a text detector based on a deep learning architecture, an open-source OCR engine for string recognition, and an OCR post-correction process based on text clustering. Evaluated on a database of 30 images of industrial maps and plans from different industries (oil, gas, water...), the approach gives very promising results: close to 84% correct detection and 82% correct recognition of tags (and lexicon-free words) after post-correction.
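To make the idea of clustering-based post-correction more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the same tag is read several times across a set of plans, groups similar OCR readings with a simple greedy threshold clustering (a stand-in chosen for brevity; the abstract does not specify the clustering algorithm), and replaces each reading with the most frequent string of its group. The function names, the 0.8 threshold, and the tag values are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a, b).ratio()


def post_correct(readings, threshold=0.8):
    """Map each OCR reading to the representative of its cluster.

    Assumption: the most frequent reading in a group of similar strings
    is the correct one. Unique readings are visited from most to least
    frequent, so the first member of each cluster is its representative.
    """
    counts = Counter(readings)
    representatives = []  # one representative string per cluster
    mapping = {}          # reading -> corrected reading
    for reading, _ in counts.most_common():
        for rep in representatives:
            if similarity(reading, rep) >= threshold:
                mapping[reading] = rep  # join an existing cluster
                break
        else:
            representatives.append(reading)  # start a new cluster
            mapping[reading] = reading
    return [mapping[r] for r in readings]


if __name__ == "__main__":
    # Hypothetical tag readings with typical O/0 confusions.
    noisy = ["PV-101", "PV-101", "PV-1O1", "PV-101",
             "FT-205", "FT-2O5", "FT-205"]
    print(post_correct(noisy))
```

A frequency-based vote like this only helps when a string is repeated; a tag that occurs once, or lexicon-free text in general, would need another source of evidence.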
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Francois, M., Eglin, V., Biou, M. (2022). Text Detection and Post-OCR Correction in Engineering Documents. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237. Springer, Cham. https://doi.org/10.1007/978-3-031-06555-2_49
DOI: https://doi.org/10.1007/978-3-031-06555-2_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06554-5
Online ISBN: 978-3-031-06555-2
eBook Packages: Computer Science, Computer Science (R0)