
Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Networks

Published in: Multimedia Tools and Applications

Abstract

We present a method that separates handwritten and machine-printed components that are mixed and overlapped in documents. Many conventional methods address this problem by extracting connected components (CCs) and classifying each CC into one of the two classes. These methods assume that the two types of components do not overlap, whereas we focus on more challenging and realistic cases where the components frequently overlap. To this end, we propose a new method that performs pixel-level classification with a convolutional neural network. Unlike conventional neural network methods, ours works in an end-to-end manner and does not require any preprocessing steps (e.g., foreground extraction or handcrafted feature extraction). To train the network, we develop a cross-entropy-based loss function that alleviates the class imbalance problem. Regarding the training dataset, although there are some datasets of mixed printed characters and handwritten scripts, most of them contain no overlapping cases and do not provide pixel-level annotations. Hence, we also propose a data synthesis method that generates realistic pixel-level training samples with many overlaps of printed and handwritten components. Experimental results on synthetic and real images show the effectiveness of the proposed method. Although the proposed network has been trained only on synthetic images, it also improves the OCR rate of real documents: the OCR rate for machine-printed texts increases from 0.8087 to 0.9442 after our method removes the overlapping handwritten scribbles.
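The abstract states that training uses a cross-entropy-based loss adapted to the class imbalance between abundant background pixels and rare handwritten/printed pixels, but the exact formulation is not given here. A common way to realize this is to weight each pixel's cross-entropy term by the inverse frequency of its class; the sketch below is an illustrative assumption in that spirit, not the authors' exact loss, and all names and the weighting scheme are hypothetical.

```python
import numpy as np

def balanced_pixel_cross_entropy(probs, labels, num_classes=3, eps=1e-12):
    """Pixel-wise cross-entropy with inverse-frequency class weights.

    probs:  (H, W, C) softmax outputs of a segmentation network
    labels: (H, W) integer class map, e.g. 0=background, 1=printed, 2=handwritten
    Rare classes (such as handwritten strokes overlapping printed text) receive
    larger weights, so the loss is not dominated by the background class.
    """
    # Per-class pixel counts; +1 avoids division by zero for absent classes.
    counts = np.bincount(labels.ravel(), minlength=num_classes) + 1
    weights = counts.sum() / (num_classes * counts)  # inverse-frequency weights

    # Probability the network assigned to the true class of each pixel.
    p_true = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]

    per_pixel = -weights[labels] * np.log(p_true + eps)
    return per_pixel.mean()
```

With uniform weights this reduces to ordinary pixel-wise cross-entropy; the inverse-frequency factor simply rescales each pixel's term so that a mistake on a rare class costs proportionally more.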



Acknowledgements

This work was supported in part by the Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. NI190004, Development of AI-based Robot Technologies for Understanding Assembly Instruction and Automatic Assembly Task Planning), and in part by Hancom Inc.

Author information

Corresponding author

Correspondence to Nam Ik Cho.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jo, J., Koo, H.I., Soh, J.W. et al. Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Networks. Multimed Tools Appl 79, 32137–32150 (2020). https://doi.org/10.1007/s11042-020-09624-9
