A Connected Component-Based Deep Learning Model for Multi-type Struck-Out Component Classification

Shivakumara, Palaiahnakote; Jain, Tanmay; Surana, Nitish; Pal, Umapada; Lu, Tong; Blumenstein, Michael; Chanda, Sukalpa

doi:10.1007/978-3-030-86159-9_11

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12917)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

Struck-out handwritten words in document images degrade the performance of methods for several important applications, including handwriting recognition; writer, gender, and fraudulent document identification; document age estimation; writer age estimation; analysis of normal/abnormal behavior of a person; and descriptive answer evaluation. This work proposes a new method that combines connected component analysis for text component detection with deep learning for classifying struck-out and non-struck-out words. For text component detection, the proposed method uses stroke width to detect the edges of text in images and then applies smoothing operations to remove noise. Morphological operations are then performed on the smoothed images, and the resulting connected components are labeled as text by fixing bounding boxes. Inspired by the success of deep learning models, we explore DenseNet for classifying struck-out and non-struck-out handwritten components, taking the text components as input. Experimental results on our dataset demonstrate that the proposed method outperforms existing methods in terms of classification rate.
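
The component-detection step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the stroke-width edge detection, the smoothing step, and the DenseNet classifier are not reproduced, and the function names `dilate` and `component_boxes`, the square structuring element, and the toy edge map are all assumptions made for illustration. The sketch only shows how a morphological dilation can merge nearby strokes so that connected-component labeling yields one bounding box per word-like component.

```python
from collections import deque


def dilate(img, radius=1):
    """Binary dilation with a (2*radius+1) square structuring element (assumed shape)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                # Switch on every pixel in the square window around (y, x).
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out


def component_boxes(img):
    """Return one (top, left, bottom, right) box per 8-connected component."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy "edge map": two strokes of one word separated by a one-pixel gap,
# plus a second word further to the right.
edges = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

raw_boxes = component_boxes(edges)                       # three separate strokes
merged_boxes = component_boxes(dilate(edges, radius=1))  # two word-level boxes
print(raw_boxes)
print(merged_boxes)
```

In this toy case the dilation closes the one-pixel gap between the first two strokes, so three raw components become two. The boxes also grow by the dilation radius, so a real pipeline would likely crop the original image with some correction for that margin before passing each component to a classifier.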

Author information

Authors and Affiliations

Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Palaiahnakote Shivakumara
Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India
Tanmay Jain, Nitish Surana & Umapada Pal
National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China
Tong Lu
University of Technology Sydney, Sydney, Australia
Michael Blumenstein
Department of Information Technology, Østfold University College, Halden, Norway
Sukalpa Chanda

Corresponding author

Correspondence to Palaiahnakote Shivakumara.

Editor information

Editors and Affiliations

Boise State University, Boise, ID, USA
Elisa H. Barney Smith
Indian Statistical Institute, Kolkata, India
Umapada Pal

About this paper

Cite this paper

Shivakumara, P. et al. (2021). A Connected Component-Based Deep Learning Model for Multi-type Struck-Out Component Classification. In: Barney Smith, E.H., Pal, U. (eds) Document Analysis and Recognition – ICDAR 2021 Workshops. ICDAR 2021. Lecture Notes in Computer Science, vol 12917. Springer, Cham. https://doi.org/10.1007/978-3-030-86159-9_11

DOI: https://doi.org/10.1007/978-3-030-86159-9_11
Published: 02 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86158-2
Online ISBN: 978-3-030-86159-9
eBook Packages: Computer Science, Computer Science (R0)
