A Connected Component-Based Deep Learning Model for Multi-type Struck-Out Component Classification

  • Conference paper
Document Analysis and Recognition – ICDAR 2021 Workshops (ICDAR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12917)

Abstract

Struck-out handwritten words in document images degrade the performance of methods for several important applications, such as handwriting recognition; writer, gender, and fraudulent-document identification; document age estimation; writer age estimation; analysis of normal/abnormal human behavior; and descriptive answer evaluation. This work proposes a new method that combines connected component analysis for text component detection with deep learning for classifying words as struck-out or non-struck-out. For text component detection, the proposed method uses stroke width to detect the edges of text in images and then applies smoothing operations to remove noise. Morphological operations are then performed on the smoothed images, and the resulting connected components are labeled as text and enclosed in bounding boxes. Inspired by the success of deep learning models, we explore DenseNet for classifying struck-out and non-struck-out handwritten components, taking the detected text components as input. Experimental results on our dataset show that the proposed method outperforms existing methods in terms of classification rate.
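
The component detection stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an already binarized image, substitutes a plain square dilation for the unspecified morphological operations, and omits the stroke-width edge detection, smoothing, and DenseNet classification steps entirely. The names `dilate`, `component_boxes`, and `page` are hypothetical and chosen only for illustration.

```python
from collections import deque


def dilate(grid, radius=1):
    """Binary dilation with a square structuring element.

    Assumption: a stand-in for the morphological operations, which the
    abstract does not specify.
    """
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if grid[y][x]:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out


def component_boxes(grid):
    """Find 8-connected foreground components by breadth-first search.

    Returns one inclusive bounding box (top, left, bottom, right) per
    component, in raster-scan order of each component's first pixel.
    """
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not grid[y][x] or seen[y][x]:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                # Visit the 8 neighbours that lie inside the image.
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if grid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy binary "page": the left word is split by a one-pixel gap.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

raw_boxes = component_boxes(page)             # three fragments
word_boxes = component_boxes(dilate(page))    # two merged, word-like boxes
```

On this toy input, dilation closes the one-pixel gap so the two left fragments merge into a single box, while the right word stays separate. In a full pipeline, each box would be cropped from the image and passed to the classifier.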

Author information

Corresponding author

Correspondence to Palaiahnakote Shivakumara.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Shivakumara, P. et al. (2021). A Connected Component-Based Deep Learning Model for Multi-type Struck-Out Component Classification. In: Barney Smith, E.H., Pal, U. (eds) Document Analysis and Recognition – ICDAR 2021 Workshops. ICDAR 2021. Lecture Notes in Computer Science, vol 12917. Springer, Cham. https://doi.org/10.1007/978-3-030-86159-9_11

  • DOI: https://doi.org/10.1007/978-3-030-86159-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86158-2

  • Online ISBN: 978-3-030-86159-9

  • eBook Packages: Computer Science, Computer Science (R0)
