Enabling Text-Line Segmentation in Run-Length Encoded Handwritten Document Image Using Entropy-Driven Incremental Learning

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1022)

Abstract

In today’s digital era, document images are generally archived and transmitted in compressed form to save storage space and bandwidth. In CCITT Group 3 and Group 4 compression, the compressed representation is a stream of alternating white and black runs, which correspond to the background and foreground regions of the document image, respectively. In this paper, we propose a novel entropy-driven incremental learning technique that works directly on the compressed stream of runs and facilitates text-line segmentation in handwritten document images using entropy and connected component analysis. A Spatial Entropy Quantifier (SEQ) is extracted from the stream of runs over a suitable window. Incremental entropy and connected component analysis are then carried out to separate text and non-text regions, leading to automatic text-line segmentation. The proposed method is validated on a compressed dataset of handwritten document images, and its performance is reported.
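The idea of measuring entropy directly on runs, without decompressing to pixels, can be illustrated with a small sketch. The abstract does not define SEQ or the incremental learning step, so the code below substitutes a plain Shannon entropy of the run-length histogram over a sliding window of rows; the function names `run_lengths`, `window_entropy`, and `sliding_entropy`, and the window size, are illustrative assumptions and not the authors' method.

```python
import math
from collections import Counter


def run_lengths(row):
    """Encode a binary row (0 = white background, 1 = black foreground)
    as alternating run lengths, starting with a white run.
    A row that starts with black gets a leading white run of length 0."""
    runs = []
    current, count = 0, 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current, count = pixel, 1
    runs.append(count)
    return runs


def window_entropy(rows_of_runs):
    """Shannon entropy (in bits) of the run-length histogram pooled over
    a window of run-length encoded rows. This is a stand-in for SEQ,
    which the abstract names but does not define."""
    counts = Counter(r for runs in rows_of_runs for r in runs)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def sliding_entropy(image, window=2):
    """Entropy score for each window of consecutive rows (assumed window size)."""
    encoded = [run_lengths(row) for row in image]
    return [
        window_entropy(encoded[i:i + window])
        for i in range(len(encoded) - window + 1)
    ]


if __name__ == "__main__":
    blank = [0] * 8                      # an empty row: one long white run
    text = [0, 1, 1, 0, 1, 0, 0, 1]      # a row crossing ink strokes
    image = [blank, blank, text, text, blank, blank]
    for i, score in enumerate(sliding_entropy(image)):
        print(f"rows {i}-{i + 1}: entropy = {score:.3f}")
```

In this toy image, windows that cover only blank rows contain a single repeated run length and score zero, while windows touching the text-like rows score higher. A real pipeline would still need the connected component analysis mentioned in the abstract to handle touching or skewed handwritten lines.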

Author information

Corresponding author

Correspondence to R. Amarnath.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Amarnath, R., Nagabhushan, P., Javed, M. (2020). Enabling Text-Line Segmentation in Run-Length Encoded Handwritten Document Image Using Entropy-Driven Incremental Learning. In: Chaudhuri, B., Nakagawa, M., Khanna, P., Kumar, S. (eds) Proceedings of 3rd International Conference on Computer Vision and Image Processing. Advances in Intelligent Systems and Computing, vol 1022. Springer, Singapore. https://doi.org/10.1007/978-981-32-9088-4_20

  • DOI: https://doi.org/10.1007/978-981-32-9088-4_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-32-9087-7

  • Online ISBN: 978-981-32-9088-4

  • eBook Packages: Engineering, Engineering (R0)
