Direct Unsupervised Text Line Extraction from Colored Historical Manuscript Images Using DCT

  • Conference paper
Image Analysis and Recognition (ICIAR 2016)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9730)

Abstract

Extracting lines of text from a manuscript is an important preprocessing step in many digital paleography applications. The extracted lines play a fundamental part in identifying the author and/or age of the manuscript. In this paper we present an unsupervised approach to text line extraction in historical manuscripts that can be applied directly to a color manuscript image. Each of the red, green and blue channels is processed separately by applying the discrete cosine transform (DCT) to it. A key advantage of this approach is that it requires no preprocessing, training or tuning steps. Extensive testing on complex Arabic handwritten manuscripts shows the effectiveness of the proposed approach.
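The per-channel step mentioned in the abstract can be illustrated with a minimal sketch. It assumes an unnormalized 2-D DCT-II computed separably over rows and columns. The abstract does not specify the DCT variant, any block size, or how the coefficients are turned into text lines, so the function names (`dct_1d`, `dct_2d`, `dct_per_channel`) and the toy image are hypothetical and this is not the authors' implementation.

```python
import math

def dct_1d(signal):
    """Unnormalized DCT-II of a 1-D sequence (assumed variant)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]

def dct_2d(channel):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in channel]
    cols = [dct_1d(col) for col in zip(*rows)]
    # cols[v][u] holds the coefficient for row frequency u, column frequency v.
    return [list(r) for r in zip(*cols)]

def dct_per_channel(image):
    """image: rows of (r, g, b) tuples. Returns one coefficient grid per channel."""
    out = {}
    for idx, name in enumerate(("red", "green", "blue")):
        channel = [[pixel[idx] for pixel in row] for row in image]
        out[name] = dct_2d(channel)
    return out

# Toy 4x4 "page" (illustrative values): dark ink rows alternate with parchment rows.
INK, PAPER = (40, 30, 20), (220, 200, 160)
toy_image = [[INK] * 4, [PAPER] * 4, [INK] * 4, [PAPER] * 4]
coeffs = dct_per_channel(toy_image)
```

In this toy image every row is uniform, so in each channel all coefficients with a nonzero horizontal frequency vanish (up to rounding) and the energy sits in the first column of the coefficient grid. That is only a property of the synthetic example; real manuscript images will spread energy across many coefficients.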

Acknowledgement

This publication was made possible by NPRP grant # NPRP7-442-1-082 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.

Author information

Corresponding author

Correspondence to Asim Baig.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Baig, A., Al-Maadeed, S., Bouridane, A., Cheriet, M. (2016). Direct Unsupervised Text Line Extraction from Colored Historical Manuscript Images Using DCT. In: Campilho, A., Karray, F. (eds) Image Analysis and Recognition. ICIAR 2016. Lecture Notes in Computer Science, vol 9730. Springer, Cham. https://doi.org/10.1007/978-3-319-41501-7_84

  • DOI: https://doi.org/10.1007/978-3-319-41501-7_84

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41500-0

  • Online ISBN: 978-3-319-41501-7

  • eBook Packages: Computer Science, Computer Science (R0)
