Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 515)

Abstract

Most researchers worldwide focus on developing monolingual Optical Character Recognition (OCR) systems. In a multilingual country like India, however, a single document page commonly contains words written in more than one script. Performing OCR on such documents therefore requires a script identification module as a prerequisite. This paper reports a complete script recognition system for handwritten mixed-script documents. The document pages are first segmented into their constituent text lines and words. Script recognition is then performed at the word level using texture-based features. The technique is applied to 100 mixed-script document pages written in Bangla or Devanagari text mixed with English words. The encouraging outcomes are expected to motivate more researchers to work in the multilingual handwriting recognition domain.
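
The abstract does not say which texture descriptors are computed for each word image. As an illustrative stand-in only, and not the authors' method, the sketch below derives grey-level co-occurrence (GLCM) statistics from a binarized word image. The function names `glcm` and `texture_features`, the choice of two offsets, and the toy input are all assumptions made for this example.

```python
# Illustrative sketch: GLCM texture statistics for a binarized word image.
# This is an assumed stand-in for "texture-based features", not the
# feature set used in the paper.

def glcm(image, dx, dy, levels=2):
    """Return the normalised co-occurrence matrix for pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:  # image too small for this offset
        return [[0.0] * levels for _ in range(levels)]
    return [[v / total for v in row] for row in counts]


def texture_features(image, levels=2):
    """Contrast, energy and homogeneity for a horizontal and a vertical offset."""
    features = []
    for dx, dy in ((1, 0), (0, 1)):  # right neighbour, then lower neighbour
        p = glcm(image, dx, dy, levels)
        pairs = [(i, j) for i in range(levels) for j in range(levels)]
        contrast = sum(p[i][j] * (i - j) ** 2 for i, j in pairs)
        energy = sum(p[i][j] ** 2 for i, j in pairs)
        homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i, j in pairs)
        features.extend([contrast, energy, homogeneity])
    return features


if __name__ == "__main__":
    # Toy "word image": 1 = ink, 0 = background, horizontal strokes only.
    stripes = [
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
    ]
    print(texture_features(stripes))
```

In a full pipeline, a feature vector like this would be computed for every segmented word and passed to a trained classifier that outputs a script label. The classifier and the training data are outside the scope of this sketch.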

Author information

Corresponding author

Correspondence to Pawan Kumar Singh.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Singh, P.K., Das, S., Sarkar, R., Nasipuri, M. (2017). Handwritten Mixed-Script Recognition System: A Comprehensive Approach. In: Satapathy, S., Bhateja, V., Udgata, S., Pattnaik, P. (eds) Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications. Advances in Intelligent Systems and Computing, vol 515. Springer, Singapore. https://doi.org/10.1007/978-981-10-3153-3_78

  • DOI: https://doi.org/10.1007/978-981-10-3153-3_78

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3152-6

  • Online ISBN: 978-981-10-3153-3

  • eBook Packages: Engineering, Engineering (R0)
