Statistical Textural Features for Text-Line Level Handwritten Indic Script Identification

Advanced Computing and Systems for Security

Abstract

India is a multilingual country, so a variety of scripts are used to write its different languages. Identifying the script of a document is therefore essential before an appropriate Optical Character Recognition (OCR) system can be selected. Script identification remains comparatively underexplored, particularly for handwritten documents. This paper presents a robust script identification technique for 11 official handwritten Indic scripts, namely Bangla, Devanagari, Gujarati, Gurumukhi, Kannada, Malayalam, Manipuri, Oriya, Tamil, Telugu, and Urdu, along with Roman script. Identification is performed at text-line level using statistical textural features based on the Neighborhood Gray-Tone Difference Matrix and the Gray-Level Run Length Matrix. The proposed method was evaluated on a dataset of 2400 handwritten text-lines written in these scripts and yielded an identification rate of 97.69% using a Multi Layer Perceptron (MLP) classifier.
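
The two texture descriptors named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact feature definitions, so everything below is an assumption: the run-length matrix is computed in the horizontal direction only, the single run-length feature shown is the standard short run emphasis, and the single gray-tone difference feature shown is the standard coarseness measure. All function names are hypothetical, and the input is a small grid of quantized gray levels rather than a real text-line image.

```python
from collections import defaultdict


def glrlm_horizontal(image):
    """Count horizontal runs as {(gray_level, run_length): count}."""
    runs = defaultdict(int)
    for row in image:
        if not row:
            continue
        current, length = row[0], 1
        for value in row[1:]:
            if value == current:
                length += 1
            else:
                runs[(current, length)] += 1
                current, length = value, 1
        runs[(current, length)] += 1  # close the last run of the row
    return dict(runs)


def short_run_emphasis(runs):
    """Weight each run by 1 / length^2 and normalise by the number of runs."""
    total_runs = sum(runs.values())
    weighted = sum(count / (length ** 2) for (_, length), count in runs.items())
    return weighted / total_runs


def ngtdm(image, d=1):
    """Return (s, n) over interior pixels only.

    s[i] sums |i - mean of the (2d+1)^2 - 1 neighbours| for pixels of
    gray level i; n[i] counts those pixels.
    """
    height, width = len(image), len(image[0])
    neighbours = (2 * d + 1) ** 2 - 1
    s = defaultdict(float)
    n = defaultdict(int)
    for y in range(d, height - d):
        for x in range(d, width - d):
            centre = image[y][x]
            window_sum = sum(
                image[y + dy][x + dx]
                for dy in range(-d, d + 1)
                for dx in range(-d, d + 1)
            )
            mean = (window_sum - centre) / neighbours  # exclude the centre pixel
            s[centre] += abs(centre - mean)
            n[centre] += 1
    return dict(s), dict(n)


def coarseness(s, n, eps=1e-12):
    """Reciprocal of the probability-weighted sum of gray-tone differences."""
    total = sum(n.values())
    return 1.0 / (eps + sum((n[i] / total) * s[i] for i in s))


if __name__ == "__main__":
    # Toy 4x4 grid with four gray levels (an assumption, not data from the paper).
    toy = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    runs = glrlm_horizontal(toy)
    s, n = ngtdm(toy, d=1)
    print("short run emphasis:", short_run_emphasis(runs))
    print("coarseness:", coarseness(s, n))
```

In a pipeline of the kind the abstract describes, values such as these would be computed per text-line and concatenated into a feature vector for a classifier. The choice of directions, neighbourhood size, and gray-level quantization here is illustrative only.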

Author information

Corresponding author

Correspondence to Pawan Kumar Singh.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Singh, P.K., Sarkar, R., Nasipuri, M. (2017). Statistical Textural Features for Text-Line Level Handwritten Indic Script Identification. In: Chaki, R., Saeed, K., Cortesi, A., Chaki, N. (eds) Advanced Computing and Systems for Security. Advances in Intelligent Systems and Computing, vol 568. Springer, Singapore. https://doi.org/10.1007/978-981-10-3391-9_9

  • DOI: https://doi.org/10.1007/978-981-10-3391-9_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3390-2

  • Online ISBN: 978-981-10-3391-9

  • eBook Packages: Engineering, Engineering (R0)
