Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 327)

Abstract

This paper proposes a robust word-level technique for identifying the script of handwritten text. A combination of shape-based and texture-based features is used to identify the script of handwritten word images written in any of five scripts, namely Bangla, Devnagari, Malayalam, Telugu and Roman. An 87-element feature set is designed for the technique, which is tested on 3000 handwritten words, with each script contributing about 600 words. Based on the identification accuracies of multiple classifiers, the Multi-Layer Perceptron (MLP) is chosen as the best classifier for the present work. With 5-fold cross-validation and an epoch size of 500, the MLP classifier achieves its best recognition accuracy of 91.79%, a strong result given the shape variations among these scripts.
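As a rough illustration of the evaluation protocol mentioned in the abstract, the following sketch splits a 3000-word, five-script dataset into 5 folds. It is not the authors' code: the exact count of 600 words per script (the abstract says "about 600"), the stratified assignment of words to folds, and the helper name `stratified_folds` are all assumptions made for the example. No features or classifier are involved; only the fold sizes are shown.

```python
import random
from collections import Counter

SCRIPTS = ["Bangla", "Devnagari", "Malayalam", "Telugu", "Roman"]
WORDS_PER_SCRIPT = 600  # assumption: the abstract says "about 600" per script
N_FOLDS = 5


def stratified_folds(labels, n_folds, seed=0):
    """Hypothetical helper: spread each label evenly over n_folds index lists."""
    rng = random.Random(seed)
    by_label = {}
    for idx, label in enumerate(labels):
        by_label.setdefault(label, []).append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_label.values():
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin so every fold gets an equal share.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# One label per word image; the images themselves are not needed for the split.
labels = [script for script in SCRIPTS for _ in range(WORDS_PER_SCRIPT)]
folds = stratified_folds(labels, N_FOLDS)

for k, test_idx in enumerate(folds):
    # In fold k, these indices form the test set; all other words are for training.
    train_size = len(labels) - len(test_idx)
    per_script = Counter(labels[i] for i in test_idx)
    print(f"fold {k}: train={train_size} test={len(test_idx)} per-script={dict(per_script)}")
```

Under these assumptions each round trains on 2400 words and tests on 600, with 120 test words per script, and the reported accuracy would be an aggregate over the five rounds.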

Author information

Correspondence to Pawan Kumar Singh.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Singh, P.K., Mondal, A., Bhowmik, S., Sarkar, R., Nasipuri, M. (2015). Word-Level Script Identification from Handwritten Multi-script Documents. In: Satapathy, S., Biswal, B., Udgata, S., Mandal, J. (eds) Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014. Advances in Intelligent Systems and Computing, vol 327. Springer, Cham. https://doi.org/10.1007/978-3-319-11933-5_62

  • DOI: https://doi.org/10.1007/978-3-319-11933-5_62

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11932-8

  • Online ISBN: 978-3-319-11933-5

  • eBook Packages: Engineering, Engineering (R0)
