Script Identification Based on HSV Features

  • Conference paper
Pattern Recognition (CCPR 2016)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 663)

Abstract

Many scripts with similarly shaped characters are in use around the world today. Distinguishing such scripts is one of the harder problems in script identification and remains to be resolved. However, there are few reports on identifying the scripts of Central Asian countries and Chinese minorities, several of which are similar in shape. In this paper, a multi-script database was established, comprising 2200 plain document images at different resolutions in 11 scripts: English, Chinese, Arabic, Russian, Uyghur, Mongolian, Tibetan, Turkish, Kyrgyz, Uzbek and Tajik. HSV features were then extracted from each whole-page image and classified using a BP (back-propagation) neural network classifier. In experiments on this dataset, the system achieved an average identification rate of 88.14 % and a highest identification rate of 99.0 %. The experimental results indicate that HSV features are effective for identifying these scripts.
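
The abstract does not say which HSV features were computed, so the following is only an illustrative sketch of one common choice: the first three color moments (mean, standard deviation, and cube root of the third central moment) of each HSV channel over a whole page. The function name `hsv_color_moments` and the pixel-list input format are assumptions made for this example, not the authors' implementation.

```python
import colorsys
import math


def hsv_color_moments(pixels):
    """Return 9 features: (mean, std, skew) for the H, S and V channels.

    `pixels` is a sequence of (r, g, b) tuples with values in 0..255.
    Hypothetical helper for illustration; the paper's exact HSV
    features are not specified in the abstract.
    """
    # Convert every pixel from RGB (scaled to 0..1) to HSV (each in 0..1).
    hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
           for r, g, b in pixels]
    n = len(hsv)
    if n == 0:
        raise ValueError("pixels must not be empty")

    features = []
    for channel in range(3):
        values = [p[channel] for p in hsv]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        # Signed cube root of the third central moment.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        # Note: hue is treated as a linear value here, which ignores
        # its circular nature; this is a simplification.
        features.extend([mean, math.sqrt(variance), skew])
    return features


# Usage: a tiny two-pixel "page" with one black and one white pixel.
page = [(0, 0, 0), (255, 255, 255)]
feature_vector = hsv_color_moments(page)
```

A feature vector like this, computed once per page image, is the kind of input that could then be passed to a neural network classifier with one output per script.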

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61363064, 61563052, 61163028), College Scientific Research Plan Project of Xinjiang Uyghur Autonomous Region (No. XJEDU2013I11), and Special Training Plan Project of Xinjiang Uyghur Autonomous Region’s Minority Science and Technological Talents (No. 201323121).

Author information

Corresponding author

Correspondence to Kurban Ubul.

Copyright information

© 2016 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Mijit, B., Aysa, A., Yadikar, N., Han, Xk., Ubul, K. (2016). Script Identification Based on HSV Features. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_48

  • DOI: https://doi.org/10.1007/978-981-10-3005-5_48

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3004-8

  • Online ISBN: 978-981-10-3005-5

  • eBook Packages: Computer Science, Computer Science (R0)
