Script identification algorithms: a survey

  • Trends and Surveys
International Journal of Multimedia Information Retrieval

Abstract

Script identification is a widely accepted technique for selecting the script-specific OCR (Optical Character Recognition) engine in multilingual document images. Extensive research has been done in this field, but identification accuracy remains low. This is due to faded document images and to variations in illumination and position during scanning. Noise is also a major obstacle in the script identification process: it can be reduced to a certain level but cannot be removed completely. In this paper, an attempt is made to analyze and classify various script identification schemes for document images. These schemes are also compared, and their merits and demerits are discussed on a common platform. This will help researchers understand the complexity of the issue and identify possible directions for research in this field.

Author information

Corresponding author

Correspondence to Parul Sahare.

About this article

Cite this article

Sahare, P., Dhok, S.B. Script identification algorithms: a survey. Int J Multimed Info Retr 6, 211–232 (2017). https://doi.org/10.1007/s13735-017-0130-2

  • DOI: https://doi.org/10.1007/s13735-017-0130-2
