Study on Automated Approach to Recognize Characters for Handwritten and Historical Document

Published: 12 August 2021

Abstract

Script recognition is the automatic analysis and identification of scripts. It has been studied intensively, and a significant number of papers on the problem have been published. A few issues nevertheless remain open, particularly for Indian historical manuscripts. This survey examines script recognition for multi-script documents and for historical scripts including Kurdish-Latin, Devanagari, Grantha, Arabic (including handwritten characters), Bangla, Gurumukhi, ancient Chinese, Nom characters, Greek, Nastaliq Urdu, Georgian handwritten characters, Nandinagari, and Hebrew, with a focus on the framework used for script recognition. The review concentrates on the scope of prediction, the dataset type, the methods used for data preprocessing, and the performance measures used for analysis. On the basis of this survey, current research constraints are identified and directions for future work on modeling historical manuscripts are highlighted.

References

  1. Zebardast Behnam and Isa Maleki. 2013. A new radial basis function artificial neural network based recognition for kurdish manuscript. Int. J. Appl. Evol. Computat. 4, 4 (2013), 72–87.Google ScholarGoogle ScholarCross RefCross Ref
  2. Soora Narasimha Reddy and Parag S. Deshpande. 2018. A novel local skew correction and segmentation approach for printed multilingual indian documents. Alexandria Eng. J. 57, 3 (2018), 1609–1618.Google ScholarGoogle ScholarCross RefCross Ref
  3. Varghese K. Sonu, Ajay James, and Saravanan Chandran. 2016. A novel tri-stage recognition scheme for handwritten Malayalam character recognition. Procedia Technol. 24 (2016), 1333–1340.Google ScholarGoogle ScholarCross RefCross Ref
  4. Zhangrila Louis Lady. 2018. Accuracy level of $ P algorithm for Javanese script detection on Android-based application. Procedia Comput. Sci. 135 (2018), 416–424.Google ScholarGoogle ScholarCross RefCross Ref
  5. Idicula Sumam Mary. 2012. An online character recognition system to convert Grantha script to Malayalam. Arxiv Preprint Arxiv:1208.4316 (2012).Google ScholarGoogle Scholar
  6. Khaled S. Younis. 2017. Arabic handwritten character recognition based on deep convolutional neural networks. Jordanian J. Comput. Inf. Technol. 3, 3 (2017).Google ScholarGoogle Scholar
  7. S. L. Feng and Raghavan Manmatha. 2005. Classification models for historical manuscript recognition. In Proceedings of the 8th International Conference on Document Analysis and Recognition (ICDAR’05). IEEE, 528–532.Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Chaudhari Shailesh and Ravi M. Gulati. 2016. Script identification using Gabor feature and SVM classifier. Procedia Comput. Sci. 79 (2016), 85–92.Google ScholarGoogle ScholarCross RefCross Ref
  9. Bhunia Ayan Kumar, Partha Pratim Roy, Akash Mohta, and Umapada Pal. 2018. Cross-language framework for word recognition and spotting of Indic scripts. Pattern Recog. 79 (2018), 12–31.Google ScholarGoogle ScholarCross RefCross Ref
  10. Daggumati Shruti and Peter Z. Revesz. 2018. Data mining ancient script image data using convolutional neural networks. In Proceedings of the 22nd International Database Engineering & Applications Symposium. ACM, 267–272.Google ScholarGoogle Scholar
  11. Shi Zhixin, Srirangaraj Setlur, and Venu Govindaraju. 2004. Digital enhancement of palm leaf manuscript images using normalization techniques. In Proceedings of the 5th International Conference on Knowledge-based Computer Systems. 19–22.Google ScholarGoogle Scholar
  12. Ghosh Rajib, Chirumavila Vamshi, and Prabhat Kumar. 2019. RNN-based online handwritten word recognition in Devanagari and Bengali scripts using horizontal zoning. Pattern Recog. 92 (2019), 203–218.Google ScholarGoogle ScholarCross RefCross Ref
  13. Amasyali Kadir and Nora M. El-Gohary. 2018. A review of data-driven building energy consumption prediction studies. Renew. Sustain. Energy Rev. 81 (2018), 1192–1205.Google ScholarGoogle ScholarCross RefCross Ref
  14. Bhunia Ankan Kumar, Aishik Konwer, Ayan Kumar Bhunia, Abir Bhowmick, Partha P. Roy, and Umapada Pal. 2019. Script identification in natural scene image and video frames using an attention based convolutional-LSTM network. Pattern Recog. 85 (2019), 172–184.Google ScholarGoogle ScholarCross RefCross Ref
  15. Shi Baoguang, Xiang Bai, and Cong Yao. 2016. Script identification in the wild via discriminative convolutional neural network. Pattern Recog. 52 (2016), 448–458.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Sadanand A. Kulkarni, L. Borde Prashant, R. Manza Ramesh, and L. Yannawar Pravin. 2015. Impact of zoning on Zernike moments for handwritten MODI character recognition. In Proceedings of the International Conference on Computer, Communication and Control (IC4’15). IEEE, 1–6.Google ScholarGoogle Scholar
  17. Dey Sounak, Palaiahnakote Shivakumara, K. S. Raghunandan, Umapada Pal, Tong Lu, G. Hemantha Kumar, and Chee Seng Chan. 2017. Script independent approach for multi-oriented text detection in scene image. Neurocomputing 242 (2017), 96–112.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Nguyen Cong Kha, Cuong Tuan Nguyen, and Nakagawa Masaki. 2017. Tens of thousands of nom character recognition by deep convolution neural networks. In Proceedings of the 4th International Workshop on Historical Document Imaging and Processing. ACM, 37–41.Google ScholarGoogle Scholar
  19. Zhong Guoqiang and Mohamed Cheriet. 2015. Tensor representation learning based image patch analysis for text identification and recognition. Pattern Recog. 48, 4 (2015), 1211–1224.Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Naz Saeeda, Khizar Hayat, Muhammad Imran Razzak, Muhammad Waqas Anwar, Sajjad A. Madani, and Samee U. Khan. 2014. The optical character recognition of Urdu-like cursive scripts. Pattern Recog. 47, 3 (2014), 1229–1248.Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. D. T. Mane and U. V. Kulkarni. 2018. Visualizing and understanding customized convolutional neural network for recognition of handwritten Marathi numerals. Procedia Comput. Sci. 132 (2018), 1123–1137.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Naz Saeeda, Arif I. Umar, Riaz Ahmad, Imran Siddiqi, Saad B. Ahmed, Muhammad I. Razzak, and Faisal Shafait. 2017. Urdu Nastaliq recognition using convolutional–recursive deep learning. Neurocomputing 243 (2017), 80–87.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Chanda Sukalpa, Umapada Pal, and Oriol Ramos Terrades. 2009. Word-wise Thai and Roman script identification. ACM Trans. Asian Lang. Inf. Proc. 8, 3 (2009), 11.Google ScholarGoogle Scholar
  24. Naz Saeeda, Saad Bin Ahmed, Riaz Ahmad, and Muhammad Imran Razzak. 2016. Zoning features and 2DLSTM for Urdu text-line recognition. Procedia Comput. Sci. 96 (2016), 16–22.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Qian You, Wang Xichang, Zhang Huaying, Sun Zhen, and Liu Jiang. 2013. Recognition method for handwritten digits based on improved chain code histogram feature. In Proceedings of the 3rd International Conference on Multimedia Technology (ICMT’13). Atlantis Press.Google ScholarGoogle Scholar
  26. Diem Markus and Robert Sablatnig. 2010. Recognizing characters of ancient manuscripts. In Computer Vision and Image Analysis of Art, Vol. 7531, 753106. International Society for Optics and Photonics.Google ScholarGoogle ScholarCross RefCross Ref
  27. Al-Aziz, Ahmad M. Abd, Mervat Gheith, and Ayman F. Sayed. 2011. Recognition for old Arabic manuscripts using spatial gray level dependence (SGLD). Egyptian Inf. J. 12, 1 (2011) 37–43.Google ScholarGoogle ScholarCross RefCross Ref
  28. Elleuch Mohamed, Najiba Tagougui, and Monji Kherallah. 2017. Optimization of DBN using regularization methods applied for recognizing Arabic handwritten script. Procedia Comput. Sci. 108 (2017), 2292–2297.Google ScholarGoogle ScholarCross RefCross Ref
  29. Soselia Davit, Magda Tsintsadze, Levan Shugliashvili, Irakli Koberidze, Shota Amashukeli, and Sandro Jijavadze. 2018. On Georgian handwritten character recognition. IFAC-PapersOnLine 51, 30 (2018), 161–165.Google ScholarGoogle ScholarCross RefCross Ref
  30. Shivakumara Palaiahnakote, Zehuan Yuan, Danni Zhao, Tong Lu, and Chew Lim Tan. 2015. New gradient-spatial-structural features for video script identification. Comput. Vis. Image Underst. 130 (2015), 35–53.Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Samir Benbakreti and Aoued Boukelif. 2018. New approach for online Arabic manuscript recognition by deep belief network. (2018).Google ScholarGoogle Scholar
  32. Guruprasad Prathima and Jharna Majumdar. 2016. Multimodal recognition framework: An accurate and powerful Nandinagari handwritten character recognition model. Procedia Comput. Sci. 89 (2016), 836–844.Google ScholarGoogle ScholarCross RefCross Ref
  33. V. N. Aradhya, G. Manjunath, Hemantha Kumar, and S. Noushath. 2008. Multilingual OCR system for South Indian scripts and English documents: An approach based on Fourier transform and principal component analysis. Eng. Applic. Artif. Intell. 21, 4 (2008), 658–668.Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Vijayaraghavan Prashanth and Misha Sra. 2014. Handwritten Tamil recognition using a convolutional neural network. (2014).Google ScholarGoogle Scholar
  35. Peter W. Frey and David J. Slate. 1991. Letter recognition using Holland-style adaptive classifiers. Mach. Learn. 6, 2 (1991), 161–182.Google ScholarGoogle ScholarCross RefCross Ref
  36. U. Pal and B. B. Chaudhuri. 2004. Indian script character recognition: A survey. Pattern Recog. 37, 9 (2004), 1887–1899.Google ScholarGoogle ScholarCross RefCross Ref
  37. Papaodysseus Constantin, Panayiotis Rousopoulos, Fotios Giannopoulos, Solomon Zannos, Dimitris Arabadjis, Mihalis Panagopoulos, E. Kalfa, Christopher Blackwell, and Stephen Tracy. 2014. Identifying the writer of ancient inscriptions and Byzantine codices. A novel approach. Comput. Vis. Image Underst. 121 (2014), 57–73.Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. U. Pal and B. B. Chaudhuri. 2002. Identification of different script lines from multi-script documents. Image Vis. Comput. 20, 13--14 (2002), 945–954.Google ScholarGoogle ScholarCross RefCross Ref
  39. P. Rajan and S. Sridhar. 2017. Identification of ancient Tamil letters and its characters: Automatic date fixation based on contour-let technique. In Proceedings of the International Conference on Graphics and Signal Processing. ACM, 40–43.Google ScholarGoogle Scholar
  40. Thomas M. Breuel. 2008. The OCRopus open source OCR system. In Document Recognition and Retrieval XV, Vol. 6815, 68150F. International Society for Optics and Photonics, 2008.Google ScholarGoogle ScholarCross RefCross Ref
  41. Roy Partha Pratim, Ayan Kumar Bhunia, Ayan Das, Prasenjit Dey, and Umapada Pal. 2016. HMM-based Indic handwritten word recognition using zone segmentation. Pattern Recog. 60 (2016), 1057–1075.Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Valdenegro-Toro Matias, Paul Plöger, Stefan Eickeler, and Iuliu Konya. 2016. Histograms of stroke widths for multi-script text detection and verification in road scenes. IFAC-PapersOnLine 49, 15 (2016), 100–107.Google ScholarGoogle ScholarCross RefCross Ref
  43. Pal Umapada, Ramachandran Jayadevan, and Nabin Sharma. 2012. Handwriting recognition in Indian regional scripts: A survey of offline techniques. ACM Trans. Asian Lang. Inf. Proc. 11, 1 (2012), 1.Google ScholarGoogle Scholar
  44. Raj V. Amrutha, R. L. Jyothi, and A. Anilkumar. 2017. Grantha script recognition from ancient palm leaves using histogram of orientation shape context. In Proceedings of the International Conference on Computing Methodologies and Communication (ICCMC’17). IEEE, 790–794.Google ScholarGoogle Scholar
  45. Denis G. Pelli, Catherine W. Burns, Bart Farell, and Deborah C. Moore-Page. 2016. Feature detection and letter identification. Vis. Res. 46, 28 (2006), 4646–4674.Google ScholarGoogle ScholarCross RefCross Ref
  46. Rahman Md Mahbubar, M. A. H. Akhand, Shahidul Islam, Pintu Chandra Shill, and M. H. Rahman. 2015. Bangla handwritten character recognition using convolutional neural network. Int. J. Image, Graph. Sig. Proc. 7, 8 (2015), 42–49.Google ScholarGoogle Scholar
  47. Van Phan Truyen, Bilan Zhu, and Masaki Nakagawa. 2012. Collecting handwritten nom character patterns from historical document pages. In Proceedings of the 10th IAPR International Workshop on Document Analysis Systems. IEEE, 344–348.Google ScholarGoogle Scholar
  48. Pan Xingyu and Laure Tougne. 2017. A new database of digits extracted from coins with hard-to-segment foreground for optical character recognition evaluation. Front. ICT 4 (2017), 9.Google ScholarGoogle ScholarCross RefCross Ref
  49. Sarkhel Ritesh, Nibaran Das, Aritra Das, Mahantapas Kundu, and Mita Nasipuri. 2017. A multi-scale deep quad tree based feature extraction method for the recognition of isolated handwritten characters of popular Indic scripts. Pattern Recog. 71 (2017), 78–93.Google ScholarGoogle ScholarCross RefCross Ref
  50. Naz Saeeda, Arif I. Umar, Riaz Ahmad, Saad B. Ahmed, Syed H. Shirazi, Imran Siddiqi, and Muhammad I. Razzak. 2016. Offline cursive Urdu-Nastaliq script recognition using multidimensional recurrent neural networks. Neurocomputing 177 (2016), 228–241.Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Lakshmi T. R. Vijaya, Panyam Narahari Sastry, and T. V. Rajinikanth. 2017. A novel 3D approach to recognize Telugu palm leaf text. Eng. Sci. Technol. Int. J. 20, 1 (2017), 143–150.Google ScholarGoogle Scholar
  52. Sarma Kandarpa Kumar. 2009. Bi-lingual handwritten character and numeral recognition using multi-dimensional recurrent neural networks (MDRNN). Int. J. Elect. Electron. Eng. 3, 7 (2009).Google ScholarGoogle Scholar
  53. Sarkar Ram, Nibaran Das, Subhadip Basu, Mahantapas Kundu, Mita Nasipuri, and Dipak Kumar Basu. 2010. Word level script identification from Bangla and Devanagri handwritten texts mixed with Roman script. Arxiv Preprint Arxiv:1002.4007 (2010).Google ScholarGoogle Scholar
  54. G. G. Rajput and H. B. Anita. 2010. Handwritten script recognition using DCT and wavelet features at block level. IJCA, Spec. Iss. RTIPPR 3 (2010), 158–163.Google ScholarGoogle Scholar
  55. Zhang Jianshu, Jun Du, Shiliang Zhang, Dan Liu, Yulong Hu, Jinshui Hu, Si Wei, and Lirong Dai. 2017. Watch, attend and parse: An end-to-end neural network based approach to handwritten mathematical expression recognition. Pattern Recog. 71 (2017), 196–206.Google ScholarGoogle ScholarCross RefCross Ref
  56. U. Bhattacharya, S. K. Parui, B. Shaw, and K. Bhattacharya. 2006. Neural combination of ANN and HMM for handwritten Devanagari numeral recognition. 2006.Google ScholarGoogle Scholar
  57. Basu Subhadip, Nibaran Das, Ram Sarkar, Mahantapas Kundu, Mita Nasipuri, and Dipak Kumar Basu. 2009. A hierarchical approach to recognition of handwritten Bangla characters. Pattern Recog. 42, 7 (2009), 1467–1484.Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. Bhowmik Tapan Kumar, Ujjwal Bhattacharya, and Swapan K. Parui. 2004. Recognition of Bangla handwritten characters using an MLP classifier based on stroke features. In Proceedings of the International Conference on Neural Information Processing. Springer, Berlin, 814–819.Google ScholarGoogle Scholar
  59. P. Chinnuswamy and Suban G. Krishnamoorthy. 1980. Recognition of handprinted Tamil characters. Pattern Recog. 12, 3 (1980), 141–152.Google ScholarGoogle ScholarCross RefCross Ref
  60. Garg Naveen and Sandeep Kaur. 2011. Improvement in efficiency of recognition of handwritten gurumukhi script 1. (2011).Google ScholarGoogle Scholar
  61. P. B. Khanale and S. D. Chitnis. 2011. Handwritten Devanagari character recognition using artificial neural network. J. Artif. Intell. 4, 1 (2011), 55–62.Google ScholarGoogle ScholarCross RefCross Ref
  62. Sural Shamik and P. K. Das. 1999. An MLP using Hough transform based fuzzy feature extraction for Bengali script recognition. Pattern Recog. Lett. 20, 8 (1999), 771–782.Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. Tiji M. Jose and Amitabh Wahi. 2013. Recognition of Tamil handwritten characters using Daubechies wavelet transforms and feed-forward backpropagation network. Int. J. Comput. Applic. 64, 8 (2013).Google ScholarGoogle Scholar
  64. H. Y. Abdelazim and M. A. Hashish. 1989. Automatic recognition of handwritten Hindi numerals. COMPEURO 89 Proceedings VLSI and Computer Peripherals. IEEE, 287–298.Google ScholarGoogle Scholar
  65. Al-Badr Badr and Sabri A. Mahmoud. 1995. Survey and bibliography of Arabic optical text recognition. Sig. Proc. 41, 1 (1995), 49–77.Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. Mohammad S. Khorsheed and William F. Clocksin. 2000. Multi-font Arabic word recognition using spectral features. In Proceedings of the 15th International Conference on Pattern Recognition. (ICPR’00). IEEE, 543–546.Google ScholarGoogle Scholar
  67. Al-Badr Badr and Robert M. Haralick. 1998. A segmentation-free approach to text recognition with application to Arabic text. Int. J. Doc. Anal. Recog. 1, 3 (1998), 147–166.Google ScholarGoogle ScholarCross RefCross Ref
  68. Kannan R. Jagadeesh and S. Subramanian. 2015. An adaptive approach of Tamil character recognition using deep learning with big data—A survey. In Proceedings of the 49th Convention of the Computer Society of India (CSI’15). Springer, Cham, 557–567.Google ScholarGoogle Scholar
  69. Ahmed M. Zeki. 2005. The segmentation problem in Arabic character recognition—The state of the art. In Proceedings of the International Conference on Information and Communication Technologies. IEEE, 11–26.Google ScholarGoogle ScholarCross RefCross Ref
  70. Plamondon Réjean and Sargur N. Srihari. 2000. Online and off-line handwriting recognition: A comprehensive survey. IEEE Trans. Pattern Anal. Mach. Intell. 22, 1 (2000), 63–84.Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. El-Mahallawy and Mohamed Saad Mostafa. 2008. A large scale HMM-based omni front-written OCR system for cursive scripts. Cairo University, Faculty of Engineering (2008).Google ScholarGoogle Scholar
  72. B. A. Srinivas, A. Agarwal, and C. R. Rao. 2008. An overview of OCR research in Indian scripts. 2, 2 (2008).Google ScholarGoogle Scholar
  73. Rani Simpel and Gurpreet Singh Lehal. 2016. Recognition based classification of Gurmukhi manuscripts. In Proceedings of the Symposium on Colossal Data Analysis and Networking (CDAN’16). IEEE, 1–5.Google ScholarGoogle Scholar
  74. Cheriet Mohamed, Nawwaf Kharma, Cheng-Lin Liu, and Ching Suen. 2007. Character Recognition Systems: A Guide for Students and Practitioners. John Wiley & Sons, 2007.Google ScholarGoogle Scholar
  75. Agrawal Mudit, Ajay S. Bhaskarabhatla, and Sriganesh Madhvanath. 2004. Data collection for handwriting corpus creation in Indic scripts. In Proceedings of the International Conference on Speech and Language Technology and Oriental (ICSLT-COCOSDA’04).Google ScholarGoogle Scholar
  76. Trier Øivind Due, Anil K. Jain, and Torfinn Taxt. 1996. Feature extraction methods for character recognition—A survey. Pattern Recog. 29, 4 (1996), 641–662.Google ScholarGoogle ScholarCross RefCross Ref
  77. Kesiman Made Windu Antara, Sophea Prum, Jean-Christophe Burie, and Jean-Marc Ogier. 2016. Study on feature extraction methods for character recognition of Balinese script on palm leaf manuscript images. In Proceedings of the 23rd International Conference on Pattern Recognition (ICPR’16). IEEE, 4017–4022.Google ScholarGoogle Scholar
  78. Liana M. Lorigo and Venugopal Govindaraju. 2006. Offline Arabic handwriting recognition: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 28, 5 (2006), 712–724.Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. Robert M. Haralick, Karthikeyan Shanmugam, and Its'Hak Dinstein. 1973. Textural features for image classification. IEEE Trans. Syst., Man, Cybern. 6 (1973), 610–621.Google ScholarGoogle ScholarCross RefCross Ref
  80. Aggarwal Ashutosh, Karamjeet Singh, and Kamalpreet Singh. 2015 Use of gradient technique for extracting features from handwritten Gurmukhi characters and numerals. Procedia Comput. Sci. 46 (2015), 1716–1723.Google ScholarGoogle ScholarCross RefCross Ref
  81. Katsouros Vassilis, Vassilis Papavassiliou, Fotini Simistira, and Basilis Gatos. Recognition of Greek polytonic on historical degraded texts using HMMs. In Proceedings of the 12th IAPR Workshop on Document Analysis Systems (DAS’16). IEEE, 346–351.Google ScholarGoogle Scholar
  82. Kumar Satish. 2011. Study of features for hand-printed recognition. Int. J. Comput. Electr. Autom. Control Inf. Eng. 5 (2011).Google ScholarGoogle Scholar
  83. Neha J. Pithadia and Dr. Vishal D. Nimavat. 2015. A review on feature extraction techniques for optical character recognition. Int. J. Innov. Res. Comput. Commun. Eng. 3 (2015).Google ScholarGoogle Scholar
  84. Echi Afef Kacem and Abdel Belaïd. 2017. Impact of features and classifiers combinations on the performances of Arabic recognition systems. In Proceedings of the 1st International Workshop on Arabic Script Analysis and Recognition (ASAR’17). IEEE, 85–89.Google ScholarGoogle Scholar
  85. A. S. Kavitha, P. Shivakumara, and G. Hemantha Kumar. 2013. Skewness and nearest neighbour based approach for historical document classification. In Proceedings of the International Conference on Communication Systems and Network Technologies. IEEE, 602–606.Google ScholarGoogle Scholar
  86. Rehman Amjad and Tanzila Saba. 2012. Off-line cursive script recognition: Current advances, comparisons and remaining problems. Artif. Intell. Rev. 37, 4 (2012), 261–288.Google ScholarGoogle ScholarDigital LibraryDigital Library
  87. Pan Chen, Dong Sun Park, Sook Yoon, and Ju Cheng Yang. 2012. Leukocyte image segmentation using simulated visual attention. Exp. Systems with Applic. 39, 8 (2012), 7479–7494.Google ScholarGoogle ScholarDigital LibraryDigital Library
  88. Xu Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, and Yoshua Bengio. 2015. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the International Conference on Machine Learning. 2048–2057.Google ScholarGoogle Scholar
  89. You Quanzeng, Hailin Jin, Zhaowen Wang, Chen Fang, and Jiebo Luo. 2016. Image captioning with semantic attention. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4651–4659.Google ScholarGoogle Scholar
  90. Bahdanau Dzmitry, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. Arxiv Preprint Arxiv:1409.0473 (2014).Google ScholarGoogle Scholar
  91. Al-Badr Badr and Robert M. Haralick. 1995. Segmentation-free word recognition with application to Arabic. In Proceedings of the 3rd International Conference on Document Analysis and Recognition. IEEE, 355–359.Google ScholarGoogle Scholar
  92. Jumari Kasmiran and Mohamed A. Ali. 2002. A survey and comparative evaluation of selected off-line Arabic handwritten character recognition systems. Jurn. Teknol. 36, 1 (2002), 1–18.Google ScholarGoogle Scholar
  93. Pavlidis Theo. 1993. Recognition of printed text under realistic conditions. Pattern Recog. Lett. 14, 4 (1993), 317–326.Google ScholarGoogle ScholarDigital LibraryDigital Library
  94. Cho Wongyu, Seong-Whan Lee, and Jin H. Kim. 1995. Modeling and recognition of cursive words with hidden Markov models. Pattern Recog. 28, 12 (1995), 1941–1953.Google ScholarGoogle ScholarDigital LibraryDigital Library
  95. Drira Fadoua and Franck Lebourgeois. 2012. Denoising textual images using local/non-local smoothing filters: A comparative study. In Proceedings of the International Conference on Frontiers in Handwriting Recognition. IEEE, 521–526.Google ScholarGoogle Scholar
  96. Atallah M. Al-Shatnawi and Khairuddin Omar. 2009. Skew detection and correction technique for Arabic document images based on centre of gravity. J. Comput. Sci. 5, 5 (2009), 363.Google ScholarGoogle ScholarCross RefCross Ref
  97. Abuhaiba S. I. Ibrahim. 2003. Skew correction of textural documents. J. King Saud Univ.-Comput. Inf. Sci. 15 (2003), 73–93.Google ScholarGoogle Scholar
  98. Khairuddin bin Omar, Ramlan bin Mahmoud, Md Nasir bin Sulaiman, and Abd Rahman bin Ramli. 2000. The removal of secondaries of Jawi characters. In 2000 TENCON Proceedings: Intelligent Systems and Technologies for the New Millennium (Cat. No. 00CH37119). IEEE, 49–152.Google ScholarGoogle ScholarCross RefCross Ref
  99. Arica Nafiz and Fatos T. Yarman-Vural. 2002. Optical character recognition for cursive handwriting. IEEE Trans. Patt. Anal. Mach. Intell. 24, 6 (2002), 801–813.Google ScholarGoogle ScholarDigital LibraryDigital Library
  100. Shafait Faisal, Daniel Keysers, and Thomas M. Breuel. 2006. Layout analysis of Urdu document images. In Proceedings of the IEEE International Multitopic Conference. IEEE, 293–298.Google ScholarGoogle Scholar
  101. Jim R. Parker. 2010. Algorithms for Image Processing and Computer Vision. John Wiley & Sons, 2010.Google ScholarGoogle ScholarDigital LibraryDigital Library
  102. Richard G. Casey and Eric Lecolinet. 1996. A survey of methods and strategies in character segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 18, 7 (1996), 690–706.Google ScholarGoogle ScholarDigital LibraryDigital Library
  103. Lu Yi and Malayappan Shridhar. 1996. Character segmentation in handwritten words—An overview. Pattern Recog. 29, 1 (1996), 77–96.Google ScholarGoogle ScholarCross RefCross Ref
  104. Kudo Mineichi and Jack Sklansky. 2000. Comparison of algorithms that select features for pattern classifiers. Pattern Recog. 33, 1 (2000), 25–41.Google ScholarGoogle ScholarCross RefCross Ref
  105. Mark A. Hall and Lloyd A. Smith. 1997. Feature subset selection: A correlation based filter approach. (1997), 855–858.Google ScholarGoogle Scholar
  106. Liu Huan and Rudy Setiono. 1996. A probabilistic approach to feature selection—A filter solution. In Proceedings of the International Conference on Machine Learning. 319–327.Google ScholarGoogle Scholar
  107. Yu Lei and Huan Liu. 2003. Feature selection for high-dimensional data: A fast correlation-based filter solution. In Proceedings of the 20th International Conference on Machine Learning (ICML’03). 856–863.Google ScholarGoogle Scholar
  108. David W. Aha and Richard L. Bankert. 1994. Feature selection for case-based classification of cloud types: An empirical comparison. In Proceedings of the AAAI-94 Workshop on Case-Based Reasoning.Google ScholarGoogle Scholar
  109. George H. John, Ron Kohavi, and Karl Pfleger. 1994. Irrelevant features and the subset selection problem. In Machine Learning Proceedings 1994. Morgan Kaufmann, 121–129.Google ScholarGoogle ScholarDigital LibraryDigital Library
  110. Pierre A. Devijver and Josef Kittler. 1982. Pattern Recognition: A Statistical Approach. Prentice Hall.Google ScholarGoogle Scholar
  111. Ben-Bassat Moshe. 1982. Use of distance measures, information measures and error bounds in feature evaluation. In Handbook of Statistics, Vol. 2. Elsevier, 773–791.Google ScholarGoogle Scholar
  112. Dash Manoranjan and Huan Liu. 1997. Feature selection for classification. Intell. Data Anal. 1, 1--4 (1997), 131–156.Google ScholarGoogle Scholar
  113. B. B. Chaudhuri and U. Pal. 1997. An OCR system to read two Indian language scripts: Bangla and Devnagari (Hindi). In Proceedings of the 4th International Conference on Document Analysis and Recognition. IEEE, 1011–1015.Google ScholarGoogle Scholar
  114. Avrim L. Blum and Pat Langley. 1997. Selection of relevant features and examples in machine learning. Artif. Intell. 97, 1--2 (1997), 245–271.Google ScholarGoogle ScholarDigital LibraryDigital Library
  115. Richard E. Bellman. 2015. Adaptive Control Processes: A Guided Tour. Princeton University Press.Google ScholarGoogle Scholar
  116. D. E. Rumelhart and J. L. McClelland. 1986. Learning internal representations by error propagation. In Parallel Distributed Processing. The MIT Press.Google ScholarGoogle Scholar
  117. Vladimir N. Vapnik. 1995. The nature of statistical learning. Theory (1995).Google ScholarGoogle ScholarDigital LibraryDigital Library
  118. Christopher J. C. Burges. 1998. A tutorial on support vector machines for pattern recognition. Data Mining Knowl. Discov. 2, 2 (1998), 121–167.Google ScholarGoogle ScholarDigital LibraryDigital Library
  119. Chapelle Olivier, Vladimir Vapnik, Olivier Bousquet, and Sayan Mukherjee. 2002. Choosing multiple parameters for support vector machines. Mach. Learn. 46, 1--3 (2002), 131–159.Google ScholarGoogle Scholar
  120. Cristianini Nello and John Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000.Google ScholarGoogle Scholar
  121. Y. Fataicha, J. Y. Nie Mohamed Cheriet, and Ching Y. Suen. 2006. Retrieving poorly degraded OCR documents. Int. J. Doc. Anal. Recog. 8, 1 (2006), 15.Google ScholarGoogle ScholarDigital LibraryDigital Library
  122. Natarajan Prem, Shirin Saleem, Rohit Prasad, Ehry MacRostie, and Krishna Subramanian. 2006. Multi-lingual offline handwriting recognition using hidden Markov models: A script-independent approach. In Proceedings of the Summit on Arabic and Chinese Handwriting Recognition. Springer, Berlin, 231–250.Google ScholarGoogle Scholar
  123. Bassil Youssef and Mohammad Alwani. 2012. OCR post-processing error correction algorithm using Google online spelling suggestion. Arxiv Preprint Arxiv:1204.0191 (2012).Google ScholarGoogle Scholar
  124. Ching Y. Suen, Marc Berthod, and Shunji Mori. 1980. Automatic recognition of handprinted characters—the state of the art. Proc. IEEE 68, 4 (1980), 469–487.Google ScholarGoogle ScholarCross RefCross Ref
  125. J. Mantas. 1986. An overview of character recognition methodologies. Pattern Recog. 19, 6 (1986), 425–430.Google ScholarGoogle ScholarCross RefCross Ref
  126. V. K. Govindan and A. P. Shivaprasad. 1990. Character recognition—A review. Pattern Recog. 23, 7 (1990), 671–683.Google ScholarGoogle ScholarDigital LibraryDigital Library
  127. Mori Shunji, Ching Y. Suen, and Kazuhiko Yamamoto. 1992. Historical review of OCR research and development. Proc. IEEE 80, 7 (1992), 1029–1058.Google ScholarGoogle ScholarCross RefCross Ref
  128. Bunke Horst and Patrick Shen-pei Wang. 1997. Handbook of Character Recognition and Document Image Analysis. World Scientific.Google ScholarGoogle Scholar
  129. Nagy George. 2000. Twenty years of document image analysis in PAMI. IEEE Trans. Pattern Anal. Mach. Intell. 1 (2000), 38–62.Google ScholarGoogle Scholar
  130. Ubul Kurban, Gulzira Tursun, Alimjan Aysa, Donato Impedovo, Giuseppe Pirlo, and Tuergen Yibulayin. 2017. Script identification of multi-script documents: A survey. IEEE Access 5 (2017), 6546–6559.Google ScholarGoogle Scholar
  131. Peng Liangrui, Changsong Liu, Xiaoqing Ding, and Hua Wang. 2006. Multilingual document recognition research and its application in China. In Proceedings of the 2nd International Conference on Document Image Analysis for Libraries (DIAL’06). IEEE.Google ScholarGoogle ScholarDigital LibraryDigital Library
  132. Nakanishi. 1980. Akira. Writing Systems of the World: Alphabets, Syllabaries, Pictograms. Tuttle Publishing.Google ScholarGoogle Scholar
  133. Silva Cláudia. 2011. Writing in Portuguese chats: A new writing system? Writ. Lang. Lite. 14, 1 (2011), 143–156.
  134. Sk Md Obaidullah, Chayan Halder, K. C. Santosh, Nibaran Das, and Kaushik Roy. 2018. PHDIndic_11: Page-level handwritten document image dataset of 11 official Indic scripts for script identification. Multimedia Tools Applic. 77, 2 (2018), 1643–1678.
  135. Alaei Alireza, Umapada Pal, and P. Nagabhushan. 2012. Dataset and ground truth for handwritten text in four different scripts. Int. J. Pattern Recog. Artif. Intell. 26, 4 (2012), 1253001.
  136. Bhattacharya Ujjwal and B. B. Chaudhuri. 2005. Databases for research on recognition of handwritten characters of Indian scripts. In Proceedings of the 8th International Conference on Document Analysis and Recognition (ICDAR’05). IEEE, 789–793.
  137. Ghosh Debashis, Tulika Dube, and Adamane Shivaprasad. 2010. Script recognition—A review. IEEE Trans. Pattern Anal. Mach. Intell. 32, 12 (2010), 2142–2161.
  138. Maitra Durjoy Sen, Ujjwal Bhattacharya, and Swapan K. Parui. 2015. CNN based common approach to handwritten character recognition of multiple scripts. In Proceedings of the 13th International Conference on Document Analysis and Recognition (ICDAR’15). IEEE, 1021–1025.
  139. A. Soumya and G. Hemantha Kumar. 2014. Classification of ancient epigraphs into different periods using random forests. In Proceedings of the 5th International Conference on Signal and Image Processing. IEEE, 171–178.
  140. Easwaramoorthy Sathishkumar, Usha Moorthy, Chunduru Anil Kumar, S. Bharath Bhushan, and Vishnupriya Sadagopan. 2017. Content based image retrieval with enhanced privacy in cloud using Apache Spark. In Proceedings of the International Conference on Data Science Analytics and Applications. Springer, 114–128.
  141. Mehul Gupta, Patel Ankita, Dave Namrata, Goradia Rahul, and Saurin Sheth. 2014. Text-based image segmentation methodology. Procedia Technol. 14 (2014), 465–472.

      • Published in

        ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 3
        May 2021
        240 pages
        ISSN:2375-4699
        EISSN:2375-4702
        DOI:10.1145/3457152

        Copyright © 2021 Association for Computing Machinery.

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 12 August 2021
        • Online AM: 7 May 2020
        • Revised: 1 April 2020
        • Accepted: 1 April 2020
        • Received: 1 February 2020

        Qualifiers

        • research-article
        • Refereed
