
Study on Automated Approach to Recognize Characters for Handwritten and Historical Document

Authors:
Dhivya S

School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India


0000-0002-6869-1560

Usha Devi G

School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India


ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 3, Article No. 37, pp. 1–24. https://doi.org/10.1145/3396167

Published: 12 August 2021


Abstract

Script recognition, the automatic analysis and identification of scripts, has been studied intensively, and a significant number of papers on the problem have been published. A few issues nevertheless remain unsolved, particularly for Indian historical manuscripts. This literature review examines script recognition for multi-script documents and for a range of historical scripts: Kurdish-Latin, Devanagari, Grantha, Arabic handwritten characters, Bangla, Gurumukhi, ancient Chinese, Arabic, Nom characters, Greek, Nastaliq Urdu, Georgian handwritten characters, Nandinagari, and Hebrew. Together these studies outline the framework used for script recognition. The review concentrates on the scope of prediction, the type of dataset, the methods used for data preprocessing, and the performance measures used for analysis. On the basis of this survey, current research constraints are identified and requirements for future study are highlighted in the area of modeling historical manuscripts.



Index Terms

1. Applied computing
  1. Document management and text processing
    1. Document capture
      1. Optical character recognition
2. Computing methodologies
  1. Artificial intelligence

Index terms have been assigned to the content through auto-classification.


Published in
ACM Transactions on Asian and Low-Resource Language Information Processing Volume 20, Issue 3
May 2021
240 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3457152
Editor: Imed Zitouni, Google, USA
Copyright © 2021 Association for Computing Machinery.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 12 August 2021
- Online AM: 7 May 2020
- Revised: 1 April 2020
- Accepted: 1 April 2020
- Received: 1 February 2020

Author Tags
Historical manuscripts
script recognition
multi-script document
Qualifiers
- research-article
- Refereed