Abstract
Computational epigraphy is the study of ancient scripts using computer science and mathematical models built for epigraphy. The Tamil-Brahmi inscriptions are the most ancient extant written records of Tamil. They furnish valuable information on many aspects of life in the ancient Tamil country from a period anterior to the literary age of Sangam. Recognition and systematic analysis of the script are therefore required. Recognizing this script is complex: a single character can contain several curves, and in the writing style curves and lines overlap. Generating a corpus of the script is necessary, since it is the initial step for computational epigraphy. The archaeological department provided the raw data that helped to develop a corpus of Tamizhi. In this article, we implement a convolutional neural network (CNN) in several ways to recognize handwritten Brahmi characters and evaluate which gives the best accuracy: (i) training a CNN model from scratch with a Softmax classifier in a sequential model; (ii) MobileNet, using the transfer learning paradigm from a pre-trained model on the Tamizhi dataset; (iii) a model built with CNN and SVM; and (iv) SVM. To train the CNN model, an extensive TAMIZHİ handwritten Brahmi dataset of 190,000 (1 lakh 90 thousand) isolated character samples has been created and deployed. The designed dataset consists of 9 vowels, 18 consonants, and 209 classes, so that researchers can apply machine learning to it. On the Tamizhi dataset, MobileNet outperformed all the other implemented models with an accuracy of 68.3%, whereas the other algorithms ranged from 58% to 67%. The MobileNet model was trained and tested on the datasets of vowels (8 classes), consonants (18 classes), and consonants and vowels (26 classes), with accuracies of 98.1%, 97.7%, and 97.5%, respectively.
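To make the Softmax classifier and the accuracy figures mentioned above concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a network has already produced one raw score (logit) per class; the class count of 209 is taken from the abstract, while the function names `softmax`, `predict`, and `top1_accuracy` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence

NUM_CLASSES = 209  # class count stated in the abstract


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits: Sequence[float]) -> int:
    """Return the index of the most probable class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def top1_accuracy(batch_logits: Sequence[Sequence[float]],
                  labels: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches its label."""
    hits = sum(predict(row) == label
               for row, label in zip(batch_logits, labels))
    return hits / len(labels)
```

An accuracy such as the 68.3% quoted above would correspond to `top1_accuracy` returning 0.683 on the test split, assuming the reported metric is top-1 accuracy.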
Index Terms
- TAMIZHİ: Historical Tamil-Brahmi Script Recognition Using CNN and MobileNet