Performance enhancement of online handwritten Tamil symbol recognition with reevaluation techniques

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

In this article, we aim to reduce the error rate of an online Tamil symbol recognition system by employing multiple experts to reevaluate certain decisions of the primary support vector machine (SVM) classifier. First, motivated by the relatively high frequency of base consonants in the script, we propose a reevaluation technique to correct ambiguities arising in the base consonants. Secondly, a dynamic time-warping method is proposed to automatically extract the discriminative regions for each set of confused characters; class-specific features derived from these regions help reduce the degree of confusion. Thirdly, statistics of specific features are proposed for resolving confusions in vowel modifiers. The reevaluation approaches are tested on two databases: (a) the isolated Tamil symbols in the IWFHR test set, and (b) the symbols segmented from a set of 10,000 Tamil words. The recognition rate on the isolated test symbols of the IWFHR database improves by 1.9 %. For the word database, the reevaluation step improves the symbol recognition rate by 3.5 percentage points (from 88.4 to 91.9 %). This, in turn, boosts the word recognition rate by 11.9 percentage points (from 65.0 to 76.9 %). The reduction in the word error rate has been achieved using a generic approach, without the incorporation of language models.
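
The two-stage decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `reevaluate`, the toy classifiers `primary` and `expert_ab`, and the symbol labels "A", "B" and "C" are all hypothetical stand-ins. The sketch only assumes that a primary classifier proposes a label and that, when this label belongs to a known confusion set, an expert dedicated to that set makes the final decision.

```python
from typing import Callable, Dict, FrozenSet, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def reevaluate(
    features: Features,
    primary: Classifier,
    experts: Dict[FrozenSet[str], Classifier],
) -> str:
    """Return the primary label, or the expert's label if the primary
    label falls inside one of the confusion sets."""
    label = primary(features)
    for confusion_set, expert in experts.items():
        if label in confusion_set:
            # The primary decision is ambiguous: defer to the expert.
            return expert(features)
    # No confusion set contains this label: keep the primary decision.
    return label


# Hypothetical toy classifiers standing in for trained models.
def primary(x: Features) -> str:
    if x[0] < 0.5:
        return "A"
    return "B" if x[0] < 1.0 else "C"


def expert_ab(x: Features) -> str:
    # Pretend the second feature separates the confused pair A/B.
    return "A" if x[1] < 0.0 else "B"


experts = {frozenset({"A", "B"}): expert_ab}

print(reevaluate([0.4, 1.0], primary, experts))   # primary says A, expert decides
print(reevaluate([2.0, -1.0], primary, experts))  # C is in no confusion set
```

Keying the experts by confusion set keeps the primary classifier untouched, so experts can be added or removed without retraining it.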

Acknowledgements

The authors thank Technology Development for Indian Languages (TDIL), Department of Information Technology, Govt of India for funding this work. The help rendered by the staff of Medical Intelligence and Language Engineering (MILE) Laboratory in data collection and truthing is acknowledged.

Author information

Corresponding author

Correspondence to Suresh Sundaram.

Additional information

Originality and contributions

1. In the literature on online Indic handwriting, hardly any comprehensive work addresses the problem of disambiguating confused characters. To the knowledge of the authors, this may be the first attempt at reducing the error rate of online handwritten Tamil symbol recognition with reevaluation strategies.

2. A dynamic time-warping approach has been proposed to capture the regions of the trace that discriminate confused Tamil symbols. Novel class-specific discriminative features are then derived from the extracted regions to disambiguate these symbols.

3. An SVM classifier (referred to as an expert) dedicated to each confusion set (derived from the confusion matrix) has been proposed. The expert classifier operates on the novel discriminative features.

4. A set of novel features has been proposed to reduce the confusions of vowel modifiers in consonant-vowel (CV) combinations.

5. A systematic study of the occurrence frequency of linguistically similar Tamil symbols has been performed on a text corpus.
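
The dynamic time-warping idea in contribution 2 can be sketched as follows. This is a generic textbook DTW under stated assumptions, not the procedure of the article, whose details are not available in this preview. A trace is assumed to be a list of (x, y) pen positions, the local cost is assumed to be the Euclidean distance, and a sample is assumed to belong to the "discriminative region" when its aligned counterpart lies farther away than a hypothetical `threshold`. The names `dtw_path` and `discriminative_region` are illustrative only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def dtw_path(a: List[Point], b: List[Point]) -> Tuple[float, List[Tuple[int, int]]]:
    """Align traces a and b; return the total cost and the warping path."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end of both traces to recover the alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def discriminative_region(a: List[Point], b: List[Point], threshold: float) -> List[int]:
    """Indices of samples in a whose aligned sample in b is farther than threshold."""
    _, path = dtw_path(a, b)
    return sorted({i for i, j in path if math.dist(a[i], b[j]) > threshold})


# Two toy traces that agree at the start and diverge at the last sample.
trace_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
trace_b = [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)]
total, path = dtw_path(trace_a, trace_b)
print(total, path)
print(discriminative_region(trace_a, trace_b, threshold=1.0))
```

In this toy case only the final sample of `trace_a` is flagged, which matches the intuition that features for an expert should be computed where the confused symbols actually differ.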

Appendix: The complete list of Tamil characters

About this article

Cite this article

Sundaram, S., Ramakrishnan, A.G. Performance enhancement of online handwritten Tamil symbol recognition with reevaluation techniques. Pattern Anal Applic 17, 587–609 (2014). https://doi.org/10.1007/s10044-013-0353-7

  • DOI: https://doi.org/10.1007/s10044-013-0353-7
