Abstract
Writer identification from handwriting has many real-world applications. Because letter forms vary widely across languages, a major challenge is to design an efficient method that does not depend on any specific language. In this paper, we propose a new approach based on image mining techniques for offline, text-independent writer identification. In this method, each writer's prominent features are extracted from training samples, and identification is then performed according to these features. The image mining part of the proposed approach employs techniques including an SVM classifier and genetic algorithms. To evaluate the method and show its performance in different languages, we examined the CASIA dataset for Chinese, the IAM dataset for English, and two datasets of Kannada and Persian handwriting. The experimental results demonstrate that the presented method achieves over 99% accuracy for these languages. Given the results in the tested languages and the details of the method, it is highly likely that the method would also perform well in other languages.
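As an illustration only, the following minimal Python sketch shows one common way a genetic algorithm can be wrapped around a classifier to select discriminative features per writer. It is not the authors' implementation: a nearest-centroid classifier stands in for the SVM so that the example needs only the standard library, the feature vectors are synthetic, and all names (`centroid_accuracy`, `select_features`, `make_samples`) and parameter values are hypothetical.

```python
import random


def centroid_accuracy(train, test, mask):
    """Accuracy of a nearest-centroid classifier using only the masked-in features."""
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    # Accumulate per-writer sums of the selected features.
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(idx))
        for j, i in enumerate(idx):
            acc[j] += x[i]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
    correct = 0
    for x, y in test:
        # Predict the writer whose centroid is closest in squared Euclidean distance.
        pred = min(
            centroids,
            key=lambda c: sum((x[i] - m) ** 2 for i, m in zip(idx, centroids[c])),
        )
        correct += pred == y
    return correct / len(test)


def select_features(train, valid, n_features, pop_size=20, generations=30,
                    mutation_rate=0.05, seed=0):
    """Search for a boolean feature mask with a simple elitist genetic algorithm.

    Assumes n_features >= 2 and pop_size >= 4.
    """
    rng = random.Random(seed)

    def fitness(mask):
        # Validation accuracy with a small (assumed) penalty per selected feature.
        return centroid_accuracy(train, valid, mask) - 0.001 * sum(mask)

    # Start from the all-features mask plus random masks.
    population = [[True] * n_features]
    while len(population) < pop_size:
        population.append([rng.random() < 0.5 for _ in range(n_features)])

    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness)


def make_samples(n_per_writer, rng):
    """Synthetic data: features 0 and 1 separate three writers, features 2-5 are noise."""
    samples = []
    for writer, (a, b) in enumerate([(0, 0), (3, 0), (0, 3)]):
        for _ in range(n_per_writer):
            x = [rng.gauss(a, 0.5), rng.gauss(b, 0.5)]
            x += [rng.gauss(0, 3.0) for _ in range(4)]
            samples.append((x, writer))
    return samples


rng = random.Random(42)
train = make_samples(20, rng)
valid = make_samples(20, rng)
best = select_features(train, valid, n_features=6)
print("selected mask:", best)
print("validation accuracy:", centroid_accuracy(train, valid, best))
```

Because the best mask of each generation is carried over unchanged, the selected mask never scores worse on the validation set than the all-features baseline under the same fitness function. In a real system the fitness function would call the actual classifier (an SVM in the paper) on features extracted from handwriting images.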
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Mohammadi, M., Ebrahimi Moghaddam, M. & Saadat, S. A multi-language writer identification method based on image mining and genetic algorithm techniques. Soft Comput 23, 7655–7669 (2019). https://doi.org/10.1007/s00500-018-3393-5
DOI: https://doi.org/10.1007/s00500-018-3393-5