A multi-language writer identification method based on image mining and genetic algorithm techniques

  • Methodologies and Application
Soft Computing

Abstract

Writer identification from handwriting has many real-world applications. Because letter forms vary widely across languages, a major challenge is to design an efficient method that does not depend on any specific language. In this paper, we propose a new approach based on image mining techniques for offline, text-independent writer identification. In this method, each writer's prominent features are extracted from training samples, and identification is then performed using those features. The image mining part of the proposed approach employs techniques including an SVM classifier and genetic algorithms. To evaluate the method and show its performance across languages, we examined the CASIA dataset for Chinese, the IAM dataset for English, and two datasets for Kannada and Persian handwriting. The experimental results demonstrate that the presented method achieves over 99% accuracy for these languages. Given these results and the details of the method, it is highly likely that the method would also perform well in other languages.
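The abstract names genetic algorithms and a classifier but gives no implementation details, so the following is only a minimal sketch of one common way to combine the two: a genetic algorithm searches over binary feature masks, and each mask is scored by classification accuracy. Everything here is an assumption made for illustration and not the authors' method: the function names, the toy data, the GA settings, the per-feature penalty, and in particular the nearest-centroid classifier, which stands in for the SVM only so that the sketch runs on the Python standard library.

```python
import random


def nearest_centroid_accuracy(samples, labels, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the masked features.

    Assumption: this classifier is a stand-in for the SVM named in the abstract.
    """
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0  # an empty feature subset cannot identify anyone
    correct = 0
    for held in range(len(samples)):
        # Build one centroid per writer from every sample except the held-out one.
        sums, counts = {}, {}
        for j, (x, y) in enumerate(zip(samples, labels)):
            if j == held:
                continue
            acc = sums.setdefault(y, [0.0] * len(idx))
            for k, f in enumerate(idx):
                acc[k] += x[f]
            counts[y] = counts.get(y, 0) + 1
        # Assign the held-out sample to the writer with the closest centroid.
        best_label, best_dist = None, float("inf")
        for y, acc in sums.items():
            dist = sum((samples[held][f] - acc[k] / counts[y]) ** 2
                       for k, f in enumerate(idx))
            if dist < best_dist:
                best_label, best_dist = y, dist
        correct += best_label == labels[held]
    return correct / len(samples)


def select_features(samples, labels, pop_size=20, generations=30,
                    mutation_rate=0.1, seed=0):
    """Genetic search for a boolean feature mask (requires at least 2 features)."""
    rng = random.Random(seed)
    n = len(samples[0])

    def fitness(mask):
        # Accuracy minus a small, hypothetical penalty per selected feature.
        return nearest_centroid_accuracy(samples, labels, mask) - 0.01 * sum(mask)

    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection; best masks survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation on each gene with probability mutation_rate.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


def make_toy_data(seed=1):
    """Two hypothetical writers; only feature 0 separates them, the rest is noise."""
    rng = random.Random(seed)
    samples, labels = [], []
    for writer, centre in (("A", 0.0), ("B", 5.0)):
        for _ in range(10):
            samples.append([centre + rng.gauss(0, 0.5)]
                           + [rng.gauss(0, 1) for _ in range(3)])
            labels.append(writer)
    return samples, labels


if __name__ == "__main__":
    samples, labels = make_toy_data()
    mask = select_features(samples, labels)
    print("selected mask:", mask)
    print("accuracy:", nearest_centroid_accuracy(samples, labels, mask))
```

In a real setting the fitness function would wrap whatever classifier and validation protocol the application calls for; the GA loop itself does not depend on that choice.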

Author information

Corresponding author

Correspondence to Mohsen Ebrahimi Moghaddam.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Mohammadi, M., Ebrahimi Moghaddam, M. & Saadat, S. A multi-language writer identification method based on image mining and genetic algorithm techniques. Soft Comput 23, 7655–7669 (2019). https://doi.org/10.1007/s00500-018-3393-5

  • DOI: https://doi.org/10.1007/s00500-018-3393-5
