Ensemble of HMM classifiers based on the clustering validity index for a handwritten numeral recognizer

Ko, Albert Hung-Ren; Sabourin, Robert; Britto, Alceu de Souza

doi:10.1007/s10044-007-0094-6

Theoretical Advances
Published: 29 November 2007

Volume 12, pages 21–35, (2009)

Pattern Analysis and Applications

Albert Hung-Ren Ko¹,
Robert Sabourin¹ &
Alceu de Souza Britto Jr.²

Abstract

A new scheme is proposed for optimizing the codebook sizes of Hidden Markov Models (HMMs) and for generating HMM ensembles. In a discrete HMM, the vector quantization procedure and the resulting codebook are associated with performance degradation. Using a selected clustering validity index, we show that the HMM codebook size can be optimized without training HMM classifiers. Moreover, the proposed scheme yields multiple optimized HMM classifiers, each based on a different codebook size. By combining these into an ensemble of HMM classifiers, the scheme can compensate for the degradation of a discrete HMM.
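The idea of ranking codebook sizes with a clustering validity index, before any HMM is trained, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the validity index, so the Davies-Bouldin index (lower is better) is used as a stand-in; plain k-means plays the role of the vector quantizer; the feature vectors are synthetic; and the function names (`kmeans`, `assign`, `davies_bouldin`, `rank_codebook_sizes`) are hypothetical, not taken from the paper.

```python
import math
import random


def assign(points, codebook):
    """Index of the nearest codeword for every feature vector."""
    return [min(range(len(codebook)), key=lambda j: math.dist(p, codebook[j]))
            for p in points]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means as a stand-in vector quantizer.

    Returns (codebook, labels), where codebook is a list of k centroids.
    """
    rng = random.Random(seed)
    codebook = rng.sample(points, k)  # initialise codewords from the data
    for _ in range(iters):
        labels = assign(points, codebook)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(codebook[j])  # empty cell: keep the old codeword
        if updated == codebook:
            break
        codebook = updated
    return codebook, assign(points, codebook)


def davies_bouldin(points, codebook, labels):
    """Davies-Bouldin index (assumed stand-in for the paper's index)."""
    k = len(codebook)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter.append(
            sum(math.dist(p, codebook[j]) for p in members) / len(members)
            if members else 0.0)
    # For each cluster, take its worst (largest) similarity to another cluster.
    worst = [max((scatter[i] + scatter[j]) / math.dist(codebook[i], codebook[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k


def rank_codebook_sizes(points, candidate_sizes, seed=0):
    """Score every candidate codebook size; no HMM is trained here."""
    scores = {}
    for k in candidate_sizes:
        codebook, labels = kmeans(points, k, seed=seed)
        scores[k] = davies_bouldin(points, codebook, labels)
    return sorted(candidate_sizes, key=lambda k: scores[k]), scores


# Synthetic 2-D "feature vectors": four Gaussian blobs (illustrative data only).
rng = random.Random(42)
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
vectors = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
           for cx, cy in centres for _ in range(40)]

candidates = [2, 3, 4, 6, 8]
ranked, scores = rank_codebook_sizes(vectors, candidates)
ensemble_sizes = ranked[:3]  # one discrete HMM per selected codebook size
print("ranked sizes:", ranked)
print("sizes kept for the ensemble:", ensemble_sizes)
```

In this sketch the best-ranked sizes would each seed one discrete HMM, and the resulting classifiers would then be combined. How many sizes to keep and how the HMM outputs are fused are left open here, since the abstract does not specify them.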

Acknowledgment

This work was supported in part by grant OGP0106456 to Robert Sabourin from the NSERC of Canada.

Author information

Authors and Affiliations

LIVIA, École de Technologie Supérieure, University of Quebec, 1100 Notre-Dame West Street, Montreal, Quebec, H3C 1K3, Canada
Albert Hung-Ren Ko & Robert Sabourin
PPGIA, Pontifical Catholic University of Parana, Rua Imaculada Conceicao, 1155, Curitiba, PR 80215-901, Brazil
Alceu de Souza Britto Jr.

Corresponding author

Correspondence to Albert Hung-Ren Ko.

About this article

Cite this article

Ko, A.HR., Sabourin, R. & Britto, A.d. Ensemble of HMM classifiers based on the clustering validity index for a handwritten numeral recognizer. Pattern Anal Applic 12, 21–35 (2009). https://doi.org/10.1007/s10044-007-0094-6

Received: 09 June 2006
Accepted: 05 October 2007
Published: 29 November 2007
Issue Date: February 2009
DOI: https://doi.org/10.1007/s10044-007-0094-6
