Abstract
Classification problems with more than two classes can be handled in several ways. The most common approach transforms the original multiclass problem into a series of binary subproblems that are solved individually. Under this approach, should the same base classifier be used for all binary subproblems, or should each subproblem be tuned independently? To answer this question, we propose a method that uses data complexity measures to select a different base classifier for each subproblem under the one-versus-one strategy. Experimental results on 17 real-world datasets corroborate the adequacy of the method.
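The idea can be sketched as follows. This is a minimal illustration, not the authors' actual method: Fisher's maximum discriminant ratio (the F1 measure of Ho and Basu) stands in for the paper's set of complexity measures, nearest-centroid and 1-NN stand in for the candidate base classifiers, and the `threshold` value is an arbitrary assumption.

```python
import numpy as np

def fisher_f1(Xa, Xb):
    """Maximum Fisher's discriminant ratio over features (Ho & Basu's F1).
    High values suggest the two classes are easy to separate linearly."""
    ma, mb = Xa.mean(0), Xb.mean(0)
    va, vb = Xa.var(0), Xb.var(0)
    return np.max((ma - mb) ** 2 / (va + vb + 1e-12))

def knn1(Xtr, ytr, x):
    # plain 1-nearest-neighbour classifier
    return ytr[np.argmin(((Xtr - x) ** 2).sum(1))]

def centroid(Xtr, ytr, x):
    # nearest-centroid: a crude linear stand-in for, e.g., a linear SVM
    labels = np.unique(ytr)
    d = [((Xtr[ytr == c].mean(0) - x) ** 2).sum() for c in labels]
    return labels[int(np.argmin(d))]

def ovo_predict(X, y, x, threshold=0.5):
    """One-vs-one decomposition where each pairwise subproblem picks its
    own base classifier from a complexity measure of that pair alone."""
    classes = np.unique(y)
    votes = {c: 0 for c in classes}
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            mask = (y == a) | (y == b)
            # easy (high-F1) pairs get the linear classifier, hard pairs 1-NN
            clf = centroid if fisher_f1(X[y == a], X[y == b]) > threshold else knn1
            votes[clf(X[mask], y[mask], x)] += 1
    return max(votes, key=votes.get)  # majority (weighted-voting) decoding
```

The key point is that the classifier choice is made per pair of classes, so an easy subproblem and a hard subproblem within the same dataset can end up with different base classifiers.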
Acknowledgements
This research has been financially supported in part by the Spanish Ministerio de Economía y Competitividad (Research Project TIN2015-65069-C2-1-R), by European Union FEDER funds and by the Consellería de Industria of the Xunta de Galicia (Research Project GRC2014/035). Financial support from the Xunta de Galicia (Centro singular de investigación de Galicia accreditation 2016–2019) and the European Union (European Regional Development Fund—ERDF) is gratefully acknowledged (Research Project ED431G/01).
Appendix
This appendix reports the experimental results achieved in this work. Table 4 shows the classification accuracy obtained by the three approaches (OVO-kNN, OVO-SVM and CSC) with each of the three decoding techniques, weighted voting (WV), Hamming (H-dec) and loss-based (LB-dec), on the 17 multiclass datasets.
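For context, the Hamming and loss-based decodings named above can be sketched over the standard one-vs-one code matrix. This is a hedged illustration following the general formulation of Allwein, Schapire and Singer, not the paper's exact implementation; `f` is assumed to hold the signed outputs of the pairwise classifiers, and the hinge loss is one possible choice for the loss-based variant.

```python
import numpy as np

def ovo_code_matrix(n_classes):
    """Code matrix for one-vs-one: one column per class pair. Row c holds
    +1 where class c is the 'positive' class of the pair, -1 where it is
    the 'negative' class, and 0 where it does not take part."""
    pairs = [(a, b) for a in range(n_classes) for b in range(a + 1, n_classes)]
    M = np.zeros((n_classes, len(pairs)))
    for j, (a, b) in enumerate(pairs):
        M[a, j], M[b, j] = 1, -1
    return M

def hamming_decode(M, f):
    """Pick the row of M closest to the classifier outputs f in
    (generalised) Hamming distance; zero entries cost 1/2."""
    d = ((1 - np.sign(M * f)) / 2).sum(axis=1)
    return int(np.argmin(d))

def loss_based_decode(M, f):
    """Loss-based decoding: sum the hinge loss max(0, 1 - m) of the
    margins m = M[c, j] * f[j] and pick the row with the smallest total."""
    d = np.maximum(0, 1 - M * f).sum(axis=1)
    return int(np.argmin(d))
```

For 3 classes the pairs are (0,1), (0,2), (1,2); if the three classifiers output `[+1, +1, -1]` (class 0 beats 1, class 0 beats 2, class 2 beats 1), both decodings return class 0.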
Cite this article
Morán-Fernández, L., Bolón-Canedo, V. & Alonso-Betanzos, A. On the use of different base classifiers in multiclass problems. Prog Artif Intell 6, 315–323 (2017). https://doi.org/10.1007/s13748-017-0126-4