Abstract
Many classification algorithms exist, and none outperforms all the others on every task. It is therefore of practical interest to determine which classification algorithm is best for a given task. Although direct comparisons can be made for any given problem using cross-validation, this is desirable to avoid, as the computational cost is significant. We describe a method that relies on relatively fast pairwise comparisons between two algorithms at a time. The method builds on previous work and exploits sampling landmarks, that is, information about learning curves, in addition to classical data characteristics. One key feature is an iterative procedure for extending the series of experiments used to gather new information in the form of sampling landmarks. Metalearning also plays a vital role. The comparisons between various pairs of algorithms are repeated, and the result is represented as a partially ordered ranking. Evaluation is done by comparing the predicted partial order of algorithms to the partial order representing the supposedly correct result. Our analysis shows that the method performs well and could be of help in practical applications.
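The abstract's core idea, aggregating pairwise comparisons into a partially ordered ranking, can be illustrated with a minimal sketch. This is not the authors' actual procedure (which uses learning curves from sampling landmarks to predict each pairwise outcome); here the comparison oracle `beats`, the algorithm names, and the hypothetical `wins` set are all illustrative placeholders. Pairs whose comparison is inconclusive are simply left incomparable, which is what makes the result a partial rather than total order.

```python
from itertools import combinations

def pairwise_partial_order(algorithms, beats):
    """Build a partial order (a set of directed edges) from pairwise outcomes.

    `beats(a, b)` returns True if a is predicted to outperform b,
    False if b is predicted to outperform a, and None when the
    comparison is inconclusive (the pair is left incomparable).
    """
    edges = set()
    for a, b in combinations(algorithms, 2):
        outcome = beats(a, b)
        if outcome is True:
            edges.add((a, b))      # a precedes b in the ranking
        elif outcome is False:
            edges.add((b, a))      # b precedes a in the ranking
    return edges

def dominates(edges, a, b):
    """True if a transitively precedes b in the partial order."""
    frontier, seen = [a], set()
    while frontier:
        x = frontier.pop()
        if x == b:
            return True
        if x not in seen:
            seen.add(x)
            frontier.extend(y for (p, y) in edges if p == x)
    return False

# Hypothetical outcomes of pairwise tests: the winner is listed first.
wins = {("C4.5", "NB"), ("SVM", "NB")}

def beats(a, b):
    if (a, b) in wins:
        return True
    if (b, a) in wins:
        return False
    return None  # inconclusive: leave the pair incomparable
```

With these toy outcomes, C4.5 and SVM each dominate NB, but C4.5 and SVM remain mutually incomparable, so the ranking is only partial. Evaluating a prediction then amounts to comparing such an edge set against the one derived from the ground-truth (e.g. cross-validation) results.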
© 2010 Springer-Verlag Berlin Heidelberg
Cite this chapter
Brazdil, P., Leite, R. (2010). Determining the Best Classification Algorithm with Recourse to Sampling and Metalearning. In: Koronacki, J., Raś, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds) Advances in Machine Learning I. Studies in Computational Intelligence, vol 262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05177-7_8
DOI: https://doi.org/10.1007/978-3-642-05177-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05176-0
Online ISBN: 978-3-642-05177-7