Abstract
This paper presents new measures, based on the induced decision tree, to characterise datasets for meta-learning, with the aim of selecting appropriate learning algorithms. The main idea is to capture the characteristics of a dataset from the structural shape and size of the decision tree induced from it. In total, 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments, comparing the results with those obtained by existing data characterisation techniques, including the data characteristics tool (DCT), the most widely used technique in meta-learning, and landmarking, the most recently developed method.
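The abstract does not enumerate the 15 proposed measures, but the general idea of describing a dataset through the structure of an induced decision tree can be sketched as follows. The four measures below (node count, leaf count, depth, maximum level width) are illustrative assumptions, not the paper's actual measure set; a tree is modelled as nested dicts, where internal nodes have "left"/"right" children and leaves are empty dicts.

```python
# Illustrative sketch: structural measures of a decision tree, of the
# kind that could serve as meta-features for dataset characterisation.
# The specific measures here are assumptions for illustration only.

def n_nodes(t):
    """Total number of nodes (internal nodes plus leaves)."""
    if "left" not in t:
        return 1
    return 1 + n_nodes(t["left"]) + n_nodes(t["right"])

def n_leaves(t):
    """Number of leaf nodes."""
    if "left" not in t:
        return 1
    return n_leaves(t["left"]) + n_leaves(t["right"])

def depth(t):
    """Length (in edges) of the longest root-to-leaf path."""
    if "left" not in t:
        return 0
    return 1 + max(depth(t["left"]), depth(t["right"]))

def max_width(t):
    """Maximum number of nodes on any single level of the tree."""
    counts = {}
    def walk(node, level):
        counts[level] = counts.get(level, 0) + 1
        if "left" in node:
            walk(node["left"], level + 1)
            walk(node["right"], level + 1)
    walk(t, 0)
    return max(counts.values())

# A small hand-built tree: root splits into a leaf (left) and an
# internal node with two leaf children (right).
tree = {"left": {}, "right": {"left": {}, "right": {}}}
features = {
    "nodes": n_nodes(tree),
    "leaves": n_leaves(tree),
    "depth": depth(tree),
    "width": max_width(tree),
}
print(features)  # -> {'nodes': 5, 'leaves': 3, 'depth': 2, 'width': 2}
```

In a meta-learning setting, such a feature vector would be computed once per dataset (from a tree induced by, say, C4.5/C5.0, which the paper's references suggest) and used as input to a meta-level model that recommends learning algorithms.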
References
C. E. Brodley. Recursive automatic bias selection for classifier construction. Machine Learning, 20:63–94, 1995.
H. Bensusan, and C. Giraud-Carrier. Discovering Task Neighbourhoods through Landmark Learning Performances. In Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery. 325–330, 2000.
H. Bensusan, C. Giraud-Carrier, and C. Kennedy. Higher-order Approach to Metalearning. The ECML’2000 workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination, 109–117, 2000.
C. Blake, E. Keogh, C. Merz. http://www.ics.uci.edu/~mlearn/mlrepository.html. University of California, Irvine, Dept. of Information and Computer Sciences, 1998.
P. Brazdil, J. Gama, and R. Henery. Characterizing the Applicability of Classification Algorithms using Meta Level Learning. In Proceedings of the European Conference on Machine Learning, ECML-94, 83–102, 1994.
T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
E. Gordon, and M. desJardin. Evaluation and Selection of Biases. Machine Learning, 20(1-2):5–22, 1995.
A. Kalousis, and M. Hilario. Model Selection via Meta-learning: a Comparative Study. In Proceedings of the 12th International IEEE Conference on Tools with AI, Vancouver. IEEE press. 2000.
A. Kalousis, and M. Hilario. Feature Selection for Meta-Learning. In Proceedings of the 5th Pacific Asia Conference on Knowledge Discovery and Data Mining. 2001.
C. Koepf, C. Taylor, and J. Keller. Meta-analysis: Data characterisation for classification and regression on a meta-level. In Antony Unwin, Adalbert Wilhelm, and Ulrike Hofmann, editors, Proceedings of the International Symposium on Data Mining and Statistics, Lyon, France, 2000.
R. Kohavi. Scaling up the Accuracy of Naïve-Bayes Classifiers: a Decision-Tree Hybrid. 2nd Int. Conf. on Knowledge Discovery and Data Mining, 202–207, 1996.
M. Lagoudakis, and M. Littman. Algorithm selection using reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML-2000), 511–518, Stanford, CA, 2000.
C. Linder, and R. Studer. AST: Support for Algorithm Selection with a CBR Approach. Proceedings of the 16th International Conference on Machine Learning, Workshop on Recent Advances in Meta-Learning and Future Work. 1999.
D. Michie, D. Spiegelhalter, and C. Taylor. Machine Learning, Neural and Statistical Classification. Ellis Horwood Series in Artificial Intelligence. 1994.
T. Mitchell. Machine Learning. McGraw-Hill. 1997.
S. Salzberg. On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach. Data Mining and Knowledge Discovery, 1:3, 317–327, 1997.
C. Schaffer. Selecting a Classification Method by Cross-Validation, Machine Learning, 13, 135–143, 1993.
C. Schaffer. Cross-validation, stacking and bi-level stacking: Meta-methods for classification learning. In P. Cheeseman and R. W. Oldford, editors, Selecting Models from Data: Artificial Intelligence and Statistics IV, 51–59, 1994.
B. Pfahringer, H. Bensusan, and C. Giraud-Carrier. Tell me who can learn you and I can tell you who you are: Landmarking various Learning Algorithms. Proceedings of the 17th Int. Conf. on Machine Learning. 743–750, 2000.
F. Provost, and B. Buchanan. Inductive policy: The pragmatics of bias selection. Machine Learning, 20:35–61, 1995.
J. R. Quinlan. C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
J. R. Quinlan. C5.0: An Informal Tutorial, RuleQuest, http://www.rulequest.com, 1998.
L. Rendell, R. Seshu, and D. Tcheng. Layered Concept Learning and Dynamically Variable Bias Management. 10th Int. Joint Conference on AI. 308–314, 1987.
C. Schaffer. A Conservation Law for Generalization Performance. Proceedings of the 11th International Conference on Machine Learning, 1994.
C. Soares. Ranking Classification Algorithms on Past Performance. Master’s Thesis, Faculty of Economics, University of Porto, 2000.
C. Soares. Zoomed Ranking: Selection of Classification Algorithms based on Relevant Performance Information. Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery, 126–135, 2000.
S. Y. Sohn. Meta Analysis of Classification Algorithms for Pattern Recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence, 21, 1137–1144, 1999.
L. Todorovski, and S. Dzeroski. Experiments in Meta-Level Learning with ILP. Proceedings of the 3rd European Conference on Principles of Data Mining and Knowledge Discovery, 98–106, 1999.
A. Webster. Applied Statistics for Business and Economics, Richard D Irwin Inc, 779–784, 1992.
D. Wolpert. The lack of a Priori Distinctions between Learning Algorithms. Neural Computation, 8, 1341–1390, 1996.
D. Wolpert. The Existence of a Priori Distinctions between Learning Algorithms. Neural Computation, 8, 1391–1420, 1996.
H. Bensusan. God doesn’t always shave with Occam’s Razor - learning when and how to prune. In Proceedings of the 10th European Conference on Machine Learning, 119–124, Berlin, Germany, 1998.
© 2002 Springer-Verlag Berlin Heidelberg
Peng, Y., Flach, P.A., Soares, C., Brazdil, P. (2002). Improved Dataset Characterisation for Meta-learning. In: Lange, S., Satoh, K., Smith, C.H. (eds) Discovery Science. DS 2002. Lecture Notes in Computer Science, vol 2534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36182-0_14
Print ISBN: 978-3-540-00188-1
Online ISBN: 978-3-540-36182-4