Abstract
This paper presents new measures, based on the induced decision tree, to characterise datasets for meta-learning, with the aim of selecting appropriate learning algorithms. The main idea is to capture the characteristics of a dataset from the structural shape and size of the decision tree induced from it. In total, 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments, comparing the results with those obtained by existing data characterisation techniques, including the data characteristics tool (DCT), the most widely used technique in meta-learning, and landmarking, the most recently developed method.
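The abstract does not enumerate the 15 proposed measures, but the general idea of describing a dataset through the structure of an induced decision tree can be sketched as follows. The four measures below (node count, leaf count, depth, maximum level width) are illustrative assumptions, not the paper's actual measure set; a tree is modelled as nested dicts, where internal nodes have "left"/"right" children and leaves are empty dicts.

```python
# Illustrative sketch: structural measures of a decision tree, of the
# kind that could serve as meta-features for dataset characterisation.
# The specific measures here are assumptions for illustration only.

def n_nodes(t):
    """Total number of nodes (internal nodes plus leaves)."""
    if "left" not in t:
        return 1
    return 1 + n_nodes(t["left"]) + n_nodes(t["right"])

def n_leaves(t):
    """Number of leaf nodes."""
    if "left" not in t:
        return 1
    return n_leaves(t["left"]) + n_leaves(t["right"])

def depth(t):
    """Length (in edges) of the longest root-to-leaf path."""
    if "left" not in t:
        return 0
    return 1 + max(depth(t["left"]), depth(t["right"]))

def max_width(t):
    """Maximum number of nodes on any single level of the tree."""
    counts = {}
    def walk(node, level):
        counts[level] = counts.get(level, 0) + 1
        if "left" in node:
            walk(node["left"], level + 1)
            walk(node["right"], level + 1)
    walk(t, 0)
    return max(counts.values())

# A small hand-built tree: root splits into a leaf (left) and an
# internal node with two leaf children (right).
tree = {"left": {}, "right": {"left": {}, "right": {}}}
features = {
    "nodes": n_nodes(tree),
    "leaves": n_leaves(tree),
    "depth": depth(tree),
    "width": max_width(tree),
}
print(features)  # -> {'nodes': 5, 'leaves': 3, 'depth': 2, 'width': 2}
```

In a meta-learning setting, such a feature vector would be computed once per dataset (from a tree induced by, say, C4.5/C5.0, which the paper's references suggest) and used as input to a meta-level model that recommends learning algorithms.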
References
C. E. Brodley. Recursive automatic bias selection for classifier construction. Machine Learning, 20:63–94, 1995.
H. Bensusan, and C. Giraud-Carrier. Discovering Task Neighbourhoods through Landmark Learning Performances. In Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery. 325–330, 2000.
H. Bensusan, C. Giraud-Carrier, and C. Kennedy. Higher-order Approach to Metalearning. The ECML’2000 workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination, 109–117, 2000.
C. Blake, E. Keogh, C. Merz. http://www.ics.uci.edu/~mlearn/mlrepository.html. University of California, Irvine, Dept. of Information and Computer Sciences, 1998.
P. Brazdil, J. Gama, and R. Henery. Characterizing the Applicability of Classification Algorithms using Meta Level Learning. In Proceedings of the European Conference on Machine Learning, ECML-94, 83–102, 1994.
T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
E. Gordon, and M. desJardin. Evaluation and Selection of Biases. Machine Learning, 20(1-2):5–22, 1995.
A. Kalousis, and M. Hilario. Model Selection via Meta-learning: a Comparative Study. In Proceedings of the 12th International IEEE Conference on Tools with AI, Vancouver. IEEE press. 2000.
A. Kalousis, and M. Hilario. Feature Selection for Meta-Learning. In Proceedings of the 5th Pacific Asia Conference on Knowledge Discovery and Data Mining. 2001.
C. Koepf, C. Taylor, and J. Keller. Meta-analysis: Data characterisation for classification and regression on a meta-level. In Antony Unwin, Adalbert Wilhelm, and Ulrike Hofmann, editors, Proceedings of the International Symposium on Data Mining and Statistics, Lyon, France, 2000.
R. Kohavi. Scaling up the Accuracy of Naïve-Bayes Classifiers: a Decision-Tree Hybrid. 2nd Int. Conf. on Knowledge Discovery and Data Mining, 202–207, 1996.
M. Lagoudakis, and M. Littman. Algorithm selection using reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML-2000), 511–518, Stanford, CA, 2000.
C. Linder, and R. Studer. AST: Support for Algorithm Selection with a CBR Approach. Proceedings of the 16th International Conference on Machine Learning, Workshop on Recent Advances in Meta-Learning and Future Work. 1999.
D. Michie, D. Spiegelhalter, and C. Taylor. Machine Learning, Neural and Statistical Classification. Ellis Horwood Series in Artificial Intelligence. 1994.
T. Mitchell. Machine Learning. McGraw-Hill. 1997.
S. Salzberg. On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach. Data Mining and Knowledge Discovery, 1:3, 317–327, 1997.
C. Schaffer. Selecting a Classification Method by Cross-Validation, Machine Learning, 13, 135–143, 1993.
C. Schaffer. Cross-validation, stacking and bi-level stacking: Meta-methods for classification learning. In P. Cheeseman and R. W. Oldford, editors, Selecting Models from Data: Artificial Intelligence and Statistics IV, 51–59, 1994.
B. Pfahringer, H. Bensusan, and C. Giraud-Carrier. Tell me who can learn you and I can tell you who you are: Landmarking various Learning Algorithms. Proceedings of the 17th Int. Conf. on Machine Learning. 743–750, 2000.
F. Provost, and B. Buchanan. Inductive policy: The pragmatics of bias selection. Machine Learning, 20:35–61, 1995.
J. R. Quinlan. C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
J. R. Quinlan. C5.0: An Informal Tutorial, RuleQuest, http://www.rulequest.com, 1998.
L. Rendell, R. Seshu, and D. Tcheng. Layered Concept Learning and Dynamically Variable Bias Management. 10th Int. Joint Conference on AI. 308–314, 1987.
C. Schaffer. A Conservation Law for Generalization Performance. Proceedings of the 11th International Conference on Machine Learning, 1994.
C. Soares. Ranking Classification Algorithms on Past Performance. Master’s Thesis, Faculty of Economics, University of Porto, 2000.
C. Soares. Zoomed Ranking: Selection of Classification Algorithms based on Relevant Performance Information. Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery, 126–135, 2000.
S. Y. Sohn. Meta Analysis of Classification Algorithms for Pattern Recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence, 21, 1137–1144, 1999.
L. Todorovski, and S. Dzeroski. Experiments in Meta-Level Learning with ILP. Proceedings of the 3rd European Conference on Principles of Data Mining and Knowledge Discovery, 98–106, 1999.
A. Webster. Applied Statistics for Business and Economics, Richard D Irwin Inc, 779–784, 1992.
D. Wolpert. The lack of a Priori Distinctions between Learning Algorithms. Neural Computation, 8, 1341–1390, 1996.
D. Wolpert. The Existence of a Priori Distinctions between Learning Algorithms. Neural Computation, 8, 1391–1420, 1996.
H. Bensusan. God doesn’t always shave with Occam’s Razor - learning when and how to prune. In Proceedings of the 10th European Conference on Machine Learning, 119–124, Berlin, Germany, 1998.
© 2002 Springer-Verlag Berlin Heidelberg
Peng, Y., Flach, P.A., Soares, C., Brazdil, P. (2002). Improved Dataset Characterisation for Meta-learning. In: Lange, S., Satoh, K., Smith, C.H. (eds) Discovery Science. DS 2002. Lecture Notes in Computer Science, vol 2534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36182-0_14
Print ISBN: 978-3-540-00188-1
Online ISBN: 978-3-540-36182-4